SearchGraph¶
The SearchGraph class defines an abstract interface for all graph classes.
- class deglib.graph.SearchGraph(graph_cpp: SearchGraph)¶
Bases:
ABC- search(query: ndarray, eps: float, k: int, filter_labels: None | ndarray | Filter = None, max_distance_computation_count: int = 0, entry_vertex_indices: List[int] | None = None, threads: int = 1, thread_batch_size: int = 0) Tuple[ndarray, ndarray]¶
Approximate nearest neighbor search based on yahoo’s range search algorithm for graphs.
Eps greater 0 extends the search range and takes additional graph vertices into account.
It is possible to limit the amount of work by specifying a maximal number of distances to be calculated. For lower numbers it is recommended to set eps to 0 since its very unlikely the method can make use of the extended the search range.
- Parameters:
query – A feature vector for which similar feature vectors should searched.
eps – Controls how many nodes are checked during search. Lower eps values like 0.001 are faster but less accurate. Higher eps values like 0.1 are slower but more accurate. Should always be greater 0.
k – The number of results that will be returned. If k is smaller than the number of vertices in the graph, k is set to the number of vertices in the graph.
filter_labels – A numpy array with dtype int32, that contains all labels that can be returned or an object of type Filter, that limits the possible results to a given set. All other labels will not be included in the result set.
max_distance_computation_count – Limit the number of distance calculations. If set to 0 this is ignored.
entry_vertex_indices – Start point for exploratory search. If None, a reasonable default is used.
threads – The number of threads to use for parallel processing. It should not excel the number of queries. If set to 0, the minimum of the number of cores of this machine and the number of queries is used.
thread_batch_size – If threads != 1, the number of queries to search in the same thread.
- Returns:
A tuple containing (indices, distances) where indices is a numpy-array of shape [n_queries, k] containing the indices to the closest found neighbors to the queries. Distances is a numpy-array of shape [n_queries, k] containing the distances to the closest found neighbors.
- abstract size() int¶
- Returns:
the number of vertices in the graph
- abstract get_edges_per_vertex() int¶
- Returns:
the number of edges of each vertex
- abstract get_feature_space() SpaceInterface¶
- Returns:
the feature space
- abstract get_external_label(internal_index: int) int¶
Translates external labels to internal index
- Parameters:
internal_index – The internal index to translate
- Returns:
The external label
- abstract get_internal_index(external_label: int) int¶
Translates internal index to external label
- Parameters:
external_label – The external label to translate
- Returns:
The internal index
- abstract get_neighbor_indices(internal_index: int, copy: bool = False) ndarray¶
Get the indices (internal index) of the given vertex.
- Parameters:
internal_index – The internal index to get the neighbors of
copy – If True the returned neighbor indices are a copy, otherwise they reference internal graph data.
- Returns:
The internal index
- get_feature_vector(index: int, copy: bool = False) ndarray¶
Get the feature vector of the given internal index.
- Parameters:
index – The internal index to get the feature vector of
copy – If True the returned feature vector is a copy, otherwise a reference to the graph data is returned.
- Returns:
The feature vector of the given index
- abstract has_vertex(external_label: int) bool¶
- Returns:
whether the given external label is present in the graph.
- abstract has_edge(internal_index: int, neighbor_index: int) bool¶
- Returns:
whether the vertex at internal_index has an edge to the vertex at neighbor_index.
- get_entry_vertex_indices() List[int]¶
Creates a list of internal indices that can be used as starting point for an anns search.
- abstract has_path(entry_vertex_indices: List[int], to_vertex: int, eps: float, k: int) List[ObjectDistance]¶
Returns a path from one of the entry vertex indices to the given to_vertex.
- Parameters:
entry_vertex_indices – List of start vertices
to_vertex – The vertex to find a path to
eps – Controls how many nodes are checked during search. Lower eps values like 0.001 are faster but less accurate. Higher eps values like 0.1 are slower but more accurate. Should always be greater 0.
k – TODO
- abstract explore(entry_vertex_index: int, k: int, include_entry: bool, max_distance_computation_count: int) ResultSet¶
An exploration for similar element, limited by max_distance_computation_count
- Parameters:
entry_vertex_index – The start point for which similar feature vectors should be searched
k – The number of similar feature vectors to return
include_entry – If True, the entry vertex is included in the result set.
max_distance_computation_count – Limit the number of distance calculations. If set to 0 this is ignored.