Chapter 61
Spatial Searching
Hans Tangelder and Andreas Fabri
61.1 Introduction
The spatial searching package implements exact and approximate
distance browsing by providing implementations of algorithms
supporting
- both nearest and furthest neighbor searching
- both exact and approximate searching
- (approximate) range searching
- (approximate) -nearest and -furthest neighbor searching
- (approximate) incremental nearest and incremental furthest neighbor searching
- query items representing points and spatial objects.
In these searching problems a set of data points in
-dimensional space is given. The points can be represented by
Cartesian coordinates or homogeneous coordinates. These points are
preprocessed into a tree data structure, so that given any
query item the points of can be browsed efficiently. The
approximate spatial searching package is designed for data sets that
are small enough to store the search structure in main memory (in
contrast to approaches from databases that assume that the data reside
in secondary storage).
61.1.1 Neighbor Searching
Spatial searching supports browsing through a collection of
-dimensional spatial objects stored in a spatial data structure on
the basis of their distances to a query object. The query object may
be a point or an arbitrary spatial object, e.g, a -dimensional
sphere. The objects in the spatial data structure are -dimensional
points.
Often the number of the neighbors to be computed is not know
beforehand, e.g., because the number may depend on some properties of
the neighbors (for example when querying for the nearest city to Paris with
population greater than a million) or the distance to the query point.
The convential approach is -nearest neighbor searching that
makes use of a -nearest neighbor algorithm, where is known
prior to the invocation of the algorithm. Hence, the number of
nearest neighbors has to be guessed. If the guess is too large
redundant computations are performed. If the number is too small the
computation has to be reinvoked for a larger number of neighbors,
thereby performing redundant computations. Therefore, Hjaltason and
Samet [HS95] introduced incremental nearest neighbor
searching in the sense that having obtained the nearest
neighbors, the + 1 neighbor can be obtained without having
to calculate the + 1 nearest neighbor from scratch.
Spatial searching typically consists of a preprocessing phase and a
searching phase. In the preprocessing phase one builds a search
structure and in the searching phase one makes the queries. In the
preprocessing phase the user builds a tree data structure
storing the spatial data. In the searching phase the user invokes a
searching method to browse the spatial data.
With relatively minor modifications, nearest neighbor searching
algorithms can be used to find the furthest object from the query
object. Therefore, furthest neighbor searching is also
supported by the spatial searching package.
The execution time for exact neighbor searching can be reduced by
relaxing the requirement that the neighbors should be computed
exactly. If the distances of two objects to the query object are
approximately the same, instead of computing the nearest/furthest
neighbor exactly, one of these objects may be returned as the
approximate nearest/furthest neighbor. I.e., given some non-negative
constant the distance of an object returned as an
approximate -nearest neighbor must not be larger than
, where denotes the distance to the real
nearest neighbor. Similar the distance of an approximate -furthest
neighbor must not be smaller than . Obviously, for
we get the exact result, and the larger is,
the less exact the result.
Neighbor searching is implemented by the following four classes.
The class CGAL::Orthogonal_k_neighbor_search<Traits,
OrthogonalDistance, Splitter, SpatialTree> implements the standard
search strategy for orthogonal distances like the weighted Minkowski
distance. It requires the use of extended nodes in the spatial tree
and supports only neighbor searching for point queries.
The class CGAL::K_neighbor_search<Traits, GeneralDistance,
Splitter, SpatialTree> implements the standard search strategy for
general distances like the Manhattan distance for iso-rectangles.
It does not require the use of extended nodes in the spatia tree and supports
only neighbor searching for queries defined by points or spatial
objects.
The class Orthogonal_incremental_neighbor_search<Traits,
GeneralDistance, Splitter, SpatialTree> implements the incrementral
search strategy for general distances like the weighted Minkowski
distance. It requires the use of extended nodes in the spatial tree
and supports incremental neighbor searching and distance browsing for
point queries.
The class CGAL::Incremental_neighbor_search<Traits,
GeneralDistance, Splitter, SpatialTree> implements the incremental
search strategy for general distances like the Manhattan distance for
iso-rectangles. It does not requires the use of extended nodes in the
spatial tree and supports incremental neighbor searching and distance
browsing for queries defined by points or spatial objects.
61.1.2 Range Searching
Exact range searching and approximate range searching is
supported using exact or fuzzy -dimensional objects enclosing a
region. The fuzziness of the query object is specified by a parameter
denoting a maximal allowed distance to the boundary of a
query object. If the distance to the the boundary is at least
, points inside the object are always reported and points
outside the object are never reported. Points within distance
to the boundary may be or may be not reported. For exact
range searching the fuzziness parameter is set to zero.
The class Kd_tree implements range searching in the method search,
which is a template method with an output iterator and a model of the
concept FuzzyQueryItem as CGAL::Fuzzy_iso_box_d
or CGAL::Fuzzy_sphere_d.
For range searching of large data sets the user may set the parameter bucket_size
used in building the - tree to a large value (e.g. 100),
because in general the query time will be less then using the default value.
61.2 Splitting Rules
Instead of using the default splitting rule Sliding_midpoint described below,
a user may, depending upon the data, select
one from the following splitting rules,
which determine how a separating hyperplane is computed:
- Midpoint_of_rectangle
-
This splitting rule cuts a rectangle through its midpoint orthogonal
to the longest side.
- Midpoint_of_max_spread
-
This splitting rule cuts a rectangle through orthogonal
to the dimension with the maximum point spread .
- Sliding_midpoint
-
This is a modification of the midpoint of rectangle splitting rule.
It first attempts to perform a midpoint of rectangle split as
described above. If data points lie on both sides of the separating
plane the sliding midpoint rule computes the same separator as
the midpoint of rectangle rule. If the data points lie only on one
side it avoids this by sliding the separator, computed by
the midpoint of rectangle rule, to the nearest datapoint.
- Median_of_rectangle
-
The splitting dimension is the dimension of the longest side of the rectangle.
The splitting value is defined by the median of the coordinates of the data points
along this dimension.
- Median_of_max_spread
-
The splitting dimension is the dimension of the longest side of the rectangle.
The splitting value is defined by the median of the coordinates of the data points
along this dimension.
- Fair
-
This splitting rule is a compromise between the median of rectangle
splitting rule and the midpoint of rectangle splitting rule. This
splitting rule maintains an upper bound on the maximal allowed ratio
of the longest and shortest side of a rectangle (the value of this
upper bound is set in the constructor of the fair splitting
rule). Among the splits that satisfy this bound, it selects the one in
which the points have the largest spread. It then splits the points
in the most even manner possible, subject to maintaining the bound on
the ratio of the resulting rectangles.
- Sliding_fair
-
This splitting rule is a compromise between the fair splitting rule
and the sliding midpoint rule. Sliding fair-split is based on the
theory that there are two types of splits that are good: balanced
splits that produce fat rectangles, and unbalanced splits provided the
rectangle with fewer points is fat.
Also, this splitting rule maintains an upper bound on the maximal
allowed ratio of the longest and shortest side of a rectangle (the
value of this upper bound is set in the constructor of the fair
splitting rule). Among the splits that satisfy this bound, it selects
the one one in which the points have the largest spread. It then
considers the most extreme cuts that would be allowed by the aspect
ratio bound. This is done by dividing the longest side of the
rectangle by the aspect ratio bound. If the median cut lies between
these extreme cuts, then we use the median cut. If not, then consider
the extreme cut that is closer to the median. If all the points lie
to one side of this cut, then we slide the cut until it hits the first
point. This may violate the aspect ratio bound, but will never
generate empty cells.
61.3 Example Programs
We give six examples. The first example illustrates k nearest neighbor
searching, and the second example incremental neighbor searching.
The third is an example of approximate furthest neighbor searching
using a -dimensional iso-rectangle as an query object. Approximate
range searching is illustrated by the fourth example. The fifth
example illustrates k neighbour searching for a user defined point
class. The last example shows how to choose another splitting rule in the
- tree that is used as search tree.
61.3.1 Example of K Neighbor Searching
The first example illustrates k neighbor searching with an Euclidean
distance and 2-dimensional points. The generated random
data points are inserted in a search tree. We then initialize
the k neighbor search object with the origin as query. Finally, we
obtain the result of the computation in the form of an iterator
range. The value of the iterator is a pair of a point and its square
distance to the query point. We use square distances, or transformed distances for other distance classes, as they are
computationally cheaper.
// file: examples/Spatial_searching/Nearest_neighbor_searching.C
#include <CGAL/Simple_cartesian.h>
#include <CGAL/point_generators_2.h>
#include <CGAL/Orthogonal_k_neighbor_search.h>
#include <CGAL/Search_traits_2.h>
#include <list>
typedef CGAL::Simple_cartesian<double> K;
typedef K::Point_2 Point_d;
typedef CGAL::Search_traits_2<K> TreeTraits;
typedef CGAL::Orthogonal_k_neighbor_search<TreeTraits> Neighbor_search;
typedef Neighbor_search::Tree Tree;
int main() {
const int N = 1;
std::list<Point_d> points;
points.push_back(Point_d(0,0));
Tree tree(points.begin(), points.end());
Point_d query(0,0);
// Initialize the search structure, and search all N points
Neighbor_search search(tree, query, N);
// report the N nearest neighbors and their distance
// This should sort all N points by increasing distance from origin
for(Neighbor_search::iterator it = search.begin(); it != search.end(); ++it){
std::cout << it->first << " "<< sqrt(it->second) << std::endl;
}
return 0;
}
61.3.2 Example of Incremental Searching
This example program illustrates incremental searching for the closest
point with a positive first coordinate. We can use the orthogonal
incremental neighbor search class, as the query is also a point and as
the distance is the Euclidean distance.
As for the neighbor search, we first initialize the search tree with
the data. We then create the search object, and finally obtain the iterator
with the begin() method. Note that the iterator is of the input
iterator category, that is one can make only one pass over the data.
// file: examples/Spatial_searching/Distance_browsing.C
#include <CGAL/Simple_cartesian.h>
#include <CGAL/Orthogonal_incremental_neighbor_search.h>
#include <CGAL/Search_traits_2.h>
typedef CGAL::Simple_cartesian<double> K;
typedef K::Point_2 Point_d;
typedef CGAL::Search_traits_2<K> TreeTraits;
typedef CGAL::Orthogonal_incremental_neighbor_search<TreeTraits> NN_incremental_search;
typedef NN_incremental_search::iterator NN_iterator;
typedef NN_incremental_search::Tree Tree;
// A functor that returns true, iff the x-coordinate of a dD point is not positive
struct X_not_positive {
bool operator()(const NN_iterator& it) { return ((*it).first)[0]<0; }
};
// An iterator that only enumerates dD points with positive x-coordinate
typedef CGAL::Filter_iterator<NN_iterator, X_not_positive> NN_positive_x_iterator;
int main() {
Tree tree;
tree.insert(Point_d(0,0));
tree.insert(Point_d(1,1));
tree.insert(Point_d(0,1));
tree.insert(Point_d(10,110));
tree.insert(Point_d(45,0));
tree.insert(Point_d(0,2340));
tree.insert(Point_d(0,30));
Point_d query(0,0);
NN_incremental_search NN(tree, query);
NN_positive_x_iterator it(NN.end(), X_not_positive(), NN.begin()), end(NN.end(), X_not_positive());
std::cout << "The first 5 nearest neighbours with positive x-coord are: " << std::endl;
for (int j=0; (j < 5)&&(it!=end); ++j,++it)
std::cout << (*it).first << " at squared distance = " << (*it).second << std::endl;
return 0;
}
61.3.3 Example of General Neighbor Searching
This example program illustrates approximate nearest and furthest
neighbor searching using 4-dimensional Cartesian coordinates. Five
approximate nearest neighbors of the query rectangle
are computed. Because the query object is a rectangle
we cannot use the Orthogonal neighbor search. As in the previous
examples we first initialize a search tree, create the search object
with the query, and obtain the result of the search as iterator range.
// file: examples/Spatial_searching/General_neighbor_searching.C
#include <CGAL/Cartesian_d.h>
#include <CGAL/point_generators_2.h>
#include <CGAL/Manhattan_distance_iso_box_point.h>
#include <CGAL/K_neighbor_search.h>
#include <CGAL/Search_traits_2.h>
typedef CGAL::Cartesian_d<double> K;
typedef K::Point_d Point_d;
typedef CGAL::Random_points_in_square_2<Point_d> Random_points_iterator;
typedef K::Iso_box_d Iso_box_d;
typedef K TreeTraits;
typedef CGAL::Manhattan_distance_iso_box_point<TreeTraits> Distance;
typedef CGAL::K_neighbor_search<TreeTraits, Distance> Neighbor_search;
typedef Neighbor_search::Tree Tree;
int main() {
const int N = 1000;
const int K = 10;
Tree tree;
Random_points_iterator rpg;
for(int i = 0; i < N; i++){
tree.insert(*rpg++);
}
Point_d pp(0.1,0.1);
Point_d qq(0.2,0.2);
Iso_box_d query(pp,qq);
Distance tr_dist;
Neighbor_search N1(tree, query, K, 0.0, false); // eps=10.0, nearest=false
std::cout << "For query rectange = [0.1,0.2]^2 " << std::endl
<< "The " << K << " approximate furthest neighbors are: " << std::endl;
for (Neighbor_search::iterator it = N1.begin();it != N1.end();it++) {
std::cout << " Point " << it->first << " at distance = " << tr_dist.inverse_of_transformed_distance(it->second) << std::endl;
}
return 0;
}
61.3.4 Example of a Range Query
This example program illustrates approximate range querying for
4-dimensional fuzzy iso-rectangles and spheres using homogeneous
coordinates. The range queries are member functions of the -
tree class.
// file: examples/Spatial_searching/Fuzzy_range_query.C
#include <CGAL/Cartesian_d.h>
#include <CGAL/point_generators_d.h>
#include <CGAL/Kd_tree.h>
#include <CGAL/Fuzzy_sphere.h>
#include <CGAL/Fuzzy_iso_box.h>
#include <CGAL/Search_traits_d.h>
typedef CGAL::Cartesian_d<double> K;
typedef K::Point_d Point_d;
typedef CGAL::Search_traits_d<K> Traits;
typedef CGAL::Random_points_in_iso_box_d<Point_d> Random_points_iterator;
typedef CGAL::Counting_iterator<Random_points_iterator> N_Random_points_iterator;
typedef CGAL::Kd_tree<Traits> Tree;
typedef CGAL::Fuzzy_sphere<Traits> Fuzzy_sphere;
typedef CGAL::Fuzzy_iso_box<Traits> Fuzzy_iso_box;
int main() {
const int D = 4;
const int N = 1000;
// generator for random data points in the square ( (-1000,-1000), (1000,1000) )
Random_points_iterator rpit(4, 1000.0);
// Insert N points in the tree
Tree tree(N_Random_points_iterator(rpit,0),
N_Random_points_iterator(N));
// define range query objects
double pcoord[D] = { 300, 300, 300, 300 };
double qcoord[D] = { 900.0, 900.0, 900.0, 900.0 };
Point_d p(D, pcoord, pcoord+D);
Point_d q(D, qcoord, qcoord+D);
Fuzzy_sphere fs(p, 700.0, 100.0);
Fuzzy_iso_box fib(p, q, 100.0);
std::cout << "points approximately in fuzzy range query" << std::endl;
std::cout << "with center (300.0, 300.0, 300.0, 300.0)" << std::endl;
std::cout << "and fuzzy radius <200.0,400.0> are:" << std::endl;
tree.search(std::ostream_iterator<Point_d>(std::cout, "\n"), fs);
std::cout << "points approximately in fuzzy range query ";
std::cout << "[<200,4000>,<800,1000>]]^4 are:" << std::endl;
tree.search(std::ostream_iterator<Point_d>(std::cout, "\n"), fib);
return 0;
}
61.3.5 Example Illustrating Use of User Defined Point and Distance Class
The neighbor searching works with all CGAL kernels, as well as with
user defined points and distance classes.
In this example we assume that the user provides the following 3-dimensional
points class.
struct Point {
double vec[3];
Point() { vec[0]= vec[1] = vec[2] = 0; }
Point (double x, double y, double z) { vec[0]=x; vec[1]=y; vec[2]=z; }
double x() const { return vec[ 0 ]; }
double y() const { return vec[ 1 ]; }
double z() const { return vec[ 2 ]; }
double& x() { return vec[ 0 ]; }
double& y() { return vec[ 1 ]; }
double& z() { return vec[ 2 ]; }
bool operator==(const Point& p) const
{
return (x() == p.x()) && (y() == p.y()) && (z() == p.z()) ;
}
bool operator!=(const Point& p) const { return ! (*this == p); }
}; //end of class
namespace CGAL {
template <>
struct Kernel_traits<Point> {
struct Kernel {
typedef double FT;
typedef double RT;
};
};
}
struct Construct_coord_iterator {
const double* operator()(const Point& p) const
{ return static_cast<const double*>(p.vec); }
const double* operator()(const Point& p, int) const
{ return static_cast<const double*>(p.vec+3); }
};
We have put the glue layer in this file as well, that is a class that allows to
iterate over the Cartesian coordinates of the point, and a class to construct
such an iterator for a point. We next need a distance class
struct Distance {
typedef Point Query_item;
double transformed_distance(const Point& p1, const Point& p2) const {
double distx= p1.x()-p2.x();
double disty= p1.y()-p2.y();
double distz= p1.z()-p2.z();
return distx*distx+disty*disty+distz*distz;
}
template <class TreeTraits>
double min_distance_to_rectangle(const Point& p,
const CGAL::Kd_tree_rectangle<TreeTraits>& b) const {
double distance(0.0), h = p.x();
if (h < b.min_coord(0)) distance += (b.min_coord(0)-h)*(b.min_coord(0)-h);
if (h > b.max_coord(0)) distance += (h-b.max_coord(0))*(h-b.max_coord(0));
h=p.y();
if (h < b.min_coord(1)) distance += (b.min_coord(1)-h)*(b.min_coord(1)-h);
if (h > b.max_coord(1)) distance += (h-b.max_coord(1))*(h-b.min_coord(1));
h=p.z();
if (h < b.min_coord(2)) distance += (b.min_coord(2)-h)*(b.min_coord(2)-h);
if (h > b.max_coord(2)) distance += (h-b.max_coord(2))*(h-b.max_coord(2));
return distance;
}
template <class TreeTraits>
double max_distance_to_rectangle(const Point& p,
const CGAL::Kd_tree_rectangle<TreeTraits>& b) const {
double h = p.x();
double d0 = (h >= (b.min_coord(0)+b.max_coord(0))/2.0) ?
(h-b.min_coord(0))*(h-b.min_coord(0)) : (b.max_coord(0)-h)*(b.max_coord(0)-h);
h=p.y();
double d1 = (h >= (b.min_coord(1)+b.max_coord(1))/2.0) ?
(h-b.min_coord(1))*(h-b.min_coord(1)) : (b.max_coord(1)-h)*(b.max_coord(1)-h);
h=p.z();
double d2 = (h >= (b.min_coord(2)+b.max_coord(2))/2.0) ?
(h-b.min_coord(2))*(h-b.min_coord(2)) : (b.max_coord(2)-h)*(b.max_coord(2)-h);
return d0 + d1 + d2;
}
double new_distance(double& dist, double old_off, double new_off,
int cutting_dimension) const {
return dist + new_off*new_off - old_off*old_off;
}
double transformed_distance(double d) const { return d*d; }
double inverse_of_transformed_distance(double d) { return sqrt(d); }
}; // end of struct Distance
We are ready to put the pices together.
The class Search_traits<..> which you see in the next file is then a mere
wrapper for all these types. The searching itself works exactly as for CGAL kernels.
//file: examples/Spatial_searching/User_defined_point_and_distance.C
#include <CGAL/basic.h>
#include <CGAL/Search_traits.h>
#include <CGAL/point_generators_3.h>
#include <CGAL/Orthogonal_k_neighbor_search.h>
#include "Point.h" // defines types Point, Construct_coord_iterator
#include "Distance.h"
typedef CGAL::Random_points_in_cube_3<Point> Random_points_iterator;
typedef CGAL::Counting_iterator<Random_points_iterator> N_Random_points_iterator;
typedef CGAL::Search_traits<double, Point, const double*, Construct_coord_iterator> Traits;
typedef CGAL::Orthogonal_k_neighbor_search<Traits, Distance> K_neighbor_search;
typedef K_neighbor_search::Tree Tree;
int main() {
const int N = 1000;
const int K = 5;
// generator for random data points in the cube ( (-1,-1,-1), (1,1,1) )
Random_points_iterator rpit( 1.0);
// Insert number_of_data_points in the tree
Tree tree(N_Random_points_iterator(rpit,0),
N_Random_points_iterator(N));
Point query(0.0, 0.0, 0.0);
Distance tr_dist;
// search K nearest neighbours
K_neighbor_search search(tree, query, K);
for(K_neighbor_search::iterator it = search.begin(); it != search.end(); it++){
std::cout << " d(q, nearest neighbor)= "
<< tr_dist.inverse_of_transformed_distance(it->second) << std::endl;
}
// search K furthest neighbour searching, with eps=0, search_nearest=false
K_neighbor_search search2(tree, query, K, 0.0, false);
for(K_neighbor_search::iterator it = search2.begin(); it != search2.end(); it++){
std::cout << " d(q, furthest neighbor)= "
<< tr_dist.inverse_of_transformed_distance(it->second) << std::endl;
}
return 0;
}
61.3.6 Example of Selecting a Splitting Rule and Setting the Bucket Size
This example program illustrates selecting a splitting rule and
setting the maximal allowed bucket size. The only differences with
the first example are the declaration of the Fair
splitting rule, needed to set the maximal allowed bucket size.
// file: examples/Spatial_searching/Using_fair_splitting_rule.C
#include <CGAL/Simple_cartesian.h>
#include <CGAL/point_generators_2.h>
#include <CGAL/Search_traits_2.h>
#include <CGAL/Orthogonal_k_neighbor_search.h>
typedef CGAL::Simple_cartesian<double> R;
typedef R::Point_2 Point_d;
typedef CGAL::Random_points_in_square_2<Point_d> Random_points_iterator;
typedef CGAL::Counting_iterator<Random_points_iterator> N_Random_points_iterator;
typedef CGAL::Search_traits_2<R> Traits;
typedef CGAL::Euclidean_distance<Traits> Distance;
typedef CGAL::Fair<Traits> Fair;
typedef CGAL::Orthogonal_k_neighbor_search<Traits,Distance,Fair> Neighbor_search;
typedef Neighbor_search::Tree Tree;
int main() {
const int N = 1000;
// generator for random data points in the square ( (-1,-1), (1,1) )
Random_points_iterator rpit( 1.0);
Fair fair(5); // bucket size=5
// Insert number_of_data_points in the tree
Tree tree(N_Random_points_iterator(rpit,0),
N_Random_points_iterator(N),
fair);
Point_d query(0,0);
// Initialize the search structure, and search all N points
Neighbor_search search(tree, query, N);
// report the N nearest neighbors and their distance
// This should sort all N points by increasing distance from origin
for(Neighbor_search::iterator it = search.begin(); it != search.end(); ++it){
std::cout << it->first << " "<< sqrt(it->second) << std::endl;
}
return 0;
}
61.4 Software Design
61.4.1 The - tree
Bentley [Ben75] introduced the - tree as a
generalization of the binary search tree in higher dimensions. -
trees hierarchically decompose space into a relatively small number of
rectangles such that no rectangle contains too many input objects.
For our purposes, a rectangle in real dimensional space,
, is the product of closed intervals on the coordinate axes.
- trees are obtained by partitioning point sets in using
(-1)-dimensional hyperplanes. Each node in the tree is split into
two children by one such separating hyperplane. Several splitting
rules (see Section can
be used to compute a seperating (-1)-dimensional hyperplane.
Each internal node of the - tree is associated with a rectangle
and a hyperplane orthogonal to one of the coordinate axis, which
splits the rectangle into two parts. Therefore, such a hyperplane,
defined by a splitting dimension and a splitting value, is called a
separator. These two parts are then associated with the two child
nodes in the tree. The process of partitioning space continues until
the number of data points in the rectangle falls below some given
threshold. The rectangles associated with the leaf nodes are called
buckets, and they define a subdivision of the space into
rectangles. Data points are only stored in the leaf nodes of the
tree, not in the internal nodes.
Friedmann, Bentley and Finkel [FBF77] described the
standard search algorithm to find the th nearest neighbor by
searching a - tree recursively.
When encountering a node of the tree, the algorithm first visits the
child that is closest to the query point. On return, if the rectangle
containing the other child lies within 1/ (1+) times the
distance to the th nearest neighbors so far, then the other child
is visited recursively. Priority search [AM93b] visits the
nodes in increasing order of distance from the queue with help of a
priority queue. The search stops when the distance of the query point
to the nearest nodes exceeds the distance to the nearest point found
with a factor 1/ (1+). Priority search supports next
neighbor search, standard search does not.
In order to speed-up the internal distance computations in nearest
neighbor searching in high dimensional space, the approximate
searching package supports orthogonal distance computation. Orthogonal distance
computation
implements the efficient incremental distance computation technique
introduced by Arya and Mount [AM93a]. This technique
works only for neighbor queries with query items represented as points
and with a quadratic form distance, defined by , where the matrix is positive definite,
i.e. . An important class of quadratic form
distances are weighted Minkowski distances. Given a parameter
and parameters , the weighted Minkowski
distance is defined by for and defined by
. The
Manhattan distance (, ) and the Euclidean distance (,
) are examples of a weighted Minkowski metric.
To speed up distance computations also transformed distances are used
instead of the distance itself. For instance for the Euclidean
distance, to avoid the expensive computation of square roots, squared
distances are used instead of the Euclidean distance itself.