$ k$ NN Queries in Spatial Networks

Due to the importance of spatial networks in real-life applications, many approaches have been presented to answer spatial queries on spatial networks. In practice, objects can usually only move on a pre-defined set of trajectories as specified by underlying network (road, railway, river etc.). Thus, the importance measure is the network distance, i.e. the length of the shortest trajectory connecting two objects, rather than their Euclidean distance. Since the most popular application of such queries is finding $ k$ closest objects on a road-network, such queries are also called $ k$ NN queries on road-network. Continuous monitoring of $ k$ NN queries on road-networks has also been studied extensively.

Jensen et al. [JKPT03] formalize the problem of $ k$ NN search in road-networks and present a system prototype for such queries. Papadias et al. [PZMT03] describe a framework that integrates network and Euclidean information, and answers various spatial queries including $ k$ NN and range queries. They index the data objects with an R-tree and utilize connectivity and location information to guide the search. Their approach is based on generating a search region for the query point that expands from the query. The advantages of their approach are: 1) it offers a method that finds the exact distance in networks, and 2) the architecture can support other spatial queries like closest pair and e-join queries. Since the number of links and nodes that are retrieved and examined are inversely proportional to cardinality ratio of entities and number of nodes in the network, the main disadvantage of this approach is a dramatic degradation in performance when the above cardinality ratio is less than 10% which is the usual case for real world scenarios (real data sets representing the road network and different type on entities in the State of California show that this cardinality ratio is usually between 0.04% and 3% [KS04a]).

Kolahdouzan et al. [KS04a] proposed a new approach called $ VN^3$ for NN queries in spatial network databases which precalculates the Network Voronoi Polygons (NVPs) and some network distances. $ VN^3$ is based on the properties of the network Voronoi diagrams. The intuition is that the NVPs can directly be used to find the first nearest neighbor of a query $ q$ . Subsequently, the adjacency information of NVPs can be utilized to provide a candidate set for other nearest neighbors of $ q$ . Though their approach outperforms INE [PZMT03], but for higher densities of objects it suffers from computational overhead of precalculating NVPs.

A faster but approximate $ k$ NN results are retrieved in [KS04b] by an embedding technique that approximates the network distance with computationally simple functions. Given a user-specified trajectory, a path $ k$ NN query retrieves the $ k$ NNs at any point in the trajectory. Assume that we know the $ k+1$ NNs at some point $ o$ in the query trajectory, and that their network distance from $ o$ is $ dist_i$ , for $ i=1,...,k+1$ . Let $ \delta$ be the smallest difference between the distances of two consecutive NNs, i.e; $ \delta=min_{1\leq i\leq k}\{dist_{i+1} - dist_i\}$ . Shahabi et al.[SKS02] propose a path NN algorithm based on the observation that any point in the query trajectory within distance $ \delta/2$ from $ o$ has exactly same NN set as $ o$ .

Cho and Chung [CC05] solve the problem by retrieving $ k$ NN sets of all network nodes in the query path and uniting them with the data objects falling in the path. It can be observed that the resulting set contains $ k$ NNs of any point in the query trajectory. Hu et al. [HLX06] proposed a new approach that simplifies the network by replacing the graph topology with a set of interconnected tree-based structures called SPIE's. For each SPIE, they build an index and the algorithm computes $ k$ NNs by searching SPIE on a predetermined path to avoid costly network expansions.

Mouratidis et al. [MYPM06] address the continuous monitoring of $ k$ NN queries in road-networks with arbitrary object and query moving patterns. Their approach also supports the case when the edge weights of underlying network may change at any time. They present two approaches. The first one maintains the query results by processing only updates that may invalidate the current NN sets. The second method follows the shared execution paradigm. More specifically, it groups together the queries that fall in the path between two consecutive intersections in the networks, and produces the results by monitoring the NN sets of these intersections.

Muhammad Aamir Cheema 2007-10-11