Motivation

A grid based index evenly partitions the space into cells. The extent of each cell $ c$ on each dimension is $ \delta$ . $ c[i, j]$ indicates the cell at column $ i$ and row $ j$ and the lower-left corner cell of the grid is $ c[0, 0]$ . Clearly, point $ p$ with location coordinates ($ p.x,p.y$ ) falls into the cell $ c[\lfloor p.x / \delta \rfloor, \lfloor p.y / \delta \rfloor]$ . An example of grid based index is shown in Figure [*]. Compared with other complicated spatial indexes (e.g., $ R$ -tree) grid based index is simple and can be maintained efficiently in the dynamic environment. The order in which cells of grid are accessed is important for efficient monitoring of the NN queries hence a need is to develop an access method which minimizes the number of cells accessed. The minimum number of cells that are needed to be accessed to answer any $ k$ NN query q are only the cells that lie or intersect the circle with center $ q$ and radius $ q.dist_k$ (the shaded cells in Fig. [*]), where $ q.dist_k$ is the distance of $ k^{th}$ NN from $ q$ . Any algorithm that visits a cell that lies outside the circle, visits a cell which cannot have any result object. In contrast, any algorithm that does not access any cell that lies or intersects the circle may potentially miss a valid result object. To the best of our knowledge, CPM [MHP05] provides the only method that tries to minimize the number of cells accessed. CPM minimizes the number of cells accessed in initial computation of any $ k$ NN query by employing a conceptual partitioning of grid into rectangles and using a heap as discussed in details in Section [*]. However, CPM accesses unnecessary cells during handling of the updates. Moreover, it also needs to store some data structure for each registered query $ q$ .

Figure: Grid Index
\includegraphics[bb=215 341 379 505]{kNN/fig/grid_example.eps}
Figure: Minimal Set of Cells
\includegraphics[bb=215 341 379 505]{kNN/fig/minCells.eps}

CPM partitions the space into conceptual rectangles and the rectangles do not reflect spatial proximity to any point. We note that circle centered at query point represents the better proximity and we use this intuition to develop our grid access methods. We design CircularTrip and ArcTrip that access the cells according to their proximity to $ q$ and minimize the number of cells accessed not only in initial computation of results but also during the updates handling of nearest neighbor queries.

CircularTrip: Given a radius $ r$ , CircularTrip returns all the cells that intersect the circle of radius $ r$ with center at $ q$ .

ArcTrip: Given a radius $ r$ and an angle range $ \langle \theta_1,\theta_2
\rangle$ , ArcTrip returns all the cells that intersect the circle of radius $ r$ with center at $ q$ and lie between angle range $ \langle \theta_1,\theta_2
\rangle$ .

In next chapters, we will show how these access methods can be used to continuously monitor $ k$ NN queries and its variants (i.e., constrained NN queries). We show that by using our access method, these queries can be answered while maintaining the following properties

We also prove that our proposed access methods computes the $ mindist(c,q)$ exactly the same number of times as the number of cells it returns. Below in Table [*] we summarize the math notations used throughout this thesis.


Table: Math Notations
Notation  Definition
 $ p$  a moving data point
 $ q$  the query point
 $ p.x$ , $ p.y$ , $ q.x$ , $ q.y$    the $ x$ -axis ($ y$ -axis) coordinate of $ p$ ($ q$ )
 $ c$ , $ c[i, j]$  a cell (at column $ i$ and row $ j$ )
 $ c^q$  the cell containing $ q$
 $ c.i$ , $ c.j$  column (row) value of cell $ c$
 $ \delta$  cell side length
  $ dist(p,q)$  the distance between $ p$ and $ q$
 $ q.kNN$  the $ k$ NN of $ q$
 $ q.dist_k$  the distance between the $ k$ -th NN and $ q$
  $ mindist(c,q)$ , $ maxdist(c, q)$  the minimum (maximum) distance between $ c$ and $ q$


Muhammad Aamir Cheema 2007-10-11