CPM

The best known algorithm for continuous kNN queries is CPM proposed by Mouratidis et al. [MHP05]. It visits minimal set of cells during initial computation. CPM organizes the cells into conceptual rectangles based on their proximity to $ q$ . Each rectangle is defined by a direction and a level number. The direction is U, D, L or R (for up, down, left, and right), and the level number indicates the number of rectangles between the rectangle and $ q$ . Fig. [*](a) illustrates the conceptual partitioning of space for $ q$ around the cell $ c_{(4,4)}$ .

CPM initializes a heap by inserting $ c^q$ , the resident cell of $ q$ with key 0 , and the zero level rectangle for each direction $ DIR$ , with key $ mindist$ ($ DIR_0,q)$ which is the minimum possible distance between the query and the rectangle $ DIR_0$ . Then it starts deheaping the entries. If the deheaped entry is a cell, it examines all the objects inside it and updates the $ q.kNN$ accordingly, the list of $ k$ nearest neighbors. If the deheaped entry is a rectangle $ DIR_{lvl}$ , CPM enheaps (1) each cell $ c\in DIR_{lvl}$ with key $ mindist(c,q)$ and (2) the next level rectangle $ DIR_{lvl+1}$ with key $ mindist(DIR_{lvl+1},q)$ . The algorithm terminates when the next entry in the heap (either a cell or a rectangle) has key greater than or equal to the $ best\_dist$ (the distance of $ k^{th}$ Nearest Neighbor from $ q$ ). It can be easily seen that the algorithm only processes the cells that intersect with the circle centered at $ q$ with radius equal to $ best\_dist$ . This is the minimal set of cells to visit in order to guarantee the correctness. In Fig. [*](a), the shaded cells in light color are those that are visited by the algorithm.

Figure: CPM
\includegraphics[width=6.0in]{background/fig/cpm_initial.eps}

In order to handle the updates efficiently, CPM needs to store (1) visit list that contains the cells visited during NN search (in ascending order with respect to their $ mindist$ from $ q$ ) and (2) search heap $ H$ that contains the entries $ e$ (cells and rectangles) which were enheaped but were not deheaped (for which $ mindist(e,q)\geq best\_dist$ ). The visit list, in our running example of Fig. [*](a), contains the light shaded cells. The search heap $ H$ contains the shaded cells in dark color plus the four boundary rectangles $ U_2, R_1, D_1$ and $ L_2 $ . The influence region of query $ q$ are the cells $ c$ such that $ mindist(c,q)\leq best\_dist$ . In other words, this is the set of cells that intersect the circle centered at $ q$ with radius $ best\_dist$ . Only updates affecting these cells can influence the NN result.

CPM handles the updates as follows: If an object $ p$ enters into the influence region (incoming update), CPM inserts it in $ q.kNN$ and deletes the $ k^{th}$ nearest neighbor. If an object $ p$ leaves the influence region (outgoing update), CPM removes $ p$ from $ q.kNN$ and calls the re-computation module to compute nearest neighbor set. NN re-computation algorithm first starts visiting the cells in visit list of $ q$ and then it continues with the entries of search heap $ H$ in the same way as it does in NN initial computation. The algorithm stops when the next entry in visit list or search heap $ H$ has $ mindist$ greater than or equal to the distance of $ k^{th}$ NN found so far. Note that NN re-computation algorithm is same as initial computation algorithm except that it re-uses the previous information. i.e., It does not need to calculate the $ mindist$ of cells in visit list and it avoids enheap and deheap operations.

Consider the example of Fig. [*](b) where an object $ p_4$ moves to $ p'_4$ (to illustrate clearly, we ignore the update issued by $ p_2$ ). Since this is an incoming update, the object $ p_4$ is added to the $ q.kNN$ and the old $ k^{th}$ NN $ p_2$ is deleted from $ q.kNN$ . Now take the example where $ p_2$ moves to $ p'_2$ (now we ignore the update of $ p_4$ ). Since it is an outgoing update, the object $ p'_2$ is removed from the $ q.kNN$ and the NN re-computation module is invoked.

To handle multiple updates, CPM counts the number of incoming and outgoing updates for each query $ q$ . If incoming updates are greater than outgoing updates, the new nearest neighbors can be found among the incoming objects. Otherwise, re-computation module is called. If query $ q$ changes its position, it is deleted from the query list and added as a new query at new location. Then, it is computed from scratch.

Muhammad Aamir Cheema 2007-10-11