Abstract
|
Recent years have seen the rapid development of peer-to-peer (P2P) networks across a wide variety of applications. However, similarity-based k-nearest-neighbor (k-NN) retrieval remains challenging in P2P networks because of constraints such as dynamic topology and unpredictable data updates. Caching is an attractive solution because it reduces network traffic and could thereby ease these constraints. Traditional caching techniques, however, have three major shortcomings when applied to nearest-neighbor retrieval. First, they rely on exact matching and are therefore unsuitable for approximate and similarity-based queries. Second, they describe cached data in terms of the query context rather than the data content, which leads to inefficient use of cache storage. Third, the description of cached data does not reflect the data's popularity, which makes it inefficient for providing QoS-related services. To support efficient similarity search, we propose a semantic-aware caching scheme (SAC). SAC combines three ideas: 1) describing a collection of data objects with a constraint-based expression that captures their content distribution, 2) adaptive data content management, and 3) non-flooding query processing. By exploiting the content distribution, SAC drastically reduces the cost of similarity-based k-NN retrieval in P2P networks. We evaluate SAC through a simulation study and compare it against several search schemes proposed in the literature.
|