Parallel Semantic Trajectory Similarity Join

Lisi Chen, Shuo Shang, Christian S. Jensen, Bin Yao, Panos Kalnis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

45 Scopus citations

Abstract

Matching similar pairs of trajectories, called trajectory similarity join, is a fundamental functionality in spatial data management. We consider the problem of semantic trajectory similarity join (STS-Join). Each semantic trajectory is a sequence of Points-of-interest (POIs) with both location and text information. Thus, given two sets of semantic trajectories and a threshold θ, the STS-Join returns all pairs of semantic trajectories from the two sets with spatio-textual similarity no less than θ. This join targets applications such as term-based trajectory near-duplicate detection, geo-text data cleaning, personalized ridesharing recommendation, keyword-aware route planning, and travel itinerary recommendation.With these applications in mind, we provide a purposeful definition of spatio-textual similarity. To enable efficient STS-Join processing on large sets of semantic trajectories, we develop trajectory pair filtering techniques and consider the parallel processing capabilities of modern processors. Specifically, we present a two-phase parallel search algorithm. We first group semantic trajectories based on their text information. The algorithm's per-group searches are independent of each other and thus can be performed in parallel. For each group, the trajectories are further partitioned based on the spatial domain. We generate spatial and textual summaries for each trajectory batch, based on which we develop batch filtering and trajectory-batch filtering techniques to prune unqualified trajectory pairs in a batch mode. Additionally, we propose an efficient divide-and-conquer algorithm to derive bounds of spatial similarity and textual similarity between two semantic trajectories, which enable us prune dissimilar trajectory pairs without the need of computing the exact value of spatio-textual similarity. Experimental study with large semantic trajectory data confirms that our algorithm of processing semantic trajectory join is capable of outperforming our well-designed baseline by a factor of 8-12.
Original languageEnglish (US)
Title of host publication2020 IEEE 36th International Conference on Data Engineering (ICDE)
PublisherIEEE
Pages997-1008
Number of pages12
ISBN (Print)978-1-7281-2904-4
DOIs
StatePublished - 2020

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01

Fingerprint

Dive into the research topics of 'Parallel Semantic Trajectory Similarity Join'. Together they form a unique fingerprint.

Cite this