Cluster-Based Subscription Matching for Geo-Textual Data Streams

Lisi Chen, Shuo Shang, Kai Zheng, Panos Kalnis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Scopus citations


Geo-textual data that contain spatial, textual, and temporal information are being generated at a very high rate. These geo-textual data cover a wide range of topics. Users may be interested in receiving local popular topics from geo-textual messages. We study the cluster-based subscription matching (CSM) problem. Given a stream of geo-textual messages, we maintain up-to-date clustering results based on a threshold-based online clustering algorithm. Based on the clustering result, we feed subscribers with their preferred geo-textual message clusters according to their specified keywords and location. Moreover, we summarize each cluster by selecting a set of representative messages. The CSM problem considers spatial proximity, textual relevance, and message freshness during the clustering, cluster feeding, and summarization processes. To solve the CSM problem, we propose a novel solution to cluster, feed, and summarize a stream of geo-textual messages efficiently. We evaluate the efficiency of our solution on two real-world datasets and the experimental results demonstrate that our solution is capable of high efficiency compared with baselines.
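
The abstract does not spell out the clustering rule, so the following is only a minimal sketch of what a threshold-based online clustering step over geo-textual messages could look like. It is not the authors' algorithm. Every name here (`Message`, `Cluster`, `similarity`, `assign`), the scoring formula (a weighted sum of spatial proximity and Jaccard keyword overlap, scaled by an exponential freshness decay), and all parameter values (`alpha`, `max_dist`, `decay`, `threshold`) are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Message:
    x: float           # location (hypothetical planar coordinates)
    y: float
    terms: frozenset   # keywords of the message
    time: float        # arrival timestamp


@dataclass
class Cluster:
    x: float           # running centroid of member locations
    y: float
    terms: set         # union of member keywords
    last_update: float # timestamp of the most recent member
    members: list = field(default_factory=list)


def similarity(msg, cluster, alpha=0.5, max_dist=10.0, decay=0.1):
    """Illustrative score combining the three factors named in the abstract."""
    dist = math.hypot(msg.x - cluster.x, msg.y - cluster.y)
    spatial = max(0.0, 1.0 - dist / max_dist)        # spatial proximity in [0, 1]
    union = msg.terms | cluster.terms
    textual = len(msg.terms & cluster.terms) / len(union) if union else 0.0  # Jaccard
    age = max(0.0, msg.time - cluster.last_update)
    freshness = math.exp(-decay * age)               # stale clusters score lower
    return freshness * (alpha * spatial + (1.0 - alpha) * textual)


def assign(msg, clusters, threshold=0.4):
    """Add msg to the best-scoring cluster if it clears the threshold,
    otherwise open a new cluster (assumed rule, not taken from the paper)."""
    best = max(clusters, key=lambda c: similarity(msg, c), default=None)
    if best is not None and similarity(msg, best) >= threshold:
        n = len(best.members)
        best.x = (best.x * n + msg.x) / (n + 1)      # update centroid incrementally
        best.y = (best.y * n + msg.y) / (n + 1)
        best.terms |= msg.terms
        best.last_update = msg.time
        best.members.append(msg)
        return best
    new_cluster = Cluster(msg.x, msg.y, set(msg.terms), msg.time, [msg])
    clusters.append(new_cluster)
    return new_cluster


clusters = []
a = assign(Message(0.0, 0.0, frozenset({"coffee", "sale"}), 0.0), clusters)
b = assign(Message(0.5, 0.0, frozenset({"coffee", "sale"}), 1.0), clusters)
c = assign(Message(50.0, 50.0, frozenset({"concert"}), 2.0), clusters)
```

In this sketch the first two messages are nearby, share keywords, and arrive close in time, so they land in the same cluster; the third is far away with unrelated keywords and opens a new one. A linear scan over all clusters is used only for brevity; an efficient solution of the kind the abstract describes would presumably rely on indexing, which is not modeled here.
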
Original language: English (US)
Title of host publication: 2019 IEEE 35th International Conference on Data Engineering (ICDE)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 12
ISBN (Print): 9781538674741
State: Published - Apr 2019

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work is supported in part by grants awarded by the National Natural Science Foundation of China (NSFC) (Nos. 61832017, 61836007, 61532018).

