Empowering real-time traffic reporting systems with NLP-Processed social media data

Xiangpeng Wan, Michael C. Lucic, Hakim Ghazzai, Yehia Massoud

Research output: Contribution to journal > Article > peer-review

12 Scopus citations


Current urbanization trends are driving demand for smarter technologies that support a variety of applications in intelligent transportation systems (ITS). Automated crowdsensing provides a strong base for ITS applications by supplying novel and rich data streams for congestion tracking and real-time navigation. Alongside these well-leveraged data streams, drivers and passengers tend to report traffic information on social media platforms. Although such data are abundant, their use in ITS is only now gaining attention. In this article, we develop an automated Natural Language Processing (NLP)-based framework to empower and complement traffic reporting solutions by text mining social media, extracting the desired information, and generating alerts and warnings for drivers. We employ a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) classification model to filter and classify the data. Then, we apply a question-answering model to extract the information that characterizes each reported incident, such as its location, occurrence time, and nature. Afterwards, we convert the collected information into alerts to be integrated into personal navigation assistants. Finally, we compare recently posted incident reports from official authorities and from social media in order to provide a more complete picture of each incident, and we suggest some open research directions.
Original language: English (US)
Pages (from-to): 159-175
Number of pages: 17
Journal: IEEE Open Journal of Intelligent Transportation Systems
State: Published - Jan 1 2020
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2023-09-21

