Investigating Trojan Attacks on Pre-trained Language Model-powered Database Middleware

Peiran Dong, Song Guo, Junxiao Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The recent success of pre-trained language models (PLMs) such as BERT has led to the development of a range of useful database middleware, including natural language query interfaces and entity matching. This shift has been greatly facilitated by the extensive external knowledge of PLMs. However, because PLMs are often provided by untrusted third parties, their lack of standardization and regulation poses significant security risks that have yet to be fully explored. This paper investigates the security threats that malicious PLMs pose to this emerging database middleware. Specifically, we propose a novel type of Trojan attack in which a maliciously designed PLM causes unexpected behavior in the database middleware. These Trojan attacks have the following characteristics: (1) Triggerability: The Trojan-infected database middleware functions normally on normal input, but is likely to malfunction when triggered by the attacker. (2) Imperceptibility: Triggering the Trojan requires no noticeable modification of the input. (3) Generalizability: The Trojan can target a variety of downstream tasks rather than one specific task. We thoroughly evaluate the impact of these Trojan attacks through experiments and analyze potential countermeasures and their limitations. Our findings could aid in the creation of stronger mechanisms for deploying PLMs in database middleware.
Original language: English (US)
Title of host publication: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: ACM
State: Published - Aug 4 2023

Bibliographical note

KAUST Repository Item: Exported on 2023-08-07
Acknowledgements: This research was supported by funding from the Key-Area Research and Development Program of Guangdong Province (No. 2021B0101400003), Hong Kong RGC Research Impact Fund (No. R5060-19), Areas of Excellence Scheme (No. AoE/E-601/22-R), General Research Fund (No. 152203/20E, 152244/21E, 152169/22E), and Shenzhen Science and Technology Innovation Commission (No. JCYJ20200109142008673).
