Character As Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models

Ziyi Kou, Shichao Pei, Yijun Tian, Xiangliang Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

In this paper, we study a controllable prompt adversarial attacking problem for text guided image generation (Text2Image) models in the black-box scenario, where the goal is to attack specific visual subjects (e.g., changing a brown dog to white) in a generated image by slightly, if not imperceptibly, perturbing the characters of the driven prompt (e.g., “brown” to “br0wn”). Our study is motivated by the limitations of current Text2Image attacking approaches that still rely on manual trials to create adversarial prompts. To address such limitations, we develop CharGrad, a character-level gradient based attacking framework that replaces specific characters of a prompt with pixel-level similar ones by interactively learning the perturbation direction for the prompt and updating the attacking examiner for the generated image based on a novel proxy perturbation representation for characters. We evaluate CharGrad using the texts from two public image captioning datasets. Results demonstrate that CharGrad outperforms existing text adversarial attacking approaches on attacking various subjects of generated images by black-box Text2Image models in a more effective and efficient way with less perturbation on the characters of the prompts.
Original languageEnglish (US)
Title of host publicationIJCAI International Joint Conference on Artificial Intelligence
PublisherInternational Joint Conferences on Artificial Intelligence
Pages983-990
Number of pages8
ISBN (Print)9781956792034
StatePublished - Jan 1 2023
Externally publishedYes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2023-09-20

Fingerprint

Dive into the research topics of 'Character As Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models'. Together they form a unique fingerprint.

Cite this