A cooperative-competitive multi-agent framework for auto-bidding in online advertising

Chao Wen, Miao Xu, Zhilin Zhang, Zhenzhe Zheng*, Yuhui Wang, Xiangyu Liu, Yu Rong, Dong Xie, Xiaoyang Tan, Chuan Yu, Jian Xu, Fan Wu, Guihai Chen, Xiaoqiang Zhu, Bo Zheng

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    7 Scopus citations


    In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works designed auto-bidding tools from a single-agent view, without modeling the mutual influence between agents. In this paper, we instead consider this problem from a distributed multi-agent perspective, and propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. First, we investigate the competition and cooperation relations among auto-bidding agents, and propose a temperature-regularized credit assignment to establish a mixed cooperative-competitive paradigm. By carefully trading off competition and cooperation among agents, we can reach an equilibrium state that guarantees not only each individual advertiser's utility but also the system performance (i.e., social welfare). Second, to avoid potential collusion behaviors of bidding low prices under cooperation, we further propose bar agents that set a personalized bidding bar for each agent, thereby alleviating the revenue degradation caused by cooperation. Third, to deploy MAAB in a large-scale advertising system with millions of advertisers, we propose a mean-field approach. By grouping advertisers with the same objective into a mean auto-bidding agent, the interactions among the large-scale advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on an offline industrial dataset and the Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and revenue.
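    The competition and cooperation trade-off described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the credit assignment defined in the paper: the function names `mixing_weights` and `mixed_rewards`, the one-hot "self" score, and the softmax form are all hypothetical. A low temperature makes each agent train on its own reward (competitive), while a high temperature spreads credit evenly across all agents (cooperative).

```python
import math


def mixing_weights(n_agents, i, temperature):
    """Hypothetical weights agent i places on every agent's reward.

    Softmax over a one-hot 'self' score divided by the temperature
    (an assumed form, chosen only to illustrate the trade-off).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scores = [(1.0 if j == i else 0.0) / temperature for j in range(n_agents)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_rewards(rewards, temperature):
    """Training reward per agent: a weighted sum of all agents' rewards."""
    n = len(rewards)
    return [
        sum(w * r for w, r in zip(mixing_weights(n, i, temperature), rewards))
        for i in range(n)
    ]


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]
    print(mixed_rewards(rewards, temperature=1e-3))  # near-competitive
    print(mixed_rewards(rewards, temperature=1e6))   # near-cooperative
```

    In this sketch, intermediate temperatures interpolate between the two regimes, which is one simple way to read the "mixed cooperative-competitive paradigm" mentioned above.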

    Original language: English (US)
    Title of host publication: WSDM 2022 - Proceedings of the 15th ACM International Conference on Web Search and Data Mining
    Publisher: Association for Computing Machinery, Inc
    Number of pages: 11
    ISBN (Electronic): 9781450391320
    State: Published - Feb 11 2022
    Event: 15th ACM International Conference on Web Search and Data Mining, WSDM 2022 - Virtual, Online, United States
    Duration: Feb 21 2022 → Feb 25 2022

    Publication series

    Name: WSDM 2022 - Proceedings of the 15th ACM International Conference on Web Search and Data Mining


    Conference: 15th ACM International Conference on Web Search and Data Mining, WSDM 2022
    Country/Territory: United States
    City: Virtual, Online

    Bibliographical note

    Funding Information:
    This work was supported in part by the Science and Technology Innovation 2030 – “New Generation Artificial Intelligence” Major Project No. 2018AAA0100905, in part by China NSF grants No. 61902248 and 61976115, in part by Shanghai Science and Technology Fund 20PJ1407900, and in part by Alibaba Group through the Alibaba Innovation Research Program and the Alibaba Research Intern Program. The opinions, findings, conclusions, and recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies or the government. ∗Work done during an internship at Alibaba Group. †Corresponding author.

    Publisher Copyright:
    © 2022 ACM.


    Keywords

    • Auto-bidding
    • Bid optimization
    • E-commerce advertising
    • Multi-agent reinforcement learning

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Computer Science Applications
    • Software

