Weakly-supervised multi-view multi-instance multi-label learning

Yuying Xing, Guoxian Yu, Jun Wang, Carlotta Domeniconi, Xiangliang Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations


Multi-view, Multi-instance, and Multi-label Learning (M3L) can model complex objects (bags) that are represented with different feature views, made up of diverse instances, and annotated with discrete, non-exclusive labels. Existing M3L approaches assume a complete correspondence between bags and views, as well as complete annotations for training. In practice, however, neither the correspondence of bags across views nor the bags' annotations are complete. To tackle this weakly-supervised M3L task, a solution called WSM3L is introduced. WSM3L adapts multimodal dictionary learning to learn a shared dictionary (representational space) across views, together with individual encoding vectors of the bags for each view. The label similarity and feature similarity of the encoded bags are used jointly to match bags across views. In addition, WSM3L replenishes the annotations of a bag based on the annotations of its neighboring bags, and introduces a dispatch and aggregation term to dispatch bag-level annotations to instances and, conversely, to aggregate instance-level annotations back to bags. WSM3L unifies these objectives and processes in a joint objective function to predict instance-level and bag-level annotations in a coordinated fashion, and it further introduces an alternative solution for optimizing the objective function. Extensive experimental results on benchmark datasets show the effectiveness of WSM3L.
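
Two of the ideas named in the abstract, replenishing a bag's annotations from its neighboring bags and aggregating instance-level annotations to the bag level, can be illustrated with a toy sketch. This is not the WSM3L objective or its optimization: the function names, the cosine-similarity neighborhood, the parameter `k`, and the max-over-instances aggregation are all assumptions chosen for illustration, and the encoding vectors are taken as given rather than learned by dictionary learning.

```python
import numpy as np


def replenish_labels(codes, labels, k=2):
    """Fill in missing bag labels from the k most similar bags.

    codes  : (n_bags, n_dims) encoding vectors (assumed to be given)
    labels : (n_bags, n_labels) 0/1 matrix, 0 meaning "absent or unknown"
    Hypothetical helper, not the replenishment term used by WSM3L.
    """
    codes = np.asarray(codes, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Cosine similarity between every pair of bags.
    norms = np.maximum(np.linalg.norm(codes, axis=1, keepdims=True), 1e-12)
    unit = codes / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a bag is not its own neighbor
    # Indices of the k most similar bags for each bag.
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    # Average the neighbors' labels: shape (n_bags, n_labels).
    neighbor_scores = labels[neighbors].mean(axis=1)
    # Keep known positives; elsewhere use the neighbor-supported score.
    return np.maximum(labels, neighbor_scores)


def aggregate_instances(instance_scores):
    """Bag-level score per label as the max over the bag's instances.

    Max aggregation is a common multi-instance assumption, used here
    only as a stand-in for the aggregation direction.
    """
    return np.asarray(instance_scores, dtype=float).max(axis=0)


codes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [[1, 0], [0, 0], [0, 1], [0, 0]]
replenished = replenish_labels(codes, labels, k=1)
bag_scores = aggregate_instances([[0.2, 0.7], [0.9, 0.1]])
```

In this toy input, the two unlabeled bags each sit close to one labeled bag, so with `k=1` they inherit that bag's label; the aggregation call keeps, for each label, the strongest score among the bag's instances.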
Original language: English (US)
Title of host publication: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Publisher: International Joint Conferences on Artificial Intelligence Organization
Number of pages: 7
ISBN (Print): 9780999241165
State: Published - Jul 2020

Bibliographical note

KAUST Repository Item: Exported on 2020-12-22

