TY - GEN
T1 - Weakly-supervised multi-view multi-instance multi-label learning
AU - Xing, Yuying
AU - Yu, Guoxian
AU - Wang, Jun
AU - Domeniconi, Carlotta
AU - Zhang, Xiangliang
N1 - KAUST Repository Item: Exported on 2020-12-22
PY - 2020/7
Y1 - 2020/7
AB - Multi-view, Multi-instance, and Multi-label Learning (M3L) can model complex objects (bags), which are represented with different feature views, made of diverse instances, and annotated with discrete nonexclusive labels. Existing M3L approaches assume a complete correspondence between bags and views, and also assume a complete annotation for training. However, in practice, neither the correspondence between bags, nor the bags' annotations are complete. To tackle such a weakly-supervised M3L task, a solution called WSM3L is introduced. WSM3L adapts multimodal dictionary learning to learn a shared dictionary (representational space) across views and individual encoding vectors of bags for each view. The label similarity and feature similarity of encoded bags are jointly used to match bags across views. In addition, it replenishes the annotations of a bag based on the annotations of its neighborhood bags, and introduces a dispatch and aggregation term to dispatch bag-level annotations to instances and to reversely aggregate instance-level annotations to bags. WSM3L unifies these objectives and processes in a joint objective function to predict the instance-level and bag-level annotations in a coordinated fashion, and it further introduces an alternative solution for the objective function optimization. Extensive experimental results show the effectiveness of WSM3L on benchmark datasets.
UR - http://hdl.handle.net/10754/666575
UR - https://www.ijcai.org/proceedings/2020/432
UR - http://www.scopus.com/inward/record.url?scp=85097337356&partnerID=8YFLogxK
U2 - 10.24963/ijcai.2020/432
DO - 10.24963/ijcai.2020/432
M3 - Conference contribution
SN - 9780999241165
SP - 3124
EP - 3130
BT - Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
PB - International Joint Conferences on Artificial Intelligence Organization
ER -