Abstract
Human cognition, which lets people build representations of their surroundings by exploiting multiple senses, has inspired several applications that mimic this remarkable property. The key to learning rich representations of data collected by multiple, diverse sensors is to design generative models that can ingest multimodal inputs and merge them in a common space. Such models make it possible to: i) generate coherent samples for all modalities, ii) perform cross-sensor generation, using available modalities to generate missing ones, and iii) exploit synergy across modalities to increase reconstruction quality. In this work, we study multimodal variational autoencoders and propose new methods for learning a joint representation that can both improve synergy and enable cross generation of missing sensor data. We evaluate these approaches on well-established datasets as well as on a new dataset that involves multimodal object detection with three modalities. Our results shed light on the role of joint posterior modeling and training objectives, indicating that even simple and efficient heuristics enable both synergy and cross generation properties to coexist.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 |
Editors | M. Arif Wani, Ishwar K. Sethi, Weisong Shi, Guangzhi Qu, Daniela Stan Raicu, Ruoming Jin |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1069-1076 |
Number of pages | 8 |
ISBN (Electronic) | 9781665443371 |
State | Published - 2021 |
Event | 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 - Virtual, Online, United States; Duration: Dec 13 2021 → Dec 16 2021 |
Publication series
Name | Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 |
---|---|
Conference
Conference | 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 12/13/21 → 12/16/21 |
Bibliographical note
Publisher Copyright: © 2021 IEEE.
Keywords
- Autoencoder
- Multimodal
- Variational
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Health Informatics
- Artificial Intelligence
- Computer Science Applications