Traditional multi-view learning methods often rely on two assumptions: (i) the samples in different views are well-aligned, and (ii) their representations obey the same distribution in a latent space. Unfortunately, these two assumptions may be questionable in practice, which limits the application of multi-view learning. In this work, we propose a differentiable hierarchical optimal transport (DHOT) method to mitigate the dependency of multi-view learning on these two assumptions. Given any two views of unaligned multi-view data, the DHOT method calculates the sliced Wasserstein distance between their latent distributions. Based on these sliced Wasserstein distances, the DHOT method further calculates the entropic optimal transport across different views and explicitly indicates the clustering structure of the views. Accordingly, the entropic optimal transport, together with the underlying sliced Wasserstein distances, yields a hierarchical optimal transport distance defined for unaligned multi-view data. This distance serves as the objective function of multi-view learning and leads to a bi-level optimization task. Moreover, our DHOT method treats the entropic optimal transport as a differentiable operator of the model parameters. It accounts for the gradient of the entropic optimal transport in the backpropagation step and thus helps improve the descent direction for the model in the training phase. We demonstrate the superiority of our bi-level optimization strategy by comparing it to the traditional alternating optimization strategy. The DHOT method is applicable to both unsupervised and semi-supervised learning. Experimental results show that our DHOT method is at least comparable to state-of-the-art multi-view learning methods on both synthetic and real-world tasks, especially in challenging scenarios with unaligned multi-view data.
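The two-level construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform marginals, the equal sample size per view, the plain (non-differentiable) Sinkhorn loop, and the grouping of views into two lists are all assumptions made for illustration. The lower level estimates a sliced Wasserstein distance between the samples of two views, and the upper level solves an entropic optimal transport problem whose cost matrix holds those distances.

```python
import numpy as np


def sliced_wasserstein(x, y, n_proj=64, rng=None):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    Assumes x and y are (n, d) sample arrays with the same n; the rows
    do not need to be aligned with each other.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Random unit directions, one per column.
    theta = rng.normal(size=(d, n_proj))
    theta /= np.linalg.norm(theta, axis=0, keepdims=True)
    # In 1-D, optimal transport between equal-size samples matches sorted values.
    px = np.sort(x @ theta, axis=0)
    py = np.sort(y @ theta, axis=0)
    return float(np.mean((px - py) ** 2))


def sinkhorn(cost, eps=0.5, n_iter=200):
    """Entropic OT plan between uniform marginals (illustrative assumption)."""
    m, n = cost.shape
    a = np.full(m, 1.0 / m)
    b = np.full(n, 1.0 / n)
    kernel = np.exp(-cost / eps)
    u = np.ones(m)
    v = np.ones(n)
    for _ in range(n_iter):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def hierarchical_ot(views_a, views_b, eps=0.5):
    """Upper-level entropic OT over a cost matrix of lower-level sliced distances."""
    cost = np.array(
        [[sliced_wasserstein(xa, xb, rng=0) for xb in views_b] for xa in views_a]
    )
    plan = sinkhorn(cost, eps=eps)
    return float(np.sum(plan * cost)), plan
```

In this sketch the returned plan plays the role of the soft correspondence between views: entries with large mass mark pairs of views whose latent samples are close in the sliced Wasserstein sense. Making the plan a differentiable operator of model parameters, as the abstract describes, would additionally require an automatic differentiation framework, which this sketch does not cover.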
|Original language||English (US)|
|Number of pages||14|
|Journal||IEEE Transactions on Pattern Analysis and Machine Intelligence|
|State||Published - Nov 16 2022|
Bibliographical note
KAUST Repository Item: Exported on 2022-11-18
Acknowledgements: Dixin Luo was supported in part by the Beijing Institute of Technology Research Fund Program for Young Scholars (XSQD-202107001) and the project 2020YFF0305200. Hongteng Xu was supported in part by the Beijing Outstanding Young Scientist Program (No. BJJWZYJH012019100020098), the National Natural Science Foundation of China (No. 61832017), the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China.
ASJC Scopus subject areas
- Artificial Intelligence
- Computational Theory and Mathematics
- Applied Mathematics
- Computer Vision and Pattern Recognition