Abstract
In this paper, we perform semantic segmentation of aerial imagery by fusing multi-modal data in an effective way. The multi-modal data consist of a true orthophoto and the corresponding normalized Digital Surface Model (nDSM), which are stacked channel-wise before being fed into a Convolutional Neural Network (CNN). Although the two modalities are fused at this early stage, their features are first learned independently through group convolutions, and the learned features of the two modalities are then fused at multiple scales with standard convolutions. The multi-scale fusion of multi-modal features is thus completed within a single-branch convolutional network, which reduces computational cost while, as the experimental results show, still yielding promising results.
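The core mechanism can be illustrated with a minimal NumPy sketch using 1×1 convolutions: a group convolution (groups = 2) filters the stacked input so that each modality's features are computed independently, and a subsequent standard convolution mixes the two feature sets, i.e., fuses the modalities. All channel widths, the random weights, and the replication of the single nDSM channel to equalize the two group widths are illustrative assumptions, not details of the paper's architecture.

```python
import numpy as np

def conv1x1(x, w):
    """Standard 1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))  # -> (C_out, H, W)

def group_conv1x1(x, weights):
    """Group 1x1 convolution: channels are split evenly into len(weights)
    groups, and each group is filtered by its own weight matrix."""
    chunks = np.split(x, len(weights), axis=0)
    return np.concatenate([conv1x1(c, w) for c, w in zip(chunks, weights)], axis=0)

rng = np.random.default_rng(0)
rgb  = rng.standard_normal((3, 8, 8))   # true orthophoto (3 channels)
ndsm = rng.standard_normal((1, 8, 8))   # nDSM (1 channel)

# Assumption: replicate the nDSM channel so both groups have 3 channels.
x = np.concatenate([rgb, np.repeat(ndsm, 3, axis=0)], axis=0)  # early fusion: (6, 8, 8)

# Stage 1: group convolution -- each modality is filtered independently.
w_rgb  = rng.standard_normal((4, 3))
w_ndsm = rng.standard_normal((4, 3))
feat = group_conv1x1(x, [w_rgb, w_ndsm])  # (8, 8, 8): 4 RGB + 4 nDSM features

# Independence check: perturbing the nDSM channels leaves the RGB features unchanged.
x2 = x.copy()
x2[3:] += 1.0
feat2 = group_conv1x1(x2, [w_rgb, w_ndsm])
assert np.allclose(feat[:4], feat2[:4])

# Stage 2: standard convolution mixes features of both modalities -- the fusion step.
w_fuse = rng.standard_normal((8, 8))
fused = conv1x1(feat, w_fuse)
```

In a real network this two-stage pattern would be repeated at several scales of a single-branch encoder, with spatial kernels and learned weights in place of the 1×1 random matrices used here.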
Original language | English (US) |
---|---|
Title of host publication | International Geoscience and Remote Sensing Symposium (IGARSS) |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 3911-3914 |
Number of pages | 4 |
ISBN (Print) | 9781538691540 |
State | Published - Jul 1 2019 |
Externally published | Yes |