Herbaria worldwide house a treasure of hundreds of millions of specimens, which are increasingly being digitized and thereby made more accessible to the scientific community. At the same time, deep-learning algorithms are rapidly improving pattern recognition from images, and these techniques are increasingly applied to biological objects. In this study, we use digital images of herbarium specimens to identify the taxa and traits of these collection objects by applying convolutional neural networks (CNNs). Images of the 1000 species most frequently documented by herbarium specimens on GBIF were downloaded, combined with morphological trait data, preprocessed, and divided into training and test datasets for species and trait recognition. Good performance in both domains suggests that this approach has substantial potential for supporting taxonomy and natural history collection management. Trait recognition is also promising for applications in functional ecology.
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: SY, MS and SD received funding from the DFG project "Mobilization of trait data from digital image files by deep learning approaches" (grant 316452578). Parts of the work of RH and CW were funded by the National Bioscience Database Center (NBDC) and the Database Center for Life Science (DBCLS) Biohackathon 2017 grants. We gratefully acknowledge the support of NVIDIA Corporation through the donation to CW of the TITAN Xp GPU used for this research.