Tens of millions of images from biological collections have become available online over the last two decades. In parallel, the capabilities of image analysis technologies have increased dramatically, especially those involving machine learning and computer vision. While image analysis has become mainstream in consumer applications, it is still used only on an artisanal basis in the biological collections community, largely because the image corpora are dispersed. Yet there is massive untapped potential for novel applications and research that could be realized if images of collection objects were made accessible in a single corpus. In this paper, we make the case for infrastructure that could support image analysis of collection objects. We show that such infrastructure is entirely feasible and well worth investing in.
Keywords: biodiversity; computer vision; functional traits; machine learning; species identification; specimens.
Quentin Groom, Mathias Dillen, Wouter Addink, Arturo H. H. Ariño, Christian Bölling, Pierre Bonnet, Lorenzo Cecchi, Elizabeth R. Ellwood, Rui Figueira, Pierre-Yves Gagnier, Olwen M. Grace, Anton Güntsch, Helen Hardy, Pieter Huybrechts, Roger Hyam, Alexis A. J. Joly, Vamsi Krishna Kommineni, Isabel Larridon, Laurence Livermore, Ricardo Jorge Lopes, Sofie Meeus, Jeremy A. Miller, Kenzo Milleville, Renato Panda, Marc Pignal, Jorrit Poelen, Blagoj Ristevski, Tim Robertson, Ana C. Rufino, Joaquim Santos, Maarten Schermer, Ben Scott, Katja Chantre Seltmann, Heliana Teixeira, Maarten Trekels, Jitendra Gaikwad.