Abstract Detail


F Rodriguez Rodriguez, Ivan Felipe [1], Rose, Jacob A [2], Fel, Thomas [3], Vaishnav, Mohit [4], Wilf, Peter [5], Serre, Thomas [1].

A deep-learning-based approach for automated fossil leaf identification.

Fossil leaf identification represents a compelling use case for machine vision. The leaves’ potential scientific value is tremendous: isolated leaf fossils are very abundant in the field and in museum collections, yet their identification is often problematic. We have developed a deep-learning-based computer-vision system for identifying extant-leaf images to botanical family, using a new image database of cleared, x-rayed, and fossil leaves consisting mostly of angiosperms (see related abstracts at this meeting). Here, we describe novel methods to extend the system to the identification of fossil leaves at the family level. The challenge is that computer-vision systems normally rely on tens of thousands to millions of training images, whereas comparatively few vetted fossil leaf samples are available to train the system; the majority of angiosperm families have no reliable leaf fossils. We therefore developed computer-vision methods to successfully transfer machine knowledge from cleared to fossil leaves. Our approach leverages image-to-image translation methods (conditional CycleGAN) to generate synthetic fossils by learning mappings between one image distribution (cleared leaves) and another (fossil leaves). We use these methods to augment our real-image database with a high quantity and phylogenetic diversity of synthetic samples not available from real fossils alone. We train a deep neural network architecture using both real and synthetic images to learn a joint representation for known families of cleared leaves and fossils. We evaluate the network’s accuracy in multiple scenarios. First, we demonstrate high classification accuracy for cleared leaves. We further find a high (albeit lower) accuracy for real fossil leaves. The lower accuracy is presumably due to the comparatively much smaller number of fossil than cleared-leaf samples, combined with taphonomic signal loss.
We further evaluate the ability of the proposed methods to generalize to real fossils of families for which no real fossil was presented during training. We use a leave-one-family-out cross-validation approach whereby real leaf fossils are used for training for all families but one (i.e., only synthetic samples are available for the test family). We report significantly above-chance classification accuracy in this scenario. We also carry out a study using explainability methods to identify some of the strategies the network uses for classification. Our results strongly suggest that AI methods will provide significant assistance to paleobotanists with the identification of leaf fossils.
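
The leave-one-family-out protocol described above can be illustrated with a minimal sketch. It is not the authors' code: the record fields (`id`, `family`, `domain`), the function name `leave_one_family_out`, and the toy data are hypothetical. It also assumes that cleared and synthetic images of the held-out family stay in the training set, which the abstract implies but does not spell out.

```python
def leave_one_family_out(samples):
    """Yield (held_out_family, train, test) splits.

    Each sample is a dict with hypothetical keys:
      'id'     - any identifier
      'family' - botanical family label
      'domain' - 'cleared', 'fossil' (real), or 'synthetic'
    For each fold, the real fossils of one family form the test set and
    are withheld from training; all other samples remain available.
    """
    fossil_families = sorted(
        {s["family"] for s in samples if s["domain"] == "fossil"}
    )
    for held_out in fossil_families:
        test = [
            s for s in samples
            if s["domain"] == "fossil" and s["family"] == held_out
        ]
        train = [
            s for s in samples
            if not (s["domain"] == "fossil" and s["family"] == held_out)
        ]
        yield held_out, train, test


# Toy data for illustration only; not taken from the authors' database.
toy = [
    {"id": "c1", "family": "Fagaceae", "domain": "cleared"},
    {"id": "s1", "family": "Fagaceae", "domain": "synthetic"},
    {"id": "f1", "family": "Fagaceae", "domain": "fossil"},
    {"id": "c2", "family": "Lauraceae", "domain": "cleared"},
    {"id": "s2", "family": "Lauraceae", "domain": "synthetic"},
    {"id": "f2", "family": "Lauraceae", "domain": "fossil"},
]

for family, train, test in leave_one_family_out(toy):
    print(family, [s["id"] for s in train], [s["id"] for s in test])
```

In each fold the classifier would be trained on `train` and scored only on `test`, so accuracy on the held-out family reflects what was learned from synthetic fossils and cleared leaves alone.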

1 - Brown University, Cognitive, Linguistic & Psychological Sciences, 184 Hope St, Providence, RI, 02912, USA
2 - Brown University, School of Engineering & Cognitive, Linguistic and Psychological Sciences, 184 Hope St, Providence, RI, 02912, USA
3 - Universite de Toulouse, Artificial and Natural Intelligence Toulouse Institute, France
4 - Universite de Toulouse, Artificial and Natural Intelligence Toulouse Institute, Toulouse, France
5 - Pennsylvania State University, Dept. of Geosciences, University Park, PA, 16802, USA

Machine Learning
Deep Learning
Cross Domain Adaptation
Metric Learning

Presentation Type: Oral Paper
Session: PL2, Paleobotany: Cookson Student Presentations - Session II
Date: Monday, July 19th, 2021
Time: 2:00 PM (EDT)
Number: PL2007
Abstract ID: 286
Candidate for Awards: Isabel Cookson Award

Copyright © 2000-2021, Botanical Society of America. All rights reserved