| Abstract Detail
Enhancing Quality and Use of Herbarium Collection Data through Community Data Curation

Miller, Joe [1], Nicolson, Nicky [2], Robertson, Tim [3].

Identifying and clustering duplicate vouchers across herbarium collections using GBIF tools.

Collecting multiple vouchers is a common and important practice in botanical fieldwork. It allows duplicate material to be shared with local herbaria, large global herbaria, and other experts. Because most collection data is not born digital, written collection notes are sent with the physical specimen to each receiving collection. Each collection then incorporates the specimen, and the voucher data usually takes on a life of its own, separate from the other duplicates. When these duplicate specimens are digitized and shared with GBIF, they become complementary data points that were not easily identified as duplicates until now. GBIF has implemented an exploratory feature that clusters occurrences sharing attributes such as similar collector, taxon, identifiers, dates, type status, and locality. The algorithm is particularly useful for clustering herbarium duplicates, which brings together their varied curation histories. The clusters can identify duplicates that have been imaged, georeferenced, re-identified, or sequenced. We will describe the clustering methods with several examples, point to possible future development, and ask for your input on how GBIF should proceed. This talk will also describe the potential cost savings from avoiding duplicated effort.
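The attribute matching described in the abstract can be illustrated with a small sketch. This is not GBIF's clustering algorithm, whose rules the abstract does not specify. It is a minimal example that assumes records carry Darwin Core-style fields (`recordedBy`, `recordNumber`, `eventDate`) and that two vouchers are candidate duplicates when the collector surname, collector number, and date agree after simple normalization. The record identifiers and values are invented for illustration.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and keep only letters and digits, so 'MILLER' matches 'Miller'."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def candidate_key(record):
    """Hypothetical blocking key: collector surname, collector number, date."""
    # Assumes names are written 'Surname, Given', as in the author list above.
    surname = record["recordedBy"].split(",")[0]
    return (
        normalize(surname),
        normalize(record["recordNumber"]),
        record["eventDate"],
    )


def cluster_duplicates(records):
    """Group record ids that share a key; keep groups with two or more members."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[candidate_key(rec)].append(rec["id"])
    return [sorted(ids) for ids in buckets.values() if len(ids) > 1]


# Invented example records held by three different herbaria.
records = [
    {"id": "K:1", "recordedBy": "Miller, J.", "recordNumber": "1234",
     "eventDate": "1998-05-02"},
    {"id": "MO:7", "recordedBy": "MILLER, Joe", "recordNumber": "1234",
     "eventDate": "1998-05-02"},
    {"id": "NY:3", "recordedBy": "Smith, A.", "recordNumber": "88",
     "eventDate": "2001-11-15"},
]

clusters = cluster_duplicates(records)
print(clusters)
```

Exact-key grouping like this misses duplicates whose fields were transcribed differently at each herbarium. A production approach would likely need fuzzier comparison across more attributes, such as taxon, locality, and type status.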
1 - GBIF, GBIF Secretariat, Universitetsparken 15, Copenhagen, 2100, Denmark
2 - Royal Botanic Gardens, Kew, London, TW9 3AE, UK
3 - GBIF Secretariat, Universitetsparken 15, Copenhagen, 2100, Denmark
Keywords: digital data curation Infrastructure.
Presentation Type: Colloquium Presentations
Session: C02, Enhancing Quality and Use of Herbarium Collection Data through Community Data Curation
Location: /
Date: Monday, July 19th, 2021
Time: 12:30 PM (EDT)
Number: C02001
Abstract ID: 322
Candidate for Awards: None |