Computer algorithms trained on the images of thousands of preserved plants have learned to automatically identify species that have been pressed, dried and mounted on herbarium sheets, researchers report.
The work, published in BMC Evolutionary Biology on 11 August (ref. 1), is the first attempt to use deep learning, an artificial-intelligence technique that trains neural networks on large, complex data sets, to tackle the difficult taxonomic task of identifying species in natural-history collections.
It’s unlikely to be the last attempt, says palaeobotanist Peter Wilf of Pennsylvania State University in University Park. “This kind of work is the future; this is where we’re going in natural history.”
Natural-history museums around the world are racing to digitize their collections, depositing images of their specimens into open databases that researchers anywhere can rifle through. One data aggregator, the US National Science Foundation’s iDigBio project, boasts more than 150 million images of plants and animals from collections around the country.
There are roughly 3,000 herbaria in the world, hosting an estimated 350 million specimens, only a fraction of which has been digitized. But the swelling data sets, along with advances in computing techniques, enticed computer scientist Erick Mata-Montero of the Costa Rica Institute of Technology in Cartago and botanist Pierre Bonnet of the French Agricultural Research Centre for International Development in Montpellier to see what they could make of the data.
Bonnet’s team had already made progress automating plant identification through…