New dataset opens Estonian soil information for versatile use

A comprehensive database of Estonian soils and a map application have been completed in cooperation between researchers of the University of Tartu and the Estonian University of Life Sciences. The database makes Estonian soil information easily accessible and can be used at any scale, from local farm-level work to national-level big-data statistical analysis and machine-learning models.

“Soil data is possibly the most undervalued and yet complicated type of environmental data there is. The diversity of organic, chemical, living and dead materials that make up a handful of dirt is astounding,” said Alexander Kmoch, Research Fellow in Geoinformatics at the University of Tartu and the leading author of the study.

Estonia has had very detailed soil information available for decades. It is digitally available on the Geoportal of the Estonian Land Board in several formats under a permissive open data license. Its main purposes include land evaluation and assessing potential for agricultural use.
Until now, however, it has not been easy to make much use of the data. One reason for the limited use of the soil map was the way the data was structured in the database. “For each soil unit, a series of complicated text codes and numbers describe very specialized soil type and soil texture, organic layer, rock content, and the potential for agricultural use. Only few experts can interpret that on a field-by-field basis, but it was close to impossible to derive large-scale actionable insights,” said Kmoch.

Researchers of the University of Tartu and the Estonian University of Life Sciences have undertaken the mammoth project of deciphering that information and providing it in an easily readable table-based form, with all the bits and pieces extracted into numbers and categories that are much easier to analyze and apply in a variety of use cases.

The new dataset is called EstSoil-EH, the Estonian soil dataset with ecological and hydrological variables, all derived from the original soil map of Estonia. In addition, the new dataset is enriched with area and percentage information on six simple land-use types: arable, grassland, forest, wetland, urban/built-up and water. It also includes topographical variables, such as slope, for each distinct soil unit.
Furthermore, machine learning was used to complement the new dataset with soil organic carbon estimates. In this way, the dataset opens Estonian soil information to many new specialized use cases, from digital agriculture support to forest management, environmental assessments, biodiversity restoration, eco-tourism and much more.
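
To illustrate what "area and percentage information" per soil unit could look like in practice, here is a minimal Python sketch. The function name, the dictionary keys and the example areas are hypothetical and are not taken from EstSoil-EH. The sketch also assumes that percentages are computed relative to the sum of the six land-use areas, which may differ from how the dataset defines them.

```python
# Hypothetical labels for the six land-use types named in the text.
LAND_USE_TYPES = ("arable", "grassland", "forest", "wetland", "urban", "water")


def land_use_shares(areas_m2):
    """Return the percentage of a soil unit covered by each land-use type.

    `areas_m2` maps a land-use type to its area in square metres;
    missing types are treated as zero.
    """
    total = sum(areas_m2.get(t, 0.0) for t in LAND_USE_TYPES)
    if total == 0:
        # An empty unit has no meaningful shares; report zeros.
        return {t: 0.0 for t in LAND_USE_TYPES}
    return {t: 100.0 * areas_m2.get(t, 0.0) / total for t in LAND_USE_TYPES}


# Made-up soil unit: 3 ha of arable land and 1 ha of forest.
unit = {"arable": 30000.0, "forest": 10000.0}
shares = land_use_shares(unit)
print(shares["arable"], shares["forest"])  # 75.0 25.0
```

Storing both the absolute area and the share lets the same record serve field-level questions (how many hectares) and national-level comparisons (what proportion).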

“Countries like Lithuania and Latvia may have similar historical soil records from the Soviet era that could be turned into value-added datasets by using the same methodology,” said Kmoch.

A peer-reviewed scientific article describing the data processing and validation in detail was published in the journal Earth System Science Data.

The authors would like to pay tribute to Associate Professor Arno Kanal, who passed away and could not see the fruits of his work. The authors would also like to thank all the participants and partners who have provided data samples used for improving the prediction of soil organic carbon.

The new EstSoil-EH dataset is available as open data.

The work is part of a project funded by the Estonian Environmental Investment Centre. This research has also been supported by the Marie Skłodowska-Curie Actions individual fellowship under the Horizon 2020 Programme, the Mobilitas Pluss postdoctoral researcher grant, the Estonian Research Council (ETAg), the European Regional Development Fund (Centre of Excellence EcolChange), and the NUTIKAS programme of the Archimedes Foundation.

Further information:
Alexander Kmoch, Research Fellow in Geoinformatics at the University of Tartu
Alar Astover, Professor of Soil Science at the Estonian University of Life Sciences