Data requirements

To share biodiversity data effectively, certain requirements need to be met. The data needs to be expressed in a standardized format, and it needs to pass some quality checks. It is also necessary to describe the dataset in a standardized way. Finally, the data provider needs to select an appropriate data sharing license.

Darwin Core for data

Sharing data effectively requires a standardized format. In the biodiversity informatics community, the most important standard is Darwin Core (DwC). Darwin Core is maintained by Biodiversity Information Standards (TDWG), the standardization organization for biodiversity data, and it continues to evolve through community contributions. SBDI data providers are encouraged to participate in the development of the standard.

Our help pages on data standards explain how to structure data according to Darwin Core and link to how-to guides. If you familiarize yourself with the standard, especially its required and strongly recommended fields, before collecting or assembling the data, you can save a lot of work when sharing it.
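
As a rough illustration of what "structured according to Darwin Core" can mean in practice, the sketch below checks a small CSV table of occurrence records for a handful of Darwin Core terms. The list of required terms (occurrenceID, basisOfRecord, scientificName, eventDate) is an assumption based on what is commonly required for occurrence data; the help pages remain the authority on which fields apply to your dataset. The function names and the sample records are made up for this example.

```python
import csv
import io

# Assumed required terms for occurrence data; confirm against the help pages.
REQUIRED_TERMS = ("occurrenceID", "basisOfRecord", "scientificName", "eventDate")


def missing_required_terms(header):
    """Return the required Darwin Core terms absent from a CSV header."""
    present = set(header)
    return [term for term in REQUIRED_TERMS if term not in present]


def rows_with_empty_required(rows):
    """Return 1-based row numbers where any required term is blank."""
    bad = []
    for number, row in enumerate(rows, start=1):
        if any(not (row.get(term) or "").strip() for term in REQUIRED_TERMS):
            bad.append(number)
    return bad


# Made-up sample data: the second record lacks a scientificName.
SAMPLE = """occurrenceID,basisOfRecord,scientificName,eventDate,decimalLatitude,decimalLongitude
obs-001,HumanObservation,Parus major,2021-05-14,59.3293,18.0686
obs-002,HumanObservation,,2021-05-15,57.7089,11.9746
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
header = reader.fieldnames
rows = list(reader)

print(missing_required_terms(header))   # prints []
print(rows_with_empty_required(rows))   # prints [2]
```

Using the Darwin Core term names as column headers from the start means the table needs little or no remapping when it is later published.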

Metadata describing datasets

When sharing data, it is essential to provide a good description of each dataset. A good description makes it possible for others to discover the data and analyze it correctly. Information about datasets is called metadata. The relevant metadata standard is the Ecological Metadata Language (EML). Our help pages on data standards provide more information on metadata and link to how-to guides.

Data quality checks

Depending on the publishing tools you use, the data will have to pass different quality checks. We recommend that you test your data at an early stage using one of the available tools, such as the GBIF Data Validator. Our help pages on data-quality and validation checks provide more information on the available tools and the checks implemented by SBDI publishing tools.
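
Before running a full validator, a few simple checks can catch common problems early. The sketch below is not the set of checks implemented by the GBIF Data Validator or by SBDI publishing tools; it is a hypothetical pre-check that assumes records are dictionaries keyed by the Darwin Core terms decimalLatitude, decimalLongitude, and eventDate. It also assumes a plain YYYY-MM-DD date, which is stricter than Darwin Core itself, where eventDate may also hold date-times or intervals.

```python
from datetime import date


def check_record(record):
    """Return a list of human-readable problems found in one record (a dict)."""
    problems = []

    # Coordinates must be numeric and within the valid geographic range.
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
    except (KeyError, TypeError, ValueError):
        problems.append("coordinates missing or not numeric")
    else:
        if not -90 <= lat <= 90:
            problems.append("decimalLatitude out of range")
        if not -180 <= lon <= 180:
            problems.append("decimalLongitude out of range")

    # Assumption: eventDate is a single calendar day in ISO format.
    try:
        observed = date.fromisoformat(record.get("eventDate") or "")
    except ValueError:
        problems.append("eventDate is not a YYYY-MM-DD date")
    else:
        if observed > date.today():
            problems.append("eventDate is in the future")

    return problems


good = {"decimalLatitude": "59.3293", "decimalLongitude": "18.0686", "eventDate": "2021-05-14"}
bad = {"decimalLatitude": "95", "decimalLongitude": "18", "eventDate": "14/05/2021"}

print(check_record(good))
print(check_record(bad))
```

Returning a list of problems per record, rather than stopping at the first one, makes it easier to fix a whole dataset in a single pass.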

Choosing a data license

Before sharing data through SBDI and GBIF, you need to choose a data publishing license. The biodiversity informatics community uses the Creative Commons licenses. Creative Commons is a nonprofit organization that helps individuals and organizations to overcome legal obstacles to the sharing of knowledge and creativity. Their licenses provide a free, simple, and standardized way to grant copyright permissions, ensure proper attribution, and allow others to copy, distribute, and make use of the data.

SBDI and GBIF require data providers to use one of three open Creative Commons licenses, as explained in our help pages on data licenses.
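
If you manage several datasets, it can help to check the chosen license programmatically. The sketch below assumes that the three accepted options are the ones GBIF lists, namely CC0 1.0, CC BY 4.0, and CC BY-NC 4.0; confirm this against the help pages on data licenses. The identifiers used as dictionary keys and the function name are hypothetical conventions for this example.

```python
# Assumed set of accepted licenses; verify against the help pages on data licenses.
ACCEPTED_LICENSES = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC-BY-NC-4.0": "https://creativecommons.org/licenses/by-nc/4.0/",
}


def license_url(identifier):
    """Return the Creative Commons URL for an accepted license, or None."""
    # Normalize spacing and case so "cc by 4.0" and "CC-BY-4.0" both match.
    key = identifier.strip().upper().replace(" ", "-").replace("_", "-")
    return ACCEPTED_LICENSES.get(key)


print(license_url("CC BY 4.0"))
print(license_url("CC BY-SA 4.0"))
```

A return value of None signals that the license is outside the assumed set and the dataset would need a different license before publishing.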

More info

You can read more about data requirements and the data publishing process on our help pages on data publishing.