discussion / Open Source Solutions  / 21 January 2021

Data standards: How can WILDLABS support?

Hi Wildlabbers

WILDLABS has been thinking about data standards practices recently, and I wanted to throw a couple of questions out to our community:

1) What data standards do you use for maintaining and/or sharing data?

2) It's often voiced that we need better / more unified standards for maintaining and sharing data, but are there any specific areas or issues within this broad need you can identify that require resolution?

Any insight would be much appreciated! 

Tatjana




This is a very broad topic, and one that we are also working on. Here are some ideas that could be useful, depending on what you are interested in:

From IT practice there is a common standard for data management (DAMA/DMBOK), which is itself a heavy, ISO-like standard, but there are many introductions to it that are easier to digest, like this one on SlideShare. A well-written and comprehensive book on this is available here, but there will be more lightweight introductions.

A good example of principles for working with data comes from the Responsible Data Forum (they have a fine guide on this), and you could also look here (not only about data).

And there are of course different methods and guidelines in conservation practice for managing your project, in which data collection and analysis are part of the project cycle, like the CMP.

Regards,

Henk

Thanks for your reply, Henk, and also for those resources.

You're right - it is a broad topic. I should outline the context a little:

We often hear from various tech users that there is a need for better or more unified data standards in many forms - ranging from best practices at the design phase to data sharing & management. Some of these problems are specific to a particular tech, and others we've heard voiced again and again for multiple pieces of tech.

WILDLABS is trying to figure out how we can best support our community on this. For example, would it be beneficial for WILDLABS to facilitate working groups on data / conservation tech standards? Or to connect conservation tech users with developers to provide a platform for them to voice their standardisation troubles? Or to act as a repository for methods and data standards? Or, something else entirely?

Looking forward to hearing any insights into how WILDLABS can help our community tackle this issue.

 

From my perspective (n=1) the tech standards are not the issue. Using the bigger platforms like Azure, AWS, etc. is no issue because there are many tech companies that can help you.

The largest issue in innovative tech is that people are often reinventing the wheel in different places all over the world. WILDLABS is doing a great job of preventing that and connecting us. Also, https://www.seafoodandfisheriesemergingtechnology.com/ is a great source on new technology (oceans only, but it could be a source of inspiration to others). There are probably more of these organisations. It would be great if WILDLABS could connect us with more of these initiatives and play a broker role, and in doing so help to standardise data collection methods and standards.

The main issue I have personally is standardisation of indicators: what are good indicators/data in a field of conservation? How do you measure the success of a certain strategy, how costly are indicators, what are the best and easiest methods to collect the data, etc.? When designing a project it would be great if there were an inventory of well-known, validated example indicators you could choose from. Of course every project will have its own more specific indicators, but I am not convinced that every project or strategy is so unique that we have to reinvent the wheel every time. The only example I know is Miradi Share, but there you have to dive into the factors of every project to find the indicators. A more general view on data and methods alone would be more efficient.
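To make the inventory idea a bit more concrete, here is a minimal sketch in Python of what a searchable indicator catalogue could look like. Everything in it is an assumption for illustration: the `Indicator` fields, the 1-3 cost scale, the `find_indicators` helper and the three entries are hypothetical placeholders, not validated indicators and not taken from Miradi Share or any other existing source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """One catalogue entry. All field names are hypothetical."""
    name: str
    strategy: str        # conservation strategy the indicator belongs to
    method: str          # how the data would be collected
    relative_cost: int   # assumed scale: 1 = cheap, 3 = expensive
    validated: bool      # whether the indicator has been reviewed


# Placeholder entries, invented for this example only.
CATALOGUE = [
    Indicator("patrol effort (km per month)", "anti-poaching",
              "ranger GPS tracks", 1, True),
    Indicator("snare encounter rate", "anti-poaching",
              "patrol reports", 2, True),
    Indicator("carcass detection from the air", "anti-poaching",
              "aerial survey", 3, False),
]


def find_indicators(catalogue, strategy, max_cost=3, validated_only=True):
    """Return matching indicators for a strategy, cheapest first."""
    matches = [
        ind for ind in catalogue
        if ind.strategy == strategy
        and ind.relative_cost <= max_cost
        and (ind.validated or not validated_only)
    ]
    return sorted(matches, key=lambda ind: ind.relative_cost)


if __name__ == "__main__":
    for ind in find_indicators(CATALOGUE, "anti-poaching", max_cost=2):
        print(ind.name, "via", ind.method)
```

The point of the sketch is only that cost, method and validation status would live next to each indicator, so a project designer could filter on them directly instead of digging through individual projects.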

Regards,

Henk

 

As someone who has practiced in information technology standards groups and actively worked on data schema standards to facilitate data sharing and even legally valid business transactions (https://www.oasis-open.org/committees/download.php/31222/ENML-1.0-Specification.pdf), my experience indicates it is important that you do the following:

1) Create a small workgroup of the right stakeholders who are committed to the process and results;

2) Pick a schema definition language - JSON-based (such as JSON Schema) or XML-based (such as XML Schema) - to create the standard (JSON is the flavor of the day, but it is possible to do both with the right commitment);

3) Get people trained on the methodology and grammar for creating the schema;

4) Build the right semantic model up to a level of detail that is feasible for implementation within one year - if you go deeper than that the people who need to get things done in the field will move on while the workgroup will still be arguing details;

5) Build the schema; and

6) Build reference tools and applications in at least 2 different programming languages that show the model and schema working.

Without this process, it will be easy to miss the forest by getting lost in the trees (apologies for that).
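As a rough illustration of steps 5 and 6, here is a minimal sketch in Python of a small schema plus a reference validator. It is a hand-rolled check using only the standard library, not JSON Schema or XML Schema themselves, and every field name and rule in `OBSERVATION_SCHEMA` is a hypothetical assumption rather than part of any existing standard.

```python
import json
from datetime import datetime

# Hypothetical schema for a wildlife observation record.
# Field names and limits are illustrative assumptions only.
OBSERVATION_SCHEMA = {
    "species": {"type": str, "required": True},
    "latitude": {"type": float, "required": True, "min": -90.0, "max": 90.0},
    "longitude": {"type": float, "required": True, "min": -180.0, "max": 180.0},
    "observed_at": {"type": str, "required": True, "format": "timestamp"},
    "observer": {"type": str, "required": False},
}


def validate(record, schema=OBSERVATION_SCHEMA):
    """Return a list of problems; an empty list means the record is valid."""
    errors = []
    for name, rule in schema.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"missing required field: {name}")
            continue
        value = record[name]
        # JSON does not distinguish 3 from 3.0, so accept ints as floats.
        if (rule["type"] is float and isinstance(value, int)
                and not isinstance(value, bool)):
            value = float(value)
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above maximum {rule['max']}")
        if rule.get("format") == "timestamp":
            # Checks only what datetime.fromisoformat accepts.
            try:
                datetime.fromisoformat(value)
            except ValueError:
                errors.append(f"{name}: not a valid timestamp")
    # Report unknown fields so producers and consumers stay in sync.
    for name in record:
        if name not in schema:
            errors.append(f"unknown field: {name}")
    return errors


if __name__ == "__main__":
    raw = ('{"species": "Panthera leo", "latitude": -1.95, '
           '"longitude": 34.7, "observed_at": "2021-01-21T09:30:00+00:00"}')
    print(validate(json.loads(raw)))
```

A second implementation of the same checks in another language, run against a shared set of valid and invalid example records, would be one way to show that the schema is specified unambiguously.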

Arshad