Packaging Ouso's Data

This week I had the opportunity to work with my colleague's data. He created a Datapackage, which I replicated. In doing so, I learned a lot about the Datapackage web interface.

This was possible because Ouso made his data available on GitHub. Using these data, Ouso and his co-authors evaluate the ability of high-resolution melting analysis to identify illegally targeted wildlife species. Analysis of these data is pertinent to informing conservation and illegal trade mitigation efforts.

In fact, Ouso just published his findings based on these data in Nature Scientific Reports!


Congrats Ouso! You can read the article here and learn what their low-cost post-PCR approach could mean for East African forensic science!

Ouso's project data consist of two tabular data files.


1) data on analyzed samples and sequences from GenBank.


2) species identification via three mitochondria markers.

I uploaded these files as separate resources in Frictionless Data's Datapackage Creator. For each resource I specified a name and a path. Because Ouso's files were available online, I pasted links to his data on GitHub rather than uploading files from my computer. One mistake I made at first: I forgot that you have to click the raw data link and copy the raw version of the URL. The Datapackage Creator automatically detected the data's structure and the type of data in each column. I then added a title and description to each column to match Ouso's own JSON file (i.e. the field metadata).
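For anyone who trips over the same raw-link mistake, here is a minimal Python sketch of what clicking the raw link does to a GitHub address. The function name `to_raw_url` and the example repository are hypothetical placeholders of mine, not Ouso's actual repository, and the sketch assumes the usual `github.com/<owner>/<repo>/blob/<branch>/<path>` layout.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url: str) -> str:
    """Turn a GitHub 'blob' page URL into its raw-file URL.

    Hypothetical helper for illustration; assumes the standard
    github.com/<owner>/<repo>/blob/<branch>/<path> layout.
    """
    parts = urlparse(blob_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")

    segments = parts.path.strip("/").split("/")
    # Expected segments: owner, repo, "blob", branch, then the file path.
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("expected a URL of the form .../blob/<branch>/<path>")

    owner, repo, _blob, *rest = segments
    # The raw host drops the "blob" segment and keeps everything else.
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


# Placeholder URL, not a real data file.
example = "https://github.com/example-user/example-repo/blob/main/data/samples.csv"
print(to_raw_url(example))
```

The raw URL points at the file itself instead of at GitHub's HTML page around it, which is presumably why a tool that reads tabular data needs it.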

From Ouso's blog I recalled that the gear wheel (settings) in the resource tab lets you edit each resource's title and provide a description, format, and encoding. I followed his lead and did this for his two data resources; that is to say, I added resource metadata. Because each of my previous Datapackages had included only one resource, the Datapackage metadata more or less matched the resource metadata. With more than one resource, I now see that the distinction between resource metadata and Datapackage metadata becomes increasingly important.
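To make the three levels of metadata concrete, here is a rough sketch of how a two-resource descriptor might be laid out, written as a plain Python dictionary and printed as JSON. The property names (`resources`, `path`, `format`, `encoding`, `schema`, `fields`) follow the Data Package and Table Schema specifications as I understand them; every title, description, column name, and URL below is an invented placeholder, not a value from Ouso's actual package.

```python
import json

# Package-level metadata describes the collection as a whole.
descriptor = {
    "name": "example-package",  # placeholder
    "title": "Example package with two resources",  # placeholder
    "description": "Describes the whole dataset.",  # placeholder
    "resources": [
        {
            # Resource-level metadata describes one file.
            "name": "samples",  # placeholder
            "title": "Analyzed samples",  # placeholder
            "path": "https://raw.githubusercontent.com/example-user/example-repo/main/samples.csv",
            "format": "csv",
            "encoding": "utf-8",
            "schema": {
                "fields": [
                    # Field-level metadata describes one column.
                    {
                        "name": "sample_id",  # placeholder column
                        "type": "string",
                        "title": "Sample ID",
                        "description": "Identifier of the sample.",
                    }
                ]
            },
        },
        {
            "name": "identifications",  # placeholder
            "title": "Species identifications",  # placeholder
            "path": "https://raw.githubusercontent.com/example-user/example-repo/main/identifications.csv",
            "format": "csv",
            "encoding": "utf-8",
            "schema": {
                "fields": [
                    {
                        "name": "species",  # placeholder column
                        "type": "string",
                        "title": "Species",
                        "description": "Species assigned to the sample.",
                    }
                ]
            },
        },
    ],
}

print(json.dumps(descriptor, indent=2))
```

With a single resource, the package title and the resource title tend to say the same thing. With two, each resource needs its own title and description, while the package-level entries describe the collection.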
