Week 11

- August 10, 2019

Greetings!

I hope you are in great health. The previous week was very busy because of online tests and interviews during an internship drive at my college, so this week I wanted to complete the next task for this project: producing `glossary` files in `rdf` format, taking inspiration from this file.

To get an idea, please look at this model and its respective glossary file in this folder for non-standardised SBML for BIOMD0000000176.
Here you can observe that all the annotations from the model are extracted and placed as children of a single XMLNode. Each entry in the glossary is linked back to the model through its `rdf:about` attribute, which holds the `metaid` of the corresponding model element.

This glossary file becomes useful when you consider models that are hard to annotate. The same biological model can exist in several formats; some of these are hard to annotate, whereas others (like SBML) can be annotated easily with existing tools such as ModelPolisher. Since every version of the model contains the same elements, annotations made in one format can be reused for another by mapping them through `metaid`s.
Also, a COMBINE Archive can be used to save models in different formats together in a single zip file. For such an archive, it can be convenient to have a single glossary file for all the models, containing the annotations for each element mapped by `metaid`.
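
To make the `metaid` mapping more concrete, here is a minimal sketch of how a tool could read a glossary and look up annotations by `metaid`. It uses only the JDK's DOM parser, not jsbml. It assumes a layout in which each `rdf:Description` carries `rdf:about="#<metaid>"` and lists its resources as `rdf:li` elements; the real glossary files may be organised differently. The class name, the `meta_s1` id and the example URI are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of jsbml or ModelPolisher.
class GlossaryLookupSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns metaid -> list of annotation URIs found in the glossary.
    static Map<String, List<String>> resourcesByMetaid(String glossaryXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(glossaryXml.getBytes(StandardCharsets.UTF_8)));

            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
            for (int i = 0; i < descriptions.getLength(); i++) {
                Element description = (Element) descriptions.item(i);
                // Assumption: rdf:about holds "#" followed by the metaid.
                String about = description.getAttributeNS(RDF_NS, "about");
                String metaid = about.startsWith("#") ? about.substring(1) : about;
                List<String> resources =
                    result.computeIfAbsent(metaid, key -> new ArrayList<>());
                NodeList items = description.getElementsByTagNameNS(RDF_NS, "li");
                for (int j = 0; j < items.getLength(); j++) {
                    Element item = (Element) items.item(j);
                    resources.add(item.getAttributeNS(RDF_NS, "resource"));
                }
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not read glossary", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative glossary with a single, made-up entry.
        String glossary =
            "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\""
            + " xmlns:bqbiol=\"http://biomodels.net/biology-qualifiers/\">"
            + "<rdf:Description rdf:about=\"#meta_s1\"><bqbiol:is><rdf:Bag>"
            + "<rdf:li rdf:resource=\"http://identifiers.org/chebi/CHEBI:15422\"/>"
            + "</rdf:Bag></bqbiol:is></rdf:Description></rdf:RDF>";
        System.out.println(resourcesByMetaid(glossary));
    }
}
```

With a map like this, a model in any other format only needs to expose the same `metaid`s to pick up its annotations.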

So, my target for this week was to strip all the annotations out of a polished model and save them separately in a single glossary file (format: rdf). Initially, I looked at different rdf parsers (like Apache Jena) and went through their documentation to find a suitable one, but it turned out that the output format required for the glossary could easily be produced using an existing class in the jsbml repository: SBMLRDFAnnotationParser.

I was able to figure out the usage of this class but had some problems passing the inputs, as the input was an XMLNode, which I found confusing. Nicolas Rodriguez helped me with the issue over a few meetings, and I was finally able to produce the output rdf by iterating through all the elements in the model, getting the annotations from the first child of the produced XMLNode(s), and adding them as children of a parent XMLNode. The result was then written to a file: model_glossary.rdf.
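
The overall shape of this step (one parent node, one child per annotated element) can be sketched without jsbml. The example below is not the code from the project and does not use SBMLRDFAnnotationParser or XMLNode; it swaps in the JDK's DOM API and takes a plain map from `metaid` to annotation URIs as a stand-in for the model. The class name, the choice of the `bqbiol:is` qualifier, and the sample values are assumptions made for illustration only.

```java
import java.io.StringWriter;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in for the jsbml-based glossary writer.
class GlossarySketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String BQBIOL_NS = "http://biomodels.net/biology-qualifiers/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    static String buildGlossary(Map<String, List<String>> annotationsByMetaid) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Single parent node that will hold every element's annotations.
            Element rdf = doc.createElementNS(RDF_NS, "rdf:RDF");
            rdf.setAttributeNS(XMLNS_NS, "xmlns:rdf", RDF_NS);
            rdf.setAttributeNS(XMLNS_NS, "xmlns:bqbiol", BQBIOL_NS);
            doc.appendChild(rdf);

            for (Map.Entry<String, List<String>> entry : annotationsByMetaid.entrySet()) {
                // One description per model element, keyed by its metaid.
                Element description = doc.createElementNS(RDF_NS, "rdf:Description");
                description.setAttributeNS(RDF_NS, "rdf:about", "#" + entry.getKey());
                Element qualifier = doc.createElementNS(BQBIOL_NS, "bqbiol:is");
                Element bag = doc.createElementNS(RDF_NS, "rdf:Bag");
                for (String uri : entry.getValue()) {
                    Element item = doc.createElementNS(RDF_NS, "rdf:li");
                    item.setAttributeNS(RDF_NS, "rdf:resource", uri);
                    bag.appendChild(item);
                }
                qualifier.appendChild(bag);
                description.appendChild(qualifier);
                rdf.appendChild(description);
            }

            // Serialize as-is; no pretty-printing is applied here.
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build glossary", e);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> annotations = new LinkedHashMap<>();
        // Made-up metaid and URI, standing in for one annotated species.
        annotations.put("meta_s1",
            Collections.singletonList("http://identifiers.org/chebi/CHEBI:15422"));
        System.out.println(buildGlossary(annotations));
    }
}
```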

After successfully producing the glossary rdf, just one problem remained: the produced glossary was not in a tidy format.
To fix this, I took inspiration from TidySBMLWriter and produced the glossary rdf in a tidy format using the jtidy library.
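
For readers who just want to see what "tidying" means here, the sketch below re-indents an XML string with the JDK's built-in `Transformer`. This is a different technique from the jtidy-based approach used in the project and only illustrates the idea. The class and method names are hypothetical, and the `indent-amount` property is specific to the XSLT implementation bundled with the JDK, so it may be ignored or rejected by other implementations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the project itself relies on jtidy instead.
class TidyGlossarySketch {
    static String indent(String xml) {
        try {
            // An identity transform that only changes the output formatting.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // Assumption: the JDK's bundled implementation honours this property.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not re-indent XML", e);
        }
    }

    public static void main(String[] args) {
        // A one-line, made-up glossary fragment to re-indent.
        String untidy =
            "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
            + "<rdf:Description rdf:about=\"#meta_s1\"/></rdf:RDF>";
        System.out.println(indent(untidy));
    }
}
```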

Some minor bug fixes and comments were added along the way, which brought this task to a successful close.

Thank You!

Project-ModelPolisher: GSoC 2019