Final Report: GSoC 2019
This is the final report for the project "Extending ModelPolisher to a universal model annotation tool", carried out under Google Summer of Code 2019 with the organization NRNB.
ModelPolisher is a model annotation tool originally developed for the BiGG Models Knowledgebase. Annotations enhance the reusability and interoperability of biological models. During this project, I extended the model annotation capabilities of ModelPolisher, updated the MatlabParser (reading models in the COBRA matlab format), containerized ModelPolisher using Docker, added functionality to use AnnotateDB for model annotation, and added a feature to produce a model annotation glossary and CombineArchives as outputs. As part of this work, the SQLite version of the ModelPolisher database backend could be removed.
The project followed the goals and timeline of the project proposal, with one major change: I completed the goal of containerizing the project by the mid-evaluation, and the remaining goals were shifted from the mid-evaluation to the final evaluation.
WORK DETAILS
The following is a list of the issues worked on or solved during the coding periods; detailed steps can be found in the weekly blogs.
First Coding Period
List of issues:
- lightJar, bareJar fetches bigg.zip
- Error in building : setupDB FAILED
- Update ModelPolisher compatibility with openjdk-11
- Update or deprecate matlab parser
- Extend ModelPolisher's annotation capabilities for species and reactions not having BiGGIds but links to other databases' identifiers
- Extend ModelPolisher's annotation capabilities for geneProducts not having BiGGIds
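The last two issues above concern elements that carry no BiGG ID but do link to other databases. The idea can be illustrated with a minimal Python sketch; it is not ModelPolisher's Java implementation. The function name `annotate` and both lookup tables are hypothetical, and the tables hold toy data standing in for real database queries.

```python
# Toy cross-reference table: external identifiers.org URI -> BiGG ID.
# Illustrative data only; a real lookup would query the BiGG database.
XREF_TO_BIGG = {
    "https://identifiers.org/chebi/CHEBI:17634": "glc__D",
    "https://identifiers.org/kegg.compound/C00031": "glc__D",
}

# Toy table: BiGG ID -> annotations known for that entry.
BIGG_ANNOTATIONS = {
    "glc__D": [
        "https://identifiers.org/chebi/CHEBI:17634",
        "https://identifiers.org/kegg.compound/C00031",
    ],
}


def annotate(existing_uris):
    """Infer a BiGG ID from existing annotations and merge in what is known.

    Returns (bigg_id, merged_uris); bigg_id is None when no existing
    annotation maps to a BiGG entry, in which case nothing is added.
    """
    for uri in existing_uris:
        bigg_id = XREF_TO_BIGG.get(uri)
        if bigg_id is None:
            continue
        merged = list(existing_uris)
        candidates = [f"https://identifiers.org/bigg.metabolite/{bigg_id}"]
        candidates += BIGG_ANNOTATIONS.get(bigg_id, [])
        for extra in candidates:
            # Keep existing annotations and avoid duplicates.
            if extra not in merged:
                merged.append(extra)
        return bigg_id, merged
    return None, list(existing_uris)


bigg_id, uris = annotate(["https://identifiers.org/kegg.compound/C00031"])
print(bigg_id, len(uris))
```

In this sketch a species that only carries a KEGG link ends up with its inferred BiGG ID plus the other cross-references known for that entry, while an element with no recognizable link is left unchanged.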
Second Coding Period
List of issues:
- Containerisation of ModelPolisher using Docker.
- Restructure ModelPolisher.
- Use AnnotateDB to annotate species and reactions
- Check if species is BiGG Metabolite, reaction is BiGG Reaction before getting ADB annotations
- Update default parameters for running ModelPolisher.
- ModelPolisher cannot find any relations in ADB.
- Separate instructions for developers and users for running and building ModelPolisher.
Third Coding Period
List of issues:
- Produce glossary from annotated models
- Produce a single COMBINE archive containing the annotated model and glossary
- Add Table of Contents in README
- Add documentation of all command line arguments in README
- Remove SQLite version
- Produce glossary only when BiGG Annotations added
- Production of output fails if `output` location is given as file and file not present
List of commits: https://github.com/draeger-lab/ModelPolisher/commits?author=codekaust
In summary, I have completed all the major goals from my proposal in the above commits, though in the future I would like to continue contributing to ModelPolisher, e.g. implementing annotations from multiple sources based on a single identifier.
DELIVERABLES
This project has resulted in the development of ModelPolisher 2.0, which has the following major updates relative to the previous version, 1.7.
Features and Enhancements:
- Extended Annotation Capabilities: Earlier versions of ModelPolisher could annotate only those model elements that carried a BiGG ID. This project added the ability to annotate elements without a BiGG ID as well.
- Docker Containerisation: Previously, ModelPolisher used the BiGG Models Database as its only resource for annotation. Now, the new project AnnotateDB has been added. To simplify the end user's interaction with the software, ModelPolisher is now available in Docker containers, so the database backend no longer has to be restored from a dump file.
- AnnotateDB Integration: To annotate models, ModelPolisher now uses AnnotateDB, a database containing mappings of annotations found in computational biological models, in addition to BiGGDB.
- Glossary: ModelPolisher 2.0 can now produce glossary files for the models it annotates. This functionality was added during the project.
- CombineArchive Support: ModelPolisher 2.0 produces multiple files (output model, glossary). CombineArchive support was therefore added to deliver the complete output for one model as a single COMBINE archive.
- Improved Documentation: The ModelPolisher documentation has been updated to cover the recent features and now provides better instructions for building and running ModelPolisher.
Version 2.0 of ModelPolisher no longer supports an SQLite database backend.
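To make the CombineArchive feature above more concrete: a COMBINE archive is a ZIP file with a `manifest.xml` at its root that lists every entry and its format. The following is a minimal Python sketch of that layout, not ModelPolisher's own Java code. The file names `model.xml` and `glossary.rdf`, the helper names, and the choice of RDF/XML as the glossary format are assumptions made for illustration.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

OMEX = "http://identifiers.org/combine.specifications/omex"
OMEX_MANIFEST = "http://identifiers.org/combine.specifications/omex-manifest"
SBML = "http://identifiers.org/combine.specifications/sbml"
# Assumption: the glossary is written as an RDF/XML file.
RDF_XML = "http://purl.org/NET/mediatypes/application/rdf+xml"


def build_manifest(entries):
    """Return manifest.xml text for an iterable of (location, format) pairs."""
    # The archive itself and the manifest are listed before the payload files.
    rows = [(".", OMEX), ("./manifest.xml", OMEX_MANIFEST)]
    rows += [("./" + location, fmt) for location, fmt in entries]
    body = "\n".join(
        f"  <content location={quoteattr(loc)} format={quoteattr(fmt)}/>"
        for loc, fmt in rows
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<omexManifest xmlns="{OMEX_MANIFEST}">\n{body}\n</omexManifest>\n'
    )


def write_combine_archive(target, files):
    """Write files (location -> (format URI, bytes)) plus a manifest to target."""
    manifest = build_manifest([(loc, fmt) for loc, (fmt, _) in files.items()])
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for location, (_, data) in files.items():
            archive.writestr(location, data)


# Build an archive in memory with placeholder file contents.
buffer = io.BytesIO()
write_combine_archive(buffer, {
    "model.xml": (SBML, b"<sbml/>"),
    "glossary.rdf": (RDF_XML, b"<rdf:RDF/>"),
})
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    manifest_text = archive.read("manifest.xml").decode("utf-8")
print(names)
```

Passing a file path instead of the in-memory buffer would write the archive to disk; by convention such archives use the `.omex` extension.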
Bug Fixes:
- Updated Matlab Parser: The Matlab Parser used by ModelPolisher to parse `.mat` models was based on the library `com.jmatio`, which is no longer maintained. I updated the Matlab Parser to use HebiRobotics/MFL instead.
EXPERIENCE
I had an amazing experience working on my first major open-source project. I not only learned new technologies such as Docker containerisation but also became familiar with another application of computer science, namely network biology. While working under the guidance of my mentors, I realized that open-source projects can bring together developers from different geographical locations and cultures to build the best of services and products.
Finally, I thank my mentors for their amazing guidance and valuable time. I hope to collaborate with them in the future.
Thank You!