Week 2, 3: Work in Background

Greetings!

Since I did not post a blog for week 2, this post covers all the progress made on the project during weeks 2 and 3.

In the first week, I had mostly finished learning PostgreSQL and JDBC, which proved very useful in these two weeks. There were also major discussions about a change of focus for this project, as proposed by Matthias Koenig, which I will cover here.

The milestone for the first evaluation focuses on extending ModelPolisher's annotation capabilities to models and model elements that do not have a proper BiGG Id (issue #39). Clearly, we need some data from such an element to map it to its BiGG Id. It was decided to use the data available in ModelPolisher in the form of annotations and map it to the BiGG Id of the element in question (to the best of my current knowledge, that was the only way of getting the BiGG Id).

For example, the annotations may contain one or more references like http://identifiers.org/chebi/CHEBI:37637, which gives us a data source ('chebi') and an identifier for an element ('CHEBI:37637'). Note that every reference in an annotation must point to the same element.
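
To make the split into data source and identifier concrete, here is a minimal sketch in Java. The class name IdentifiersUri and its fields are hypothetical names chosen for illustration; they are not taken from the ModelPolisher codebase, and the sketch assumes the URI always has the shape identifiers.org/<data source>/<identifier>.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ModelPolisher.
final class IdentifiersUri {
    // ASSUMPTION: URIs look like http(s)://identifiers.org/<dataSource>/<identifier>.
    private static final Pattern PATTERN =
            Pattern.compile("^https?://identifiers\\.org/([^/]+)/(.+)$");

    final String dataSource;
    final String identifier;

    private IdentifiersUri(String dataSource, String identifier) {
        this.dataSource = dataSource;
        this.identifier = identifier;
    }

    // Returns an empty Optional when the string is null or does not match the assumed shape.
    static Optional<IdentifiersUri> parse(String uri) {
        if (uri == null) {
            return Optional.empty();
        }
        Matcher m = PATTERN.matcher(uri.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new IdentifiersUri(m.group(1), m.group(2)));
    }

    public static void main(String[] args) {
        IdentifiersUri ref = parse("http://identifiers.org/chebi/CHEBI:37637").get();
        System.out.println(ref.dataSource + " / " + ref.identifier);
    }
}
```

Returning an Optional keeps malformed references from throwing; the caller can simply skip them and try the next reference of the same element.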

The task at hand was to develop queries that map this data to a BiGG Id, which includes:


  • Finding the differences between the data source string present in the URI and the string in the 'data_source' table of the BiGG database.
  • Formulating queries for species, reactions, gene_products and compartments.
    • After going through the database, it was decided that the BiGG Id can be retrieved for species and reactions, as the 'gene' table of the BiGG database doesn't give BiGG Ids.

Most of week 2 was spent formulating and checking all the queries, as they are fundamental for this feature to work. I tested my queries manually. Also, Thomas has prepared a scrambler that scrambles the BiGG Ids of models; these models can be used once the whole code is implemented in the ModelPolisher main codebase.
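
As a rough illustration of what such a lookup could look like through JDBC, here is a sketch. It is not one of the finalized queries: the table and column names (synonym, data_source, component, reaction, ome_id, data_source_id, bigg_id) are my assumptions about the BiGG schema, and BiggIdLookup is a hypothetical class name.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

// Hypothetical sketch for illustration; not the finalized project queries.
final class BiggIdLookup {

    // ASSUMPTION: "component" (species) and "reaction" tables each have id and bigg_id
    // columns, and "synonym" links to them via ome_id and to "data_source" via data_source_id.
    // A real schema may need extra filtering (for example on a synonym type column).
    static String buildQuery(String entityTable) {
        // Table names cannot be bound as parameters, so only known names are allowed.
        if (!"component".equals(entityTable) && !"reaction".equals(entityTable)) {
            throw new IllegalArgumentException("Unsupported table: " + entityTable);
        }
        return "SELECT e.bigg_id FROM " + entityTable + " e "
                + "JOIN synonym s ON s.ome_id = e.id "
                + "JOIN data_source d ON d.id = s.data_source_id "
                + "WHERE d.bigg_id = ? AND s.synonym = ?";
    }

    // Looks up the first matching BiGG Id for a (data source, identifier) pair.
    static Optional<String> findBiggId(Connection conn, String entityTable,
                                       String dataSource, String identifier)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildQuery(entityTable))) {
            ps.setString(1, dataSource);
            ps.setString(2, identifier);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    return Optional.of(rs.getString(1));
                }
                return Optional.empty();
            }
        }
    }
}
```

The data source and identifier are bound as parameters of a PreparedStatement rather than concatenated into the SQL, so values coming from model annotations cannot alter the query.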

By Friday of the 2nd week, all queries were finalized (see here), and as I had already learned JDBC, I had an idea of the implementation too.

A meeting was held on Friday, which Zachary A. also joined, and in it we discussed Matthias's idea of making a separate database for ModelPolisher that will basically contain the synonym mapping (the most necessary data for annotations). This was meant to make the queries simpler and give us the freedom to add data to the database, and thus produce better annotations. See complete details here. Matthias was going to work on this idea during the 3rd week, so I postponed adding code for the previously formulated queries.

As for the current status of that idea, Matthias has put up the database here: annotatedb. We are currently not using it directly, but it may be included in one of my future milestones, as it's just an extension to ModelPolisher and can probably be used through a REST API.

As the 3rd week started, the work at hand was to solve the issue of updating the MATLAB parser in ModelPolisher. Basically, I had to update the library being used to HebiRobotics/MFL. The changes were to be made in the class COBRAparser.java. I ran into the problem that data could not be retrieved the same way as with the previous library, and the implementations of the data structures were quite different in the two libraries (especially Double and NumericArray). By Tuesday, I had worked out a mapping between the differing data structures but was still facing an issue with reading files (which later proved to be trivial).


As Matthias's idea of a new database was still being developed, I thought it a good time to learn Docker, as we could then containerize the database and ModelPolisher for easy usage. Currently, every user needs to go through the following steps to be able to use ModelPolisher (assuming PostgreSQL is used):
  • Install the correct version of the JDK (<=8), Gradle and PostgreSQL
  • Set up the BiGG db on PostgreSQL (create an empty db with a superuser, restore the dumps into it)
  • Download the codebase and build the correct JAR using the gradle lightJar command
  • Run ModelPolisher from the command line while passing an instance of the PostgreSQL db

These steps can be problematic for users, who may not otherwise need tools like Gradle and PostgreSQL and may face issues installing them. A user may also have a different JDK version on their system. Containerizing ModelPolisher would therefore be a great advantage. See issue #38.

During week 3, I learned Docker. This work belongs more to MS-4, but if the suggestions from Matthias were to be followed, this was the best time to learn it. Also, since I have already learned Docker and worked out the implementation in the abstract, I think this will greatly help in the 3rd evaluation, and I will be able to complete MS-6, which had so far been kept as a challenge.

The meeting for week 3 was held on Friday. Matthias explained his work on annotatedb, and we discussed my plans for containerizing ModelPolisher. It was a short meeting.

I was not able to work this weekend as I had a small cosmetic surgery. Next week I will finish adding the queries to the main codebase of ModelPolisher and solve the MatlabParser issue.

I hope for a great week ahead.

Thank You.
