Week 1: Build and Understand

Greetings!

Yesterday marked the end of the first week of the coding period. At the start, I was quite overwhelmed by the codebase: for the first time I was working with software built mostly on in-house libraries, and since I am not familiar with bioinformatics, I could not intuitively understand the purpose of various functions, which posed one of the biggest problems.

Here is a summary of the first week's proceedings:

  • Learning PostgreSQL and JDBC: Since ModelPolisher is written in Java and uses the BiGG Database for model polishing, a good understanding of PostgreSQL queries and Java Database Connectivity (JDBC) is essential. This was one of the first tasks I completed this week.
  • Meeting: Discussion on Code: As I was having trouble understanding the purpose of various functions, I asked my mentor, Thomas, for help. Thomas arranged a meeting on 29 May, where the following topics were discussed:
    • Data-flow in ModelPolisher codebase and purpose of various functions in BiGGAnnotations class.
    • Basic understanding of the BiGG Database and the format of biological models (very helpful).
    • Taking the yeast model as an example, we discussed which already-present data can be used to calculate BiGG IDs, which are required in the BiGGAnnotation class.
  • Issues Noticed: While trying to build ModelPolisher and testing it on various models (available here), the following issues were noticed:
    • Gradle build failed at the setupDB task: The cause was the use of the sh command to run ./scripts/configureSQLiteDB.sh. Further details can be found in issue #36.
    • While using OpenJDK 11, ModelPolisher failed to polish some of the models. The likely reason is that the JAXB APIs are considered Java EE APIs and were therefore no longer on the default class path in Java SE 9; in Java 11 they were removed from the JDK entirely. See issue #34 for details.
  • Suggestions From Mentors
    • Getting BiGG IDs from already present annotations:
      Let's say we have this annotation: annotation = http://identifiers.org/chebi/CHEBI:37637
      It tells us that, for the element under consideration, collection = chebi and identifier = CHEBI:37637. We can use this information to look up the BiGG ID via the data_source and synonym tables of the BiGG Database.
      For this example, the following query can be used:
      SELECT c.bigg_id
      FROM component c
      JOIN synonym s ON s.ome_id = c.id
      JOIN data_source d ON d.id = s.data_source_id
      WHERE d.bigg_id = 'chebi' AND s.synonym = 'CHEBI:37637';

      This should work fine for most databases, such as ChEBI and KEGG, but we will still need to check the other databases as well.
    • ModelPolisher database for managing synonyms and mappings between ontologies:
      Mapping ontologies could be imported, for example the one from the BiGG Database synonyms table. This would simplify all the SQL queries and allow a database model suited to the task of ontology and synonym mapping. It would extend the scope of the ModelPolisher tool, but requires changes at the core of ModelPolisher. Further discussion will follow, as this change may lead to divergence; at the moment, at least, we always have BiGG as one curated source of information.
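To make the annotation-to-BiGG-ID idea above concrete, here is a rough sketch in Java of how such a lookup could be wired up with JDBC. This is illustrative only, not actual ModelPolisher code: the class name, the parseAnnotation helper, and the connection URL and credentials are all placeholders I made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BiggIdLookup {

    // Split an identifiers.org URI into {collection, identifier},
    // e.g. http://identifiers.org/chebi/CHEBI:37637 -> {"chebi", "CHEBI:37637"}.
    static String[] parseAnnotation(String uri) {
        String path = uri.replaceFirst("^https?://identifiers\\.org/", "");
        int slash = path.indexOf('/');
        return new String[] { path.substring(0, slash), path.substring(slash + 1) };
    }

    // Run the synonym lookup against the BiGG database using a parameterized
    // query; returns null if no BiGG ID is found for the given annotation.
    static String lookupBiggId(Connection conn, String collection, String identifier)
            throws Exception {
        String sql = "SELECT c.bigg_id FROM component c "
                   + "JOIN synonym s ON s.ome_id = c.id "
                   + "JOIN data_source d ON d.id = s.data_source_id "
                   + "WHERE d.bigg_id = ? AND s.synonym = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, collection);
            ps.setString(2, identifier);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String[] parts = parseAnnotation("http://identifiers.org/chebi/CHEBI:37637");
        System.out.println(parts[0] + " / " + parts[1]);
        // With a live BiGG database, the lookup would run along these lines
        // (the JDBC URL and credentials are placeholders):
        // try (Connection conn = java.sql.DriverManager.getConnection(
        //         "jdbc:postgresql://localhost:5432/bigg", "user", "password")) {
        //     System.out.println(lookupBiggId(conn, parts[0], parts[1]));
        // }
    }
}
```

Using a PreparedStatement rather than string concatenation keeps the query safe against malformed synonym strings and lets the same statement be reused for every annotation in a model.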

At the end of the week I am much more comfortable with the codebase, though the BiGGAnnotation class still needs time for a better understanding. I think my focus now should be on my particular milestones.
I hope for good progress next week.

Thank You.
