Posts

Showing posts from July, 2025

GSoC Week3

This week, I mainly focused on writing unit tests for all functions in the ols_fetch_from_github package, refactoring github_file_updater.py, handling the user's choice when there are no updates on GitHub, and enriching the README. Here’s the complete summary of the meeting, with the work update for the week.

Extracted functionality from github_file_updater.py (1306 lines) into separate modules:
- file_comparator.py: File comparison and diff functionality
- file_converter.py: JSON/OBO format conversion logic
- file_downloader.py: File download and HTTP handling
- file_validator.py: File validation and integrity checks
- obo_parser.py: OBO file parsing functionality
- utils.py: Shared utility functions

Let the user choose to upload their own file or use the current version when there are no updates.

Added a configuration management system (config.py).

Testing: implemented full unit test coverage for all main modules:
- Over 170 tests written and passing
- Edge cases and error conditions covered

Added detailed descriptions to README.md: I...
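To illustrate the "own file or current version" choice described above, here is a minimal sketch in Python. The function name `select_source_file` and its parameters are hypothetical and are not taken from the ols_fetch_from_github package; it only shows one way the decision could be expressed.

```python
from pathlib import Path
from typing import Optional


def select_source_file(update_available: bool,
                       downloaded: Path,
                       current: Path,
                       user_file: Optional[Path] = None) -> Path:
    """Pick which file to use (hypothetical helper, for illustration only)."""
    if update_available:
        # A newer version exists on GitHub: use the freshly downloaded file.
        return downloaded
    if user_file is not None:
        # No update, but the user supplied their own file: check that it exists.
        if not user_file.is_file():
            raise FileNotFoundError(f"User file not found: {user_file}")
        return user_file
    # No update and no user file: keep the current version.
    return current
```

In this sketch an available update always takes priority; whether the real package lets a user-supplied file override an update is not stated in the post.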

GSoC week2

Hi,

As discussed in the last blog, I started working on my first milestone: fetching the SBO table from OLS for the SBO annotator. This blog summarizes the meeting, the new features I added, and the issues I worked on and fixed this week to add SBML L3V2 support. So, let's quickly dive into the summary:

1. Change Detection Mechanism for SBO Terms
❓ Issue: The OLS API’s updated field changed, but there were no observable differences in content.
✅ Observation: Compared the OLS (API) and GitHub .owl versions: the content is identical. The latest update timestamp on OLS is 2025, but the GitHub file was last modified in 2023. New terms are already included in the GitHub version.
✅ Decision: We can rely on the GitHub .owl file for consistency and reduced request time.

2. Fetching ‘is_a’ / Parent Info Performance Issue
⏱️ Fetching subclass_of (parent info) via OLS takes ~10 minutes.
⚡ Other fields can be fetched in <1 minute.
✅ Optimization: Parent info should be pulled from ...
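The change-detection observation above (a newer timestamp but identical content) suggests comparing file content rather than update dates. A minimal sketch of that idea, assuming both versions are available as bytes; the helper names `content_digest` and `has_content_changed` are hypothetical and not part of the project:

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the content (illustrative helper)."""
    # Normalise line endings so CRLF/LF differences are not reported as changes.
    normalised = data.replace(b"\r\n", b"\n")
    return hashlib.sha256(normalised).hexdigest()


def has_content_changed(old: bytes, new: bytes) -> bool:
    """True only if the two versions differ in content, ignoring timestamps."""
    return content_digest(old) != content_digest(new)
```

With a check like this, a changed "updated" field alone would not trigger a re-download or re-parse.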