Current focus
Computational genomics and TE data
- Transposable element annotation
- TE taxonomy construction
- ML-ready biological dataset curation
- Computational genomics
- Biological sequence modeling
- Multimodal TE annotation model concept
Computational biology · Computational genomics
Computational Genomics & Biological AI
I work on transposable element annotation and ML-ready genomics datasets, and I am developing a multimodal TE annotation model concept. I am expanding toward physics-informed biological foundation models and virtual cell systems.
Early-career researcher with a background in math-biology, computational genomics, data modeling, and visual communication.
01 · Research focus
My current work is grounded in computational genomics and TE annotation. From this base, I am exploring how biological foundation models can become more mechanistically grounded, physically constrained, and useful for reasoning across biological scales.
Current focus
Expanding toward
02 · Questions I’m exploring
A few questions currently shaping my research direction:
How can genomic sequence modeling connect to cell-level biological representation learning?
How can biological foundation models become more mechanistically grounded rather than purely predictive?
What dynamical, physical, or energy-based constraints are meaningful for biological AI?
How can messy biological annotation systems become model-ready, interpretable structures?
How can sequence, domain, structure, and taxonomy evidence be integrated into annotation models?
03 · Current research projects
Three connected areas: building sound data foundations, developing an interpretable annotation framework, and defining the next research bridge.
Harmonizing TE labels and metadata into a structured dataset for downstream modeling.
Curating and organizing transposable element consensus-sequence data from Repbase and Dfam-derived sources into a cleaner, ML-ready format. The work includes sequence deduplication, label-conflict review, ontology-aware taxonomy mapping, source metadata organization, and hierarchical label harmonization.
Why it matters: This work turns heterogeneous biological database records into a more structured, traceable resource for downstream modeling.
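As an illustration only, the deduplication and label-conflict review steps could be sketched as follows. The record fields (`id`, `source`, `label`, `sequence`), the toy records, and the slash-separated label paths are assumptions made for this example; they are not the project's actual schema or pipeline.

```python
import hashlib
from collections import defaultdict


def normalize(seq: str) -> str:
    """Remove whitespace and uppercase, so trivially different copies compare equal."""
    return "".join(seq.split()).upper()


def seq_key(seq: str) -> str:
    """Stable hash of the normalized sequence, used as the deduplication key."""
    return hashlib.sha256(normalize(seq).encode("ascii")).hexdigest()


def deduplicate(records):
    """Merge records with identical normalized sequences.

    Returns (merged, conflicts); conflicts are merged entries whose
    source records disagree on the label and need manual review.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[seq_key(rec["sequence"])].append(rec)

    merged, conflicts = [], []
    for recs in groups.values():
        entry = {
            "sequence": normalize(recs[0]["sequence"]),
            "ids": [r["id"] for r in recs],               # keeps provenance traceable
            "sources": sorted({r["source"] for r in recs}),
            "labels": sorted({r["label"] for r in recs}),
        }
        merged.append(entry)
        if len(entry["labels"]) > 1:
            conflicts.append(entry)
    return merged, conflicts


def shared_ancestor(labels, sep="/"):
    """Deepest taxonomy prefix shared by all labels (hypothetical path format)."""
    paths = [tuple(p.strip() for p in label.split(sep)) for label in labels]
    prefix = []
    for parts in zip(*paths):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


# Toy records, invented for illustration (not real database entries).
records = [
    {"id": "toy_1", "source": "source_a", "label": "LTR/Gypsy", "sequence": "acgt acgt"},
    {"id": "toy_2", "source": "source_b", "label": "LTR/Copia", "sequence": "ACGTACGT"},
    {"id": "toy_3", "source": "source_a", "label": "DNA/hAT", "sequence": "TTGACA"},
]

merged, conflicts = deduplicate(records)
for entry in conflicts:
    print(entry["ids"], entry["labels"], "shared ancestor:", shared_ancestor(entry["labels"]))
```

This sketch only catches exact duplicates after normalization. Reverse-complement copies and near-identical sequences would need an additional step, and a flagged conflict would still go to manual review rather than being resolved automatically.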
Conceptual architecture for multimodal TE annotation combining sequence, domain, structural, and taxonomy evidence.
Developing a research framework for library-independent transposable element annotation that would integrate genomic sequence representations, protein-domain evidence, structural signals, and hierarchical TE taxonomy. The aim is to explore a path beyond purely library-based annotation toward interpretable, multimodal biological sequence modeling.
Current scope: This is a conceptual research framework and a developing set of modeling ideas, not a completed production system.
A research direction connecting computational genomics to physics-informed biological foundation models.
Exploring how biological foundation models might become more mechanistically grounded through dynamical-systems thinking, energy-dissipation principles, and cross-scale biological representation learning. This direction connects computational genomics with future-facing questions in physics-informed biological AI and virtual cell systems.
Current scope: A future-facing research direction being developed from the concrete base of TE data and genomic sequence modeling.
04 · Cross-domain evidence
Earlier work that shaped how I model data, communicate complex ideas, and organize projects across research, design, and public-facing contexts.
Turning messy data and abstract problems into structured analytical workflows.
Research relevance: Biological datasets are similarly heterogeneous and sensitive to modeling assumptions.
Making complex and public-facing information visible, readable, and persuasive.
Research relevance: Supports research presentations, proposal writing, visual explanation, and outreach.
Coordinating small analytical projects and contributing to public-interest communication.
Research relevance: Supports collaborative research, lab communication, outreach, and project execution.
05 · Selected visual work
A small selection of placeholders for visual communication, data storytelling, and public-facing design work. Final images can replace these without changing the layout.
06 · Education
B.S. Mathematics, Biology Track
Research and coursework in computational biology, molecular genetics, statistics, mathematics, and genomics.
Earlier training included mathematics, modeling, and data analysis.
07 · Contact
Open to research conversations, research-assistant opportunities, and collaborations in computational genomics, AI for biology, biological foundation models, and physics-informed biological modeling.
A PDF version of the research memo will be added before public launch.