About Me

I am a Ph.D. candidate specializing in machine learning, deep learning, and computer vision, with a strong focus on biomedical image analysis. My research develops data-efficient and reliable medical AI through domain adaptation, vision–language models, and active learning—enabling systems that adapt to domain shifts and significantly reduce annotation burdens, particularly for segmentation tasks. I also build segmentation, tracking, and visualization pipelines for high-dimensional microscopy data, and have experience with GANs, GNNs, transformers, multimodal fusion, federated learning, human pose estimation, and spatio-temporal tracking.

Education

Selected Research Projects

Online Domain Adaptive Medical Image Segmentation (ODES)

A source-free, single-pass online domain adaptation framework designed for medical image segmentation under severe domain shift. Integrates expert-guided active learning, noisy pseudo-label pruning, and diversity-aware sample selection to minimize annotation cost while outperforming modern test-time adaptation baselines.
paper - code
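A minimal NumPy sketch of one ingredient mentioned above, diversity-aware sample selection. This is not the published ODES algorithm; the greedy uncertainty-times-distance score is an illustrative choice:

```python
import numpy as np

def select_batch(features, uncertainty, k):
    """Diversity-aware selection sketch: greedily pick samples that are
    both uncertain and far (in feature space) from those already chosen."""
    chosen = [int(np.argmax(uncertainty))]  # seed with the most uncertain sample
    min_dist = np.linalg.norm(features - features[chosen[0]], axis=1)
    while len(chosen) < k:
        score = uncertainty * min_dist      # trade off uncertainty vs. diversity
        score[chosen] = -np.inf             # never re-pick a chosen sample
        nxt = int(np.argmax(score))
        chosen.append(nxt)
        # track each sample's distance to its nearest chosen sample
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(features - features[nxt], axis=1))
    return chosen
```

The distance-tracking step is the same trick used in farthest-point sampling; multiplying in the uncertainty biases the batch toward informative regions.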

Active Learning via Vision–Language Models (LINGUAL)

A language-guided annotation strategy that replaces pixel-level corrections with natural-language instructions. These instructions are translated into executable segmentation refinement programs through large language models, reducing annotation time by ~80% while matching or surpassing state-of-the-art active learning baselines in segmentation quality.
pre-print
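To make the instruction-to-program idea concrete, here is a toy version; the op names, one-pixel morphology, and the example instruction mapping are hypothetical stand-ins for what an LLM would actually emit:

```python
import numpy as np

def grow(mask):
    """Toy refinement op: expand a binary mask by one pixel (4-neighborhood)."""
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]; out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]; out[:, :-1] |= mask[:, 1:]
    return out

def shrink(mask):
    """Toy refinement op: erode a binary mask by one pixel (4-neighborhood)."""
    out = mask.copy()
    out[1:, :] &= mask[:-1, :]; out[:-1, :] &= mask[1:, :]
    out[:, 1:] &= mask[:, :-1]; out[:, :-1] &= mask[:, 1:]
    return out

OPS = {"grow": grow, "shrink": shrink}

def apply_program(mask, program):
    """Run a list of named refinement steps, e.g. what an LLM might emit
    after translating 'expand the mask boundary slightly' -> ['grow']."""
    for step in program:
        mask = OPS[step](mask)
    return mask
```

The point is that the annotator edits in language, and only the interpreter touches pixels.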

Bio-Image Analysis (3D + Time)

Segmentation, tracking, and visualization pipelines for high-dimensional (3D + time) microscopy data.
paper - code

Agentic Medical Report Generation (Ongoing)

Building an agentic report-generation system for chest X-rays that performs stepwise, uncertainty-aware reasoning grounded directly in visual evidence. The pipeline distinguishes ambiguous findings from confident impressions and dynamically adapts its reasoning trajectory to generate more reliable clinical reports.
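A deliberately simplified sketch of the confidence-triage step described above; the threshold value and the (finding, probability) format are illustrative, not the actual pipeline:

```python
def triage_findings(findings, tau=0.8):
    """Split model findings into confident impressions vs. ambiguous ones
    that should be reported with hedged language (tau is illustrative)."""
    confident = [name for name, p in findings if p >= tau]
    ambiguous = [name for name, p in findings if p < tau]
    return confident, ambiguous
```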

Gait Phase Analysis

Developed a novel sEMG-driven gait phase classification algorithm for DNS, UPS, and WAK locomotion modes, targeting robust real-time locomotion analysis.
paper
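For flavor, a common first step in sEMG-based classification is sliding-window feature extraction; this generic sketch (window and hop sizes are illustrative, not the paper's settings) computes RMS features:

```python
import numpy as np

def semg_features(signal, win=200, hop=100):
    """Sliding-window RMS features over a 1-D sEMG channel, a standard
    front end before gait phase classification."""
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        window = signal[start:start + win]
        feats.append(np.sqrt(np.mean(window ** 2)))
    return np.array(feats)
```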

Music-to-Dance Synthesis (CNN–LSTM–MDN + GAN)

A multimodal generative framework that learns music-conditioned human motion using a CNN–LSTM–MDN architecture for rhythmic pose generation and a pix2pixHD GAN for high-fidelity dance synthesis. Demonstrated superior style retention across Ballet, Rumba, Cha-Cha, Tango, and Waltz.
paper - video
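For readers unfamiliar with MDN heads: the network predicts mixture weights, means, and scales per frame, and a pose is drawn by sampling the mixture. A generic sampling sketch, not the project's code:

```python
import numpy as np

def sample_mdn(pi, mu, sigma, rng):
    """Draw one pose vector from a Gaussian mixture (MDN head output).

    pi: (K,) mixture weights summing to 1; mu, sigma: (K, D) per-component
    means and standard deviations of the D-dimensional pose."""
    k = rng.choice(len(pi), p=pi)        # pick a mixture component
    return rng.normal(mu[k], sigma[k])   # sample a pose from that Gaussian
```

Sampling (rather than taking the mean) is what lets an MDN produce varied, non-averaged motion for the same musical input.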

BRIAR

Developed modules for multi-person tracking under occlusion, silhouette segmentation, and GAIT/ReID-based identity recognition within the BRIAR dataset pipeline.
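A schematic of IoU-based track-to-detection association, one building block of occlusion-robust multi-person tracking; greedy matching is shown for brevity (production trackers typically use Hungarian assignment), and the threshold is illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def associate(tracks, detections, tau=0.3):
    """Greedily pair each track with its best unmatched detection above tau."""
    matches, used = {}, set()
    for ti, t in enumerate(tracks):
        best, best_iou = None, tau
        for di, d in enumerate(detections):
            if di in used:
                continue
            s = iou(t, d)
            if s > best_iou:
                best, best_iou = di, s
        if best is not None:
            matches[ti] = best
            used.add(best)
    return matches
```

Tracks left unmatched for several frames are the occlusion cases where ReID/GAIT cues take over.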

Asynchronous Federated Learning

Designed a delay-aware asynchronous FL framework addressing client stragglers via buffer diversification and contribution-weighted aggregation, improving convergence stability under heterogeneous system conditions.
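An illustrative server-side update in the spirit described above; the specific contribution/staleness weighting here is an assumption for the sketch, not the designed framework:

```python
import numpy as np

def aggregate(global_w, updates, base_lr=1.0):
    """Staleness-discounted aggregation sketch: each buffered client update
    (delta, n_samples, staleness) is weighted by its data contribution and
    down-weighted by its delay, so stragglers still help but count less."""
    total = 0.0
    step = np.zeros_like(global_w)
    for delta, n_samples, staleness in updates:
        w = n_samples / (1.0 + staleness)  # illustrative trade-off
        step += w * delta
        total += w
    return global_w + base_lr * step / total
```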

Selected Publications