Senior Research Scientist on the Surreal team at Meta Reality Labs, working on multimodal agents for smart glasses, egocentric and long-video understanding, and vision-language research.
My work focuses on first-person video, long-context understanding, multimodal post-training, and research systems for real-time contextual assistance. More at theshadow29.github.io.
- Researching multimodal agents for smart glasses, with a focus on egocentric and long-video understanding.
- Building multimodal models and post-training setups for instruction following, reasoning, and real-time contextual assistance.
- Working across the full research stack: benchmark design, distributed training, evaluation, low-latency inference, and deployment-facing integration.
- Reading the latest papers and articles on LLMs, VLMs, instruction following, reasoning, and agentic workflows.
Website · Google Scholar · CV · LinkedIn · Twitter/X · Email
