Senior Research Scientist on the Surreal team at Meta Reality Labs, working on grounded video understanding for egocentric and long videos, multimodal reasoning, and vision-language research.
My work centers on grounding language in visual data, with a focus on first-person video, long-video understanding, and multimodal systems for richer visual reasoning. More at theshadow29.github.io.
- Researching grounded video understanding for egocentric and long videos on the Surreal team at Meta Reality Labs.
- Building multimodal models and evaluation setups for first-person video, visual grounding, and structured visual reasoning.
- Exploring long-video understanding, streaming multimodal systems, and research tooling that makes these models easier to use.
- Sharing papers, code, and curated resources through projects such as awesome-grounding, VidSitu, and Video-QAP.
Website · Google Scholar · CV · LinkedIn · Twitter/X · Email
