Parshiv Kapoor
Hi, I'm Parshiv Kapoor, a pre-final-year undergrad at IIT Roorkee, working on turning research into reliable and practical AI systems. My work primarily revolves around medical imaging, robustness and interpretability, reasoning in vision-language models via scene graph generation, and AI agents. Previously, I designed and deployed AI systems during my internship at WhyMinds Global, and I have co-authored papers at AAAI SAPP and ICVGIP.
Recently, I ranked 5th globally in the SeePhys Challenge @ ICML 2025, a large-scale vision-language competition on physics reasoning with 2,000+ multimodal problems and 2,200+ diagrams. My entry was a VLM-based system that combined schematic interpretation with natural language understanding.
Always eager to bridge research and real-world impact through scalable, explainable AI.
I build reliable, practical AI systems: surgical video analysis, medical imaging, and scene-graph reasoning for VLMs.

Highlights
SeePhys @ ICML 2025
Ranked 5th globally among 2,000+ teams with a VLM-based reasoning pipeline combining schematic interpretation and natural language understanding.
Research Paper Acceptances
- 6 Oct 2025 — Paper accepted at ICVGIP 2025 (Indian Conference on Computer Vision, Graphics and Image Processing).
- 21 Oct 2025 — Two papers accepted at AAAI 2025 Student Abstract Track (AAAI Conference on Artificial Intelligence).