Research Topics
My research spans computer vision, deep learning, multi-modal learning, and AI for science, health, and society.
Computer Vision
    Image relighting, image synthesis, 3D reconstruction, scene understanding, human analysis, object detection, context modeling.
    
- Concept-Centric Token Interpretation for Vector-Quantized Generative Models. Forty-second International Conference on Machine Learning (ICML) (2025)
- Towers of Babel: Combining images, language, and 3D geometry for learning multimodal vision. Proceedings of the IEEE/CVF International Conference on Computer Vision (2021)
- Visual chirality. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020)
- Leveraging vision reconstruction pipelines for satellite imagery. Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
Multi-Modal Learning
    Cross-modal retrieval and generation, multi-modal representation learning, VLMs and LLMs.
    
- Concept-Centric Token Interpretation for Vector-Quantized Generative Models. Forty-second International Conference on Machine Learning (ICML) (2025)
- On the opportunities and challenges of foundation models for geospatial artificial intelligence. ACM Transactions on Spatial Algorithms and Systems (2024)