Multi-modal learning
Much of the interesting data we encounter in real applications is multi-modal: there exist multiple types of data that reflect the same concept. How can we learn about them?
References
2024
- On the opportunities and challenges of foundation models for geospatial artificial intelligence. ACM Transactions on Spatial Algorithms and Systems, Mar 2024
2021
- Towers of babel: Combining images, language, and 3d geometry for learning multimodal vision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Mar 2021
2017
- Generating holistic 3d scene abstractions for text-based image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Mar 2017