Joint Image/Video and Text Analysis
Recent papers at CVPR, ACM MM, and ACM ICMR address the joint analysis of visual and textual data. The CVPR 2019 paper focuses on video moment retrieval based on associated captions, the ACM MM 2018 paper considers web supervision for retrieval in joint image-text databases with limited labeled data, and the ICMR 2018 paper considers cross-modal video-text retrieval and received the Best Paper Award.
- Weakly Supervised Video Moment Retrieval from Text Queries, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
- Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval, ACM International Conference on Multimedia, 2018.
- Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval, ACM International Conference on Multimedia Retrieval, 2018 (Best Paper).
 
 
 
