Joint Image/Video and Text Analysis
Recent papers at CVPR, ACM MM, and ACM ICMR address the joint analysis of visual and textual data. The CVPR-19 paper focuses on video moment retrieval based on associated captions; the ACM MM-18 paper considers web supervision for retrieval in joint image-text databases with limited labeled data; and the ICMR-18 paper considers cross-modal video-text retrieval and received the Best Paper Award.
Weakly Supervised Video Moment Retrieval from Text Queries, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval, ACM International Conference on Multimedia (ACM MM), 2018.
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval, ACM International Conference on Multimedia Retrieval (ICMR), 2018 (Best Paper).