A Survey of Papers on Visual Reasoning

Introduction

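Visual reasoning is an important research direction that combines computer vision and natural language processing. Below are the related papers and datasets I collected between July 11 and July 22, recorded here for reference.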

Datasets

[1]CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

[2]A Dataset and Architecture for Visual Reasoning with a Working Memory

[3]GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

[4]FigureQA: An Annotated Figure Dataset for Visual Reasoning

[5]RAVEN: A Dataset for Relational and Analogical Visual Reasoning

[6]TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering

https://github.com/YunseokJANG/tgif-qa


Closely Related

[1] Johnson J, Hariharan B, van der Maaten L, et al. CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning.

The CLEVR dataset: http://cs.stanford.edu/people/jcjohns/clevr/

 

[2] Santoro A, Raposo D, Barrett D G T, et al. A simple neural network module for relational reasoning.

 

[3] Hu R, Andreas J, Rohrbach M, et al. Learning to reason: End-to-end module networks for visual question answering.

 

[4] Johnson J, Hariharan B, van der Maaten L, et al. Inferring and Executing Programs for Visual Reasoning.

https://github.com/facebookresearch/clevr-iep

 

[5] Perez E, de Vries H, Strub F, et al. Learning Visual Reasoning Without Strong Priors


[6]A Read-Write Memory Network for Movie Story Understanding(ICCV.2017.80)

https://github.com/seilna/RWMN

 

[7]Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning(CVPR, 2018)

http://openaccess.thecvf.com/content_cvpr_2018/html/Mascharka_Transparency_by_Design_CVPR_2018_paper.html

https://github.com/davidmascharka/tbd-nets

 

https://baijiahao.baidu.com/s?id=1595193639060011837&wfr=spider&for=pc

 

[8]Iterative Visual Reasoning Beyond Convolutions(CVPR, 2018)

http://openaccess.thecvf.com/content_cvpr2018/html/Chen_Iterative_Visual_Reasoning_CVPR_2018_paper.html

https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/79787737

[9]FiLM: Visual Reasoning with a General Conditioning Layer(AAAI, 2018)

[10]Object Level Visual Reasoning in Videos(ECCV, 2018)

 

Related

[1]Compositional Attention Networks for Machine Reasoning(ICLR, 2018)

[2]Learning Visual Reasoning Without Strong Priors

[3]Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding

[4]Dual Attention Networks for Multimodal Reasoning and Matching(CVPR,2017)

[5]An enhanced SSD with feature fusion and visual reasoning for object detection

 

[6]Spatial Knowledge Distillation to Aid Visual Reasoning (WACV,2019)

 

[7]MUREL: Multimodal Relational Reasoning for Visual Question Answering(CVPR,2019)

https://github.com/Cadene/murel.bootstrap.pytorch

 

[8]Explainable and Explicit Visual Reasoning over Scene Graphs(CVPR,2019)


Loosely Related

[1]iVQA: Inverse Visual Question Answering (CVPR.2018.00898)

Study notes: https://blog.csdn.net/jiang6869732/article/details/81392761

 

[2]Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

 

[3]Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering(ICCV.2017.202)

https://github.com/yuzcccc/mfb

 

[4]Visual Question Answering: A Survey of Methods and Datasets

 

[5]Image Captioning with Semantic Attention(CVPR, 2016)

https://github.com/siavash9000/im2txt_demo

Download NLTK data: http://www.nltk.org/data.html

Pretrained models: https://github.com/tensorflow/models

 
