This column collects computer vision papers. Date: July 30, 2021. Source: Paper Digest.
1, TITLE: Efficient Human Pose Estimation By Maximizing Fusion and High-Level Spatial Attention
AUTHORS: Zhiyuan Ren ; Yaohai Zhou ; Yizhe Chen ; Ruisong Zhou ; Yayu Gao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an efficient human pose estimation network -- SFM (slender fusion model) by fusing multi-level features and adding lightweight attention blocks -- HSA (High-Level Spatial Attention).
2, TITLE: Sign and Search: Sign Search Functionality for Sign Language Lexica
AUTHORS: Manolis Fragkiadakis ; Peter van der Putten
CATEGORY: cs.CV [cs.CV, cs.IR, cs.MM, I.4.9; I.5.4; H.3.3]
HIGHLIGHT: By extracting different body joints combinations (upper body, dominant hand's arm and wrist) using the pose estimation framework OpenPose, we compare four techniques (PCA, UMAP, DTW and Euclidean distance) as distance metrics between 20 query signs, each performed by eight participants on a 1200 sign lexicon.
3, TITLE: Mapping Vulnerable Populations with AI
AUTHORS: BENJAMIN KELLENBERGER et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this project we aim to automate building footprint and function mapping using heterogeneous data sources.
4, TITLE: Discovering 3D Parts from Image Collections
AUTHORS: Chun-Han Yao ; Wei-Chih Hung ; Varun Jampani ; Ming-Hsuan Yang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we tackle the problem of 3D part discovery from only 2D image collections.
5, TITLE: United We Learn Better: Harvesting Learning Improvements From Class Hierarchies Across Tasks
AUTHORS: Sindi Shkodrani ; Yu Wang ; Marco Manfredi ; Nóra Baka
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work we establish a theoretical framework based on probability and set theory for extracting parent predictions and a hierarchical loss that can be used across tasks, showing results across classification and detection benchmarks and opening up the possibility of hierarchical learning for sigmoid-based detection architectures.
6, TITLE: What Does TERRA-REF's High Resolution, Multi Sensor Plant Sensing Public Domain Data Offer The Computer Vision Community?
AUTHORS: DAVID LEBAUER et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: A core objective of the TERRA-REF project was to generate an open-access reference dataset for the study of evaluation of sensing technology to study plants under field conditions.
7, TITLE: Underwater Inspection and Intervention Dataset
AUTHORS: TOMASZ LUCZYNSKI et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a novel dataset for the development of visual navigation and simultaneous localisation and mapping (SLAM) algorithms as well as for underwater intervention tasks.
8, TITLE: Machine Learning Advances Aiding Recognition and Classification of Indian Monuments and Landmarks
AUTHORS: ADITYA JYOTI PAUL et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.CY, cs.LG, eess.IV]
HIGHLIGHT: This paper serves as a survey of the research endeavors undertaken in this direction which would eventually provide insights for building an automated decision system that could be utilized to make the experience of tourism in India more modernized for visitors.
9, TITLE: Semi-Supervised Active Learning with Temporal Output Discrepancy
AUTHORS: Siyu Huang ; Tianyang Wang ; Haoyi Xiong ; Jun Huan ; Dejing Dou
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Inspired by the fact that the samples with higher loss are usually more informative to the model than the samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
10, TITLE: Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows
AUTHORS: Tom Wehrbein ; Marco Rudolph ; Bodo Rosenhahn ; Bastian Wandt
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses.
11, TITLE: The Need and Status of Sea Turtle Conservation and Survey of Associated Computer Vision Advances
AUTHORS: Aditya Jyoti Paul
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, eess.IV]
HIGHLIGHT: The Need and Status of Sea Turtle Conservation and Survey of Associated Computer Vision Advances
12, TITLE: Fully-Automatic Pipeline for Document Signature Analysis to Detect Money Laundering Activities
AUTHORS: Nikhil Woodruff ; Amir Enshaei ; Bashar Awwad Shiekh Hasan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose an integrated pipeline of signature extraction and curation, with no human assistance from the obtaining of company documents to the clustering of individual signatures.
13, TITLE: From Continuity to Editability: Inverting GANs with Consecutive Images
AUTHORS: Yangyang Xu ; Yong Du ; Wenpeng Xiao ; Xuemiao Xu ; Shengfeng He
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we resolve this paradox by introducing consecutive images (e.g., video frames or the same person with different poses) into the inversion process.
14, TITLE: FREE: Feature Refinement for Generalized Zero-Shot Learning
AUTHORS: SHIMING CHEN et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a simple yet effective GZSL method, termed feature refinement for generalized zero-shot learning (FREE), to tackle the above problem.
15, TITLE: Probabilistic and Geometric Depth: Detecting Objects in Perspective
AUTHORS: Tai Wang ; Xinge Zhu ; Jiangmiao Pang ; Dahua Lin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, we construct geometric relation graphs across predicted objects and use the graph to facilitate depth estimation.
16, TITLE: Similarity and Symmetry Measures Based on Fuzzy Descriptors of Image Objects' Composition
AUTHORS: Marcin Iwanowski ; Marcin Grzabka
CATEGORY: cs.CV [cs.CV, 03B52, 94A08, I.4.8; I.4.10]
HIGHLIGHT: The paper describes a method for measuring the similarity and symmetry of an image annotated with bounding boxes indicating image objects.
17, TITLE: Self-Supervised Learning for Fine-Grained Image Classification
AUTHORS: Farha Al Breiki ; Muhammad Ridzuan ; Rushali Grandhe
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: Our idea is to leverage self-supervision such that the model learns useful representations of fine-grained image classes.
18, TITLE: RigNet: Repetitive Image Guided Network for Depth Completion
AUTHORS: ZHIQIANG YAN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle these problems, we explore a repetitive design in our image guided network to sufficiently and gradually recover depth values.
19, TITLE: CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation
AUTHORS: TIANXIAO GAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a network injected with contextual information (CI-Net) to solve the problem.
20, TITLE: UIBert: Learning Generic Multimodal Representations for UI Understanding
AUTHORS: CHONGYANG BAI et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To address such challenges we introduce UIBert, a transformer-based joint image-text model trained through novel pre-training tasks on large-scale unlabeled UI data to learn generic feature representations for a UI and its components.
21, TITLE: Egyptian Sign Language Recognition Using CNN and LSTM
AUTHORS: Ahmed Elhagry ; Rawan Gla
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we are providing applied research with its video-based Egyptian sign language recognition system that serves the local community of deaf people in Egypt, with a moderate and reasonable accuracy.
22, TITLE: A Unified Efficient Pyramid Transformer for Semantic Segmentation
AUTHORS: FANGRUI ZHU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we advocate a unified framework (UN-EPT) to segment objects by considering both context information and boundary artifacts.
23, TITLE: ReFormer: The Relational Transformer for Image Captioning
AUTHORS: Xuewen Yang ; Yingru Liu ; Xin Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To improve the quality of image captioning, we propose a novel architecture ReFormer -- a RElational transFORMER to generate features with relation information embedded and to explicitly express the pair-wise relationships between objects in the image.
24, TITLE: Feature Importance-aware Transferable Adversarial Attacks
AUTHORS: ZHIBO WANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: By contrast, we propose the Feature Importance-aware Attack (FIA), which disrupts important object-aware features that dominate model decisions consistently.
25, TITLE: Lighter Stacked Hourglass Human Pose Estimation
AUTHORS: Ahmed Elhagry ; Mohamed Saeed ; Musie Araia
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we focus on one of the deep learning-based approaches of HPE proposed by Newell et al., which they named the stacked hourglass network.
26, TITLE: Personalized Trajectory Prediction Via Distribution Discrimination
AUTHORS: Guangyi Chen ; Junlong Li ; Nuoxing Zhou ; Liangliang Ren ; Jiwen Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a distribution discrimination (DisDis) method to predict personalized motion patterns by distinguishing the potential distributions.
27, TITLE: PPT Fusion: Pyramid Patch Transformer for A Case Study in Image Fusion
AUTHORS: Yu Fu ; TianYang Xu ; XiaoJun Wu ; Josef Kittler
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, therefore, we propose a Patch Pyramid Transformer (PPT) to effectively address the above issues.
28, TITLE: Human Trajectory Prediction Via Counterfactual Analysis
AUTHORS: Guangyi Chen ; Junlong Li ; Jiwen Lu ; Jie Zhou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Hence, we propose a counterfactual analysis method for human trajectory prediction to investigate the causality between the predicted trajectories and input clues and alleviate the negative effects brought by environment bias.
29, TITLE: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation
AUTHORS: ZEYU HU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In view of this, we present Voxel-Mesh Network (VMNet), a novel 3D deep architecture that operates on the voxel and mesh representations leveraging both the Euclidean and geodesic information.
30, TITLE: Convolutional Transformer Based Dual Discriminator Generative Adversarial Networks for Video Anomaly Detection
AUTHORS: XINYANG FENG et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.MM, 68T45, I.4.8; I.4.9; I.2.10]
HIGHLIGHT: To this end, we propose Convolutional Transformer based Dual Discriminator Generative Adversarial Networks (CT-D2GAN) to perform unsupervised video anomaly detection.
31, TITLE: Bridging Gap Between Image Pixels and Semantics Via Supervision: A Survey
AUTHORS: Jiali Duan ; C.-C. Jay Kuo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To begin with, this paper offers a historical retrospective on supervision, makes a gradual transition to the modern data-driven methodology and introduces commonly used datasets.
32, TITLE: Fast and Scalable Image Search For Histology
AUTHORS: CHENGKUAN CHEN et al.
CATEGORY: cs.CV [cs.CV, cs.AI, q-bio.TO]
HIGHLIGHT: Here we present Fast Image Search for Histopathology (FISH), a histology image search pipeline that is infinitely scalable and achieves constant search speed that is independent of the image database size while being interpretable and without requiring detailed annotations.
33, TITLE: Bayesian Embeddings for Few-Shot Open World Recognition
AUTHORS: JOHN WILLES et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work we extend embedding-based few-shot learning algorithms to the open-world recognition setting.
34, TITLE: Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification Across Distant Scenes
AUTHORS: Wenhang Ge ; Chunyan Pan ; Ancong Wu ; Hongwei Zheng ; Wei-Shi Zheng
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we study intra-camera supervised person re-identification across distant scenes (ICS-DS Re-ID), which uses cross-camera unpaired data with intra-camera identity labels for training.
35, TITLE: Open-World Entity Segmentation
AUTHORS: LU QI et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Based on our unified entity representation, we propose a center-based entity segmentation framework with two novel modules to improve mask quality.
36, TITLE: Guided Disentanglement in Generative Networks
AUTHORS: Fabio Pizzati ; Pietro Cerri ; Raoul de Charette
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.RO]
HIGHLIGHT: In this paper, we present a comprehensive method for disentangling physics-based traits in the translation, guiding the learning process with neural or physical models.
37, TITLE: Improving Robustness and Accuracy Via Relative Information Encoding in 3D Human Pose Estimation
AUTHORS: Wenkang Shan ; Haopeng Lu ; Shanshe Wang ; Xinfeng Zhang ; Wen Gao
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To alleviate these two problems, we propose a relative information encoding method that yields positional and temporal enhanced representations.
38, TITLE: Why You Should Try The Real Data for The Scene Text Recognition
AUTHORS: Vladimir Loginov
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Recent works in the text recognition area have pushed forward the recognition results to the new horizons.
39, TITLE: Learning Geometry-Guided Depth Via Projective Modeling for Monocular 3D Object Detection
AUTHORS: YINMIN ZHANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose to learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
40, TITLE: Rethinking and Improving Relative Position Encoding for Vision Transformer
AUTHORS: Kan Wu ; Houwen Peng ; Minghao Chen ; Jianlong Fu ; Hongyang Chao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our methods consider directional relative distance modeling as well as the interactions between queries and relative position embeddings in self-attention mechanism.
41, TITLE: Learning with Noisy Labels for Robust Point Cloud Segmentation
AUTHORS: Shuquan Ye ; Dongdong Chen ; Songfang Han ; Jing Liao
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: In this work, we take the lead in solving this issue by proposing a novel Point Noise-Adaptive Learning (PNAL) framework.
42, TITLE: Profile to Frontal Face Recognition in The Wild Using Coupled Conditional GAN
AUTHORS: Fariborz Taherkhani ; Veeru Talreja ; Jeremy Dawson ; Matthew C. Valenti ; Nasser M. Nasrabadi
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we hypothesize that the profile face domain possesses a latent connection with the frontal face domain in a latent feature subspace.
43, TITLE: Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
AUTHORS: Jizong Peng ; Ping Wang ; Christian Desrosiers ; Marco Pedersoli
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose to adapt contrastive learning to work with meta-label annotations, for improving the model's performance in medical image segmentation even when no additional unlabeled data is available.
44, TITLE: Cascaded Residual Density Network for Crowd Counting
AUTHORS: Kun Zhao ; Luchuan Song ; Bin Liu ; Qi Chu ; Nenghai Yu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel Cascaded Residual Density Network (CRDNet) in a coarse-to-fine approach to generate the high-quality density map for crowd counting more accurately.
45, TITLE: Hierarchical Self-supervised Augmented Knowledge Distillation
AUTHORS: Chuanguang Yang ; Zhulin An ; Linhang Cai ; Yongjun Xu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose to append several auxiliary classifiers to hierarchical intermediate feature maps to generate diverse self-supervised knowledge and perform the one-to-one transfer to teach the student network thoroughly.
46, TITLE: Geometry Uncertainty Projection Network for Monocular 3D Object Detection
AUTHORS: YAN LU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
47, TITLE: Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation
AUTHORS: Yunfei Liu ; Ruicong Liu ; Haofei Wang ; Feng Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a plug-and-play gaze adaptation framework (PnP-GA), which is an ensemble of networks that learn collaboratively with the guidance of outliers.
48, TITLE: Video Generation from Text Employing Latent Path Construction for Temporal Modeling
AUTHORS: Amir Mazaheri ; Mubarak Shah
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we tackle the text to video generation problem, which is a conditional form of video generation.
49, TITLE: Viewpoint-Invariant Exercise Repetition Counting
AUTHORS: Yu Cheng Hsu ; Qingpeng Zhang ; Efstratios Tsougenis ; Kwok-Leung Tsui
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: This work presents a vision-based human motion repetition counting applicable to counting concurrent motions through the skeleton location extracted from various pose estimation methods.
50, TITLE: Abnormal Behavior Detection Based on Target Analysis
AUTHORS: Luchuan Song ; Bin Liu ; Huihui Zhu ; Qi Chu ; Nenghai Yu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we propose a multivariate fusion method that analyzes each target through three branches: object, action and motion.
51, TITLE: Enhancing Adversarial Robustness Via Test-time Transformation Ensembling
AUTHORS: JUAN C. PÉREZ et al.
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this work, we study how equipping models with Test-time Transformation Ensembling (TTE) can work as a reliable defense against such attacks.
52, TITLE: Few-Shot and Continual Learning with Attentive Independent Mechanisms
AUTHORS: Eugene Lee ; Cheng-Han Huang ; Chen-Yi Lee
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: To tackle these problems, we introduce Attentive Independent Mechanisms (AIM).
53, TITLE: Generalizing Fairness: Discovery and Mitigation of Unknown Sensitive Attributes
AUTHORS: William Paul ; Philippe Burlina
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.CY]
HIGHLIGHT: This paper investigates methods that discover and separate out individual semantic sensitive factors from a given dataset to conduct this characterization as well as addressing mitigation of these factors' sensitivity.
54, TITLE: Structure and Performance of Fully Connected Neural Networks: Emerging Complex Network Properties
AUTHORS: Leonardo F. S. Scabini ; Odemir M. Bruno
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, physics.app-ph, physics.comp-ph, 68T07, I.2.6; I.5.4; J.2]
HIGHLIGHT: Therefore, we propose Complex Network (CN) techniques to analyze the structure and performance of fully connected neural networks. For that, we build a dataset with 4 thousand models and their respective CN properties.
55, TITLE: Social Processes: Self-Supervised Forecasting of Nonverbal Cues in Social Conversations
AUTHORS: Chirag Raman ; Hayley Hung ; Marco Loog
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: To address these, we take a meta-learning approach and propose the Social Process (SP) models--socially aware sequence-to-sequence (Seq2Seq) models within the Neural Process (NP) family.
56, TITLE: Spot What Matters: Learning Context Using Graph Convolutional Networks for Weakly-Supervised Action Detection
AUTHORS: Michail Tsiaousis ; Gertjan Burghouts ; Fieke Hillerström ; Peter van der Putten
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: To this end, we introduce an architecture based on self-attention and Graph Convolutional Networks in order to model contextual cues, such as actor-actor and actor-object interactions, to improve human action detection in video.
57, TITLE: Using Visual Anomaly Detection for Task Execution Monitoring
AUTHORS: Santosh Thoduka ; Juergen Gall ; Paul G. Plöger
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: We evaluate our method on a dataset of a robot placing a book on a shelf, which includes anomalies such as falling books, camera occlusions, and robot disturbances.
58, TITLE: Swap-Free Fat-Water Separation in Dixon MRI Using Conditional Generative Adversarial Networks
AUTHORS: NICOLAS BASTY et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work we propose such a method based on style transfer using a conditional generative adversarial network.
59, TITLE: Recurrent U-net for Automatic Pelvic Floor Muscle Segmentation on 3D Ultrasound
AUTHORS: Frieda van den Noort ; Beril Sirmacek ; Cornelis H. Slump
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this study we present a U-net like neural network with some convolutional long short term memory (CLSTM) layers to automate the 3D segmentation of the levator ani muscle (LAM) in TPUS volumes.
60, TITLE: The Interpretation of Endobronchial Ultrasound Image Using 3D Convolutional Neural Network for Differentiating Malignant and Benign Mediastinal Lesions
AUTHORS: CHING et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: The purpose of this study is to differentiate malignant and benign mediastinal lesions by using the three-dimensional convolutional neural network through the endobronchial ultrasound (EBUS) image.
61, TITLE: A Similarity Measure of Histopathology Images By Deep Embeddings
AUTHORS: Mehdi Afshari ; H. R. Tizhoosh
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: This study proposes a content-based similarity measure for high-resolution gigapixel histopathology images.