This column collects computer vision papers. Date: July 9, 2021. Source: Paper Digest.
1, TITLE: Crowd Counting Via Perspective-Guided Fractional-Dilation Convolution
AUTHORS: Zhaoyi Yan ; Ruimao Zhang ; Hongzhi Zhang ; Qingfu Zhang ; Wangmeng Zuo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this issue, this paper proposes a novel convolution neural network-based crowd counting method, termed Perspective-guided Fractional-Dilation Network (PFDNet).
2, TITLE: Tensor Methods in Computer Vision and Deep Learning
AUTHORS: YANNIS PANAGAKIS et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning, with a particular focus on visual data analysis and computer vision applications.
3, TITLE: Feature Pyramid Network for Multi-task Affective Analysis
AUTHORS: Ruian He ; Zhen Xing ; Bo Yan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel model named feature pyramid networks for multi-task affect analysis.
4, TITLE: Comparing ML Based Segmentation Models on Jet Fire Radiation Zone
AUTHORS: CARMINA PÉREZ-GUERRERO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: One such characterization would be the segmentation of different radiation zones within the flame, so this paper presents exploratory research on several traditional computer vision and Deep Learning segmentation approaches to solve this specific problem.
5, TITLE: An Embedded Iris Recognition System Optimization Using Dynamically Reconfigurable Decoder with LDPC Codes
AUTHORS: Longyu Ma ; Chiu-Wing Sham ; Chun Yan Lo ; Xinchao Zhong
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, the proposed design includes a minimal set of computer vision modules and a multi-mode QC-LDPC decoder, which can alleviate the variability and noise caused by iris acquisition and the follow-up process.
6, TITLE: Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
AUTHORS: Yikang Zhang ; Zhuo Chen ; Zhao Zhong
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a Collaboration of Experts (CoE) framework to pool together the expertise of multiple networks towards a common aim.
7, TITLE: $S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks
AUTHORS: XINLIN LI et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To address these issues, we propose $S^3$ low-bit re-parameterization, a novel technique for training low-bit shift networks.
8, TITLE: An Audiovisual and Contextual Approach for Categorical and Continuous Emotion Recognition In-the-wild
AUTHORS: Panagiotis Antoniadis ; Ioannis Pikoulis ; Panagiotis P. Filntisis ; Petros Maragos
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work we tackle the task of video-based audio-visual emotion recognition, within the premises of the 2nd Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW).
9, TITLE: A Dataset and Method for Hallux Valgus Angle Estimation Based on Deep Learning
AUTHORS: Ningyuan Xu ; Jiayan Zhuang ; Yaojun Wu ; Jiangjian Xiao
CATEGORY: cs.CV [cs.CV, cs.AI, I.4.7; I.2.10; I.5.1]
HIGHLIGHT: However, the field lacks a dataset, and the keypoint-based methods that have been very successful in pose estimation are not suitable for it. To solve these problems, we built a dataset and developed an algorithm based on deep learning and linear regression.
10, TITLE: Automated Object Behavioral Feature Extraction for Potential Risk Analysis Based on Video Sensor
AUTHORS: Byeongjoon Noh ; Wonjun Noh ; David Lee ; Hwasoo Yeo
CATEGORY: cs.CV [cs.CV, cs.CY]
HIGHLIGHT: In this paper, we propose an automated and simpler system for effectively extracting object behavioral features from video sensors deployed on the road.
11, TITLE: Causal Affect Prediction Model Using A Facial Image Sequence
AUTHORS: Geesung Oh ; Euiseok Jeong ; Sejoon Lim
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: In this paper, we propose the causal affect prediction network (CAPNet), which uses only past facial images to predict corresponding affective valence and arousal.
12, TITLE: Instance-Level Relative Saliency Ranking with Graph Reasoning
AUTHORS: Nian Liu ; Long Li ; Wangbo Zhao ; Junwei Han ; Ling Shao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we investigate a practical problem setting that requires simultaneously segmenting salient instances and inferring their relative saliency rank order.
13, TITLE: Use of Affective Visual Information for Summarization of Human-Centric Videos
AUTHORS: Berkay Köprü ; Engin Erzin
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: In this study, we investigate the affective-information enriched supervised video summarization task for human-centric videos.
14, TITLE: NccFlow: Unsupervised Learning of Optical Flow With Non-occlusion from Geometry
AUTHORS: Guangming Wang ; Shuaiqi Ren ; Hesheng Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper reveals novel geometric laws of optical flow based on the insight and detailed definition of non-occlusion.
15, TITLE: Uncertainty-Aware Camera Pose Estimation from Points and Lines
AUTHORS: Alexander Vakhitov ; Luis Ferraz Colomina ; Antonio Agudo ; Francesc Moreno-Noguer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose PnP(L) solvers based on EPnP and DLS for the uncertainty-aware pose estimation.
16, TITLE: Exploiting The Relationship Between Visual and Textual Features in Social Networks for Image Classification with Zero-shot Deep Learning
AUTHORS: Luis Lucas ; David Tomas ; Jose Garcia-Rodriguez
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture in multimodal environments (image and text) from social media.
17, TITLE: Technical Report for Valence-Arousal Estimation in ABAW2 Challenge
AUTHORS: Hong-Xia Xie ; I-Hsuan Li ; Ling Lo ; Hong-Han Shuai ; Wen-Huang Cheng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition.
18, TITLE: Uncertainty-aware Human Motion Prediction
AUTHORS: Pengxiang Ding ; Jianqin Yin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Hence, we propose an uncertainty-aware framework for human motion prediction (UA-HMP).
19, TITLE: Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation Under Zero-Shot Pedestrian Identity Setting
AUTHORS: Jian Jia ; Houjing Huang ; Xiaotang Chen ; Kaiqi Huang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Thus, we propose two datasets, PETA$_{ZS}$ and RAP$_{ZS}$, constructed following the zero-shot settings on pedestrian identity.
20, TITLE: Weight Reparametrization for Budget-Aware Network Pruning
AUTHORS: Robin Dupont ; Hichem Sahbi ; Guillaume Michel
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce an "end-to-end" lightweight network design that achieves training and pruning simultaneously without fine-tuning.
21, TITLE: Video 3D Sampling for Self-supervised Representation Learning
AUTHORS: Wei Li ; Dezhao Luo ; Bo Fang ; Yu Zhou ; Weiping Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel self-supervised method for video representation learning, referred to as Video 3D Sampling (V3S).
22, TITLE: SCSS-Net: Superpoint Constrained Semi-supervised Segmentation Network for 3D Indoor Scenes
AUTHORS: Shuang Deng ; Qiulei Dong ; Bo Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Specifically, we use the pseudo labels predicted from unlabeled point clouds for self-training, and the superpoints produced by geometry-based and color-based Region Growing algorithms are combined to modify and delete pseudo labels with low confidence.
23, TITLE: Case-based Similar Image Retrieval for Weakly Annotated Large Histopathological Images of Malignant Lymphoma Using Deep Metric Learning
AUTHORS: NORIAKI HASHIMOTO et al.
CATEGORY: cs.CV [cs.CV, H.3.3; I.2.1; J.3]
HIGHLIGHT: In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma.
24, TITLE: Investigate The Essence of Long-Tailed Recognition from A Unified Perspective
AUTHORS: Lei Liu ; Li Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we systematically investigate the essence of the long-tailed problem from a unified perspective.
25, TITLE: Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection Via Spatial-Temporal Feature Transformation
AUTHORS: Lingyun Wu ; Zhiqiang Hu ; Yuanfeng Ji ; Ping Luo ; Shaoting Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present Spatial-Temporal Feature Transformation (STFT), a multi-frame collaborative framework to address these issues.
26, TITLE: EEG-ConvTransformer for Single-Trial EEG Based Visual Stimuli Classification
AUTHORS: Subhranil Bagchi ; Deepti R. Bathula
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This work introduces an EEG-ConvTransformer network that is based on multi-headed self-attention.
27, TITLE: Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation
AUTHORS: Nikolay Jetchev ; Gökhan Yildirim ; Christian Bracher ; Roland Vollgraf
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We present Grid Partitioned Attention (GPA), a new approximate attention algorithm that leverages a sparse inductive bias for higher computational and memory efficiency in image domains: queries attend only to few keys, spatially close queries attend to close keys due to correlations.
28, TITLE: Relation-Based Associative Joint Location for Human Pose Estimation in Videos
AUTHORS: Yonghao Dang ; Jianqin Yin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, unlike the prior methods, we propose a Relation-based Pose Semantics Transfer Network (RPSTN) to locate joints associatively.
29, TITLE: Complete Scanning Application Using OpenCV
AUTHORS: Ayushe Gangal ; Peeyush Kumar ; Sunita Kumari
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we combine basic functionalities provided by the NumPy library and the OpenCV library, an open-source library for computer vision applications: converting color images to grayscale, calculating a threshold, finding contours, and using those contour points to apply a perspective transform to the image supplied by the user, using Python 3.7.
30, TITLE: Multi-Modality Task Cascade for 3D Object Detection
AUTHORS: Jinhyung Park ; Xinshuo Weng ; Yunze Man ; Kris Kitani
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.RO]
HIGHLIGHT: To provide a more integrated approach, we propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes.
31, TITLE: Task Fingerprinting for Meta Learning in Biomedical Image Analysis
AUTHORS: Patrick Godau ; Lena Maier-Hein
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we address the problem of quantifying task similarity with a concept that we refer to as task fingerprinting.
32, TITLE: Prior Aided Streaming Network for Multi-task Affective Recognition at The 2nd ABAW2 Competition
AUTHORS: WEI ZHANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW2) Competition.
33, TITLE: Optimizing Data Processing in Space for Object Detection in Satellite Imagery
AUTHORS: Martina Lofqvist ; José Cano
CATEGORY: cs.CV [cs.CV, cs.DC, cs.LG, eess.IV]
HIGHLIGHT: In this work, we investigate the performance of CNN-based object detectors on constrained devices by applying different image compression techniques to satellite data.
34, TITLE: Adiabatic Quantum Graph Matching with Permutation Matrix Constraints
AUTHORS: Marcel Seelbach Benkner ; Vladislav Golyanik ; Christian Theobalt ; Michael Moeller
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we address such problems with emerging quantum computing technology and propose several reformulations of QAPs as unconstrained problems suitable for efficient execution on quantum hardware.
35, TITLE: TGHop: An Explainable, Efficient and Lightweight Method for Texture Generation
AUTHORS: Xuejing Lei ; Ganning Zhao ; Kaitai Zhang ; C.-C. Jay Kuo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: An explainable, efficient and lightweight method for texture generation, called TGHop (an acronym of Texture Generation PixelHop), is proposed in this work.
36, TITLE: Image Resolution Susceptibility of Face Recognition Models
AUTHORS: Martin Knoche ; Stefan Hörmann ; Gerhard Rigoll
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To tackle this problem, we propose the following two methods: 1) Train a state-of-the-art face-recognition model straightforwardly with $50\%$ low-resolution images directly within each batch.
37, TITLE: Staying in Shape: Learning Invariant Shape Representations Using Contrastive Learning
AUTHORS: Jeffrey Gu ; Serena Yeung
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To produce representations that are specifically isometry and almost-isometry invariant, we propose new data augmentations that randomly sample these transformations.
38, TITLE: Malware Classification Using Deep Boosted Learning
AUTHORS: Muhammad Asam ; Saddam Hussain Khan ; Tauseef Jamal ; Umme Zahoora ; Asifullah Khan
CATEGORY: cs.CR [cs.CR, cs.CV, cs.LG]
HIGHLIGHT: This work proposes a novel deep boosted hybrid learning-based malware classification framework, named Deep boosted Feature Space-based Malware classification (DFS-MC).
39, TITLE: Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
AUTHORS: Ruihan Yang ; Minghao Zhang ; Nicklas Hansen ; Huazhe Xu ; Xiaolong Wang
CATEGORY: cs.LG [cs.LG, cs.CV, cs.RO]
HIGHLIGHT: In this paper, we introduce LocoTransformer, an end-to-end RL method for quadrupedal locomotion that leverages a Transformer-based model for fusing proprioceptive states and visual observations.
40, TITLE: Active Safety Envelopes Using Light Curtains with Probabilistic Guarantees
AUTHORS: Siddharth Ancha ; Gaurav Pathak ; Srinivasa G. Narasimhan ; David Held
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO]
HIGHLIGHT: We evaluate our approach in a simulated urban driving environment and a real-world environment with moving pedestrians using a light curtain device and show that we can estimate safety envelopes efficiently and effectively.
41, TITLE: RMA: Rapid Motor Adaptation for Legged Robots
AUTHORS: Ashish Kumar ; Zipeng Fu ; Deepak Pathak ; Jitendra Malik
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO]
HIGHLIGHT: This paper presents the Rapid Motor Adaptation (RMA) algorithm to solve this problem of real-time online adaptation in quadruped robots.
42, TITLE: CamTuner: Reinforcement-Learning Based System for Camera Parameter Tuning to Enhance Analytics
AUTHORS: SIBENDU PAUL et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We propose CamTuner, a system that automatically and dynamically adapts the complex sensor to changing environments.
43, TITLE: LanguageRefer: Spatial-Language Model for 3D Visual Grounding
AUTHORS: Junha Roh ; Karthik Desingh ; Ali Farhadi ; Dieter Fox
CATEGORY: cs.RO [cs.RO, cs.CL, cs.CV]
HIGHLIGHT: In this paper, we develop a spatial-language model for a 3D visual grounding problem.
44, TITLE: 4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping
AUTHORS: Shuji Oishi ; Kenji Koide ; Masashi Yokozuka ; Atsuhiko Banno
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: This study presents a framework for capturing human attention in the spatio-temporal domain using eye-tracking glasses.
45, TITLE: 3D Neural Scene Representations for Visuomotor Control
AUTHORS: Yunzhu Li ; Shuang Li ; Vincent Sitzmann ; Pulkit Agrawal ; Antonio Torralba
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: In this work, we desire to learn models for dynamic 3D scenes purely from 2D visual observations.
46, TITLE: Modality Completion Via Gaussian Process Prior Variational Autoencoders for Multi-Modal Glioma Segmentation
AUTHORS: Mohammad Hamghalam ; Alejandro F. Frangi ; Baiying Lei ; Amber L. Simpson
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
47, TITLE: Elastic Deformation of Optical Coherence Tomography Images of Diabetic Macular Edema for Deep-learning Models Training: How Far to Go?
AUTHORS: DANIEL BAR-DAVID et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Elastic Deformation of Optical Coherence Tomography Images of Diabetic Macular Edema for Deep-learning Models Training: How Far to Go?
48, TITLE: Regional Differential Information Entropy for Super-Resolution Image Quality Assessment
AUTHORS: Ningyuan Xu ; Jiayan Zhuang ; Jiangjian Xiao ; Chengbin Peng
CATEGORY: eess.IV [eess.IV, cs.CV, I.4.3; I.4.4]
HIGHLIGHT: To solve the problem, we propose a method called regional differential information entropy to measure both similarity and perceptual quality.
49, TITLE: Joint Motion Correction and Super Resolution for Cardiac Segmentation Via Latent Optimisation
AUTHORS: SHUO WANG et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Here we propose a novel latent optimisation framework that jointly performs motion correction and super resolution for cardiac image segmentations.
50, TITLE: Deep Learning Based Image Retrieval in The JPEG Compressed Domain
AUTHORS: Shrikant Temburwar ; Bulla Rajesh ; Mohammed Javed
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Here, we propose a unified model for image retrieval which takes DCT coefficients as input and efficiently extracts global and local features directly in the JPEG compressed domain for accurate image retrieval.
51, TITLE: A Hybrid Deep Learning Framework for Covid-19 Detection Via 3D Chest CT Images
AUTHORS: Shuang Liang
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we present a hybrid deep learning framework named CTNet which combines convolutional neural network and transformer together for the detection of COVID-19 via 3D chest CT images.
52, TITLE: Label-set Loss Functions for Partial Supervision: Application to Fetal Brain 3D MRI Parcellation
AUTHORS: LUCAS FIDON et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose the first axiomatic definition of label-set loss functions, that is, loss functions that can handle partially segmented images.
53, TITLE: Atlas-Based Segmentation of Intracochlear Anatomy in Metal Artifact Affected CT Images of The Ear with Co-trained Deep Neural Networks
AUTHORS: JIANING WANG et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We propose an atlas-based method to segment the intracochlear anatomy (ICA) in the post-implantation CT (Post-CT) images of cochlear implant (CI) recipients that preserves the point-to-point correspondence between the meshes in the atlas and the segmented volumes.
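NOTE on entry 36: the highlight mentions training with 50% low-resolution images within each batch. The snippet below is only a rough sketch of that batch-mixing idea, not the authors' implementation. The function names (`degrade`, `mix_low_resolution`), the nearest-neighbour degradation, and the `factor` parameter are all assumptions made for illustration; images are plain 2D lists of pixel values so the example needs nothing beyond the standard library.

```python
import random


def degrade(image, factor):
    """Simulate a low-resolution image (hypothetical helper).

    Keeps every `factor`-th pixel and repeats it (nearest neighbour),
    so the output has the same height and width as the input.
    """
    height, width = len(image), len(image[0])
    return [
        [image[(r // factor) * factor][(c // factor) * factor] for c in range(width)]
        for r in range(height)
    ]


def mix_low_resolution(batch, fraction=0.5, factor=4, rng=None):
    """Replace a random `fraction` of the batch with degraded copies.

    Returns the mixed batch and the set of indices that were degraded.
    The 0.5 default mirrors the 50% figure in the highlight; everything
    else here is an assumption.
    """
    rng = rng or random.Random()
    n_low = int(len(batch) * fraction)
    low_idx = set(rng.sample(range(len(batch)), n_low))
    mixed = [
        degrade(img, factor) if i in low_idx else img
        for i, img in enumerate(batch)
    ]
    return mixed, low_idx


if __name__ == "__main__":
    # Eight toy 4x4 "images" with distinct pixel values.
    toy_batch = [
        [[100 * i + 4 * r + c for c in range(4)] for r in range(4)]
        for i in range(8)
    ]
    mixed, low_idx = mix_low_resolution(toy_batch, factor=2, rng=random.Random(0))
    print(len(low_idx), "of", len(mixed), "images were degraded")
```

In a real training loop the degradation would more likely be a proper downscale and upscale with an image library, applied per batch before the forward pass; the index set is returned here only so that the mixing can be inspected.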