This column collects papers in computer vision. Date: June 17, 2021. Source: Paper Digest.
1, TITLE: Structure First Detail Next: Image Inpainting with Pyramid Generator
AUTHORS: Shuyi Qu et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we propose to build a Pyramid Generator by stacking several sub-generators, where lower-layer sub-generators focus on restoring image structures while the higher-layer sub-generators emphasize image details.
2, TITLE: ICDAR 2021 Competition on Components Segmentation Task of Document Photos
AUTHORS: Celso A. M. Lopes Junior ; Ricardo B. das Neves Junior ; Byron L. D. Bezerra ; Alejandro H. Toselli ; Donato Impedovo
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: This paper describes the short-term competition on Components Segmentation Task of Document Photos that was prepared in the context of the 16th International Conference on Document Analysis and Recognition (ICDAR 2021).
3, TITLE: Dynamically Grown Generative Adversarial Networks
AUTHORS: Lanlan Liu ; Yuting Zhang ; Jia Deng ; Stefano Soatto
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together with automation.
4, TITLE: The Oxford Road Boundaries Dataset
AUTHORS: Tarlan Suleymanov ; Matthew Gadd ; Daniele De Martini ; Paul Newman
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this paper we present the Oxford Road Boundaries Dataset, designed for training and testing machine-learning-based road-boundary detection and inference approaches.
5, TITLE: Differentiable Diffusion for Dense Depth Estimation from Multi-view Images
AUTHORS: Numair Khan ; Min H. Kim ; James Tompkin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision.
6, TITLE: Tackling The Challenges in Scene Graph Generation with Local-to-Global Interactions
AUTHORS: Sangmin Woo ; Junhyug Noh ; Kangil Kim
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we seek new insights into the underlying challenges of the Scene Graph Generation (SGG) task.
7, TITLE: Invertible Attention
AUTHORS: Jiajun Zha ; Yiran Zhong ; Jing Zhang ; Liang Zheng ; Richard Hartley
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose invertible attention that can be plugged into existing invertible models.
8, TITLE: AtrialGeneral: Domain Generalization for Left Atrial Segmentation of Multi-Center LGE MRIs
AUTHORS: Lei Li ; Veronika A. Zimmer ; Julia A. Schnabel ; Xiahai Zhuang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we collect 210 LGE MRIs from different centers with different levels of image quality.
9, TITLE: Learning Implicit Glyph Shape Representation
AUTHORS: Ying-Tian Liu ; Yuan-Chen Guo ; Yi-Xiao Li ; Chen Wang ; Song-Hai Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a novel implicit glyph shape representation, which models glyphs as shape primitives enclosed by quadratic curves, and naturally enables generating glyph images at arbitrary high resolutions.
10, TITLE: Compound Frechet Inception Distance for Quality Assessment of GAN Created Images
AUTHORS: Eric J. Nunn ; Pejman Khadivi ; Shadrokh Samavi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose to improve the robustness of the evaluation process by integrating lower-level features to cover a wider array of visual defects.
11, TITLE: TextStyleBrush: Transfer of Text Aesthetics from A Single Example
AUTHORS: Praveen Krishnan ; Rama Kovvuri ; Guan Pang ; Boris Vassilev ; Tal Hassner
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a novel approach for disentangling the content of a text image from all aspects of its appearance.
12, TITLE: Anomaly Detection in Video Sequences: A Benchmark and Computational Model
AUTHORS: Boyang Wan ; Wenhui Jiang ; Yuming Fang ; Zhiyuan Luo ; Guanqun Ding
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle these problems, we contribute a new Large-scale Anomaly Detection (LAD) database as the benchmark for anomaly detection in video sequences, which is featured in two aspects.
13, TITLE: Detection of Morphed Face Images Using Discriminative Wavelet Sub-bands
AUTHORS: Poorya Aghdaie ; Baaria Chaudhary ; Sobhan Soleymani ; Jeremy Dawson ; Nasser M. Nasrabadi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To detect morphing attacks, we propose a method which is based on a discriminative 2D Discrete Wavelet Transform (2D-DWT).
14, TITLE: Smoothing The Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
AUTHORS: Yahui Liu et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space in which: 1) Both intra- and inter-domain interpolations correspond to gradual changes in the generated images and 2) The content of the source image is better preserved during the translation.
15, TITLE: End-to-End Semi-Supervised Object Detection with Soft Teacher
AUTHORS: Mengde Xu et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: This paper presents an end-to-end semi-supervised object detection approach, in contrast to previous more complex multi-stage methods.
16, TITLE: 2nd Place Solution for Waymo Open Dataset Challenge - Real-time 2D Object Detection
AUTHORS: Yueming Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this report, we introduce a real-time method to detect the 2D objects from images.
17, TITLE: Cascading Modular Network (CAM-Net) for Multimodal Image Synthesis
AUTHORS: Shichong Peng ; Alireza Moazeni ; Ke Li
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we focus on this problem of multimodal conditional image synthesis and build on the recently proposed technique of Implicit Maximum Likelihood Estimation (IMLE).
18, TITLE: Evolving Image Compositions for Feature Representation Learning
AUTHORS: Paola Cascante-Bonilla ; Arshdeep Sekhon ; Yanjun Qi ; Vicente Ordonez
CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE]
HIGHLIGHT: This paper proposes PatchMix, a data augmentation method that creates new samples by composing patches from pairs of images in a grid-like pattern.
19, TITLE: Unsupervised Domain Adaptation with Variational Approximation for Cardiac Segmentation
AUTHORS: Fuping Wu ; Xiahai Zhuang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a new framework, where the latent features of both domains are driven towards a common and parameterized variational form, whose conditional distribution given the image is Gaussian.
20, TITLE: PatchNet: Unsupervised Object Discovery Based on Patch Embedding
AUTHORS: Hankyu Moon ; Heng Hao ; Sima Didari ; Jae Oh Woo ; Patrick Bangert
CATEGORY: cs.CV [cs.CV, cs.AI, I.2.10; I.4.10; I.5.3]
HIGHLIGHT: We demonstrate that frequently appearing objects can be discovered by training randomly sampled patches from a small number of images (100 to 200) by self-supervision.
21, TITLE: Revisit Visual Representation in Analytics Taxonomy: A Compression Perspective
AUTHORS: Yueyu Hu ; Wenhan Yang ; Haofeng Huang ; Jiaying Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we raise and study the novel problem of supporting multiple machine vision analytics tasks with the compressed visual representation, namely, the information compression problem in analytics taxonomy.
22, TITLE: Watching Too Much Television Is Good: Self-Supervised Audio-Visual Representation Learning from Movies and TV Shows
AUTHORS: Mahdi M. Kalayeh ; Nagendra Kamath ; Lingyi Liu ; Ashok Chandrashekar
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we study the efficacy of learning from Movies and TV Shows as forms of uncurated data for audio-visual self-supervised learning.
23, TITLE: Temporal Convolution Networks with Positional Encoding for Evoked Expression Estimation
AUTHORS: VanThong Huynh ; Guee-Sang Lee ; Hyung-Jeong Yang ; Soo-Hyung Kim
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: This paper presents an approach for Evoked Expressions from Videos (EEV) challenge, which aims to predict evoked facial expressions from video.
24, TITLE: Disentangling Semantic-to-visual Confusion for Zero-shot Learning
AUTHORS: Zihan Ye ; Fuyuan Hu ; Fan Lyu ; Linyan Li ; Kaizhu Huang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To alleviate this drawback, we propose in this work a multi-modal triplet loss (MMTL) which utilizes multimodal information to search a disentangled representation space.
25, TITLE: Domain Consistency Regularization for Unsupervised Multi-source Domain Adaptive Classification
AUTHORS: Zhipeng Luo ; Xiaobing Zhang ; Shijian Lu ; Shuai Yi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an end-to-end trainable network that exploits domain Consistency Regularization for unsupervised Multi-source domain Adaptive classification (CRMA).
26, TITLE: Learning to Disentangle GAN Fingerprint for Fake Image Attribution
AUTHORS: Tianyun Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Adopting a multi-task framework, we propose a GAN Fingerprint Disentangling Network (GFD-Net) to simultaneously disentangle the fingerprint from GAN-generated images and produce a content-irrelevant representation for fake image attribution.
27, TITLE: Federated Semi-supervised Medical Image Classification Via Inter-client Relation Matching
AUTHORS: Quande Liu ; Hongzheng Yang ; Qi Dou ; Pheng-Ann Heng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a novel approach for this problem, which improves over traditional consistency regularization mechanism with a new inter-client relation matching scheme.
28, TITLE: Scene Transformer: A Unified Multi-task Model for Behavior Prediction and Planning
AUTHORS: Jiquan Ngiam et al.
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: In this work, we formulate a model for predicting the behavior of all agents jointly in real-world driving environments in a unified manner.
29, TITLE: Understanding and Evaluating Racial Biases in Image Captioning
AUTHORS: Dora Zhao ; Angelina Wang ; Olga Russakovsky
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we study bias propagation pathways within image captioning, focusing specifically on the COCO dataset.
30, TITLE: Multi-scale Neural ODEs for 3D Medical Image Registration
AUTHORS: Junshen Xu ; Eric Z. Chen ; Xiao Chen ; Terrence Chen ; Shanhui Sun
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we proposed to learn a registration optimizer via a multi-scale neural ODE model.
31, TITLE: Achieving Domain Robustness in Stereo Matching Networks By Removing Shortcut Learning
AUTHORS: WeiQin Chuah ; Ruwan Tennakoon ; Alireza Bab-Hadiashar ; David Suter
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We show that by removing such shortcuts, we can achieve domain robustness in state-of-the-art stereo matching frameworks and produce remarkable performance on multiple realistic datasets, even though the networks were trained only on synthetic data.
32, TITLE: ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning
AUTHORS: Chaofan Chen ; Xiaoshan Yang ; Changsheng Xu ; Xuhui Huang ; Zhe Ma
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose an Explicit Class Knowledge Propagation Network (ECKPN), which is composed of the comparison, squeeze and calibration modules, to address this problem.
33, TITLE: Toward Affective XAI: Facial Affect Analysis for Understanding Explainable Human-AI Interactions
AUTHORS: Luke Guerdan ; Alex Raymond ; Hatice Gunes
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: Therefore, in this work, we aim to (1) identify which facial affect features are pronounced when people interact with XAI interfaces, and (2) develop a multitask feature embedding for linking facial affect signals with participants' use of explanations.
34, TITLE: Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects
AUTHORS: Denys Rozumnyi ; Martin R. Oswald ; Vittorio Ferrari ; Marc Pollefeys
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We address the novel task of jointly reconstructing the 3D shape, texture, and motion of an object from a single motion-blurred image.
35, TITLE: Structured DropConnect for Uncertainty Inference in Image Classification
AUTHORS: Wenqing Zheng ; Jiyang Xie ; Weidong Liu ; Zhanyu Ma
CATEGORY: cs.CV [cs.CV, cs.AI, 14J60 (Primary) 14F05, 14J26 (Secondary), F.2.2; I.2.7]
HIGHLIGHT: In this paper, this framework is implemented on LeNet-5 and VGG-16 models for misclassification detection and out-of-distribution detection on MNIST and CIFAR-10 datasets.
36, TITLE: Toward Robotic Weed Control: Detection of Nutsedge Weed in Bermudagrass Turf Using Inaccurate and Insufficient Training Data
AUTHORS: Shuangyu Xie ; Chengsong Hu ; Muthukumar Bagavathiannan ; Dezhen Song
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose an algorithm to generate high fidelity synthetic data, adopting different levels of annotations to reduce labeling cost.
37, TITLE: Shuffle Transformer with Feature Alignment for Video Face Parsing
AUTHORS: Rui Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce a strong backbone, the cross-window-based Shuffle Transformer, for learning an accurate face parsing representation.
38, TITLE: Seeing Through Clouds in Satellite Images
AUTHORS: Mingmin Zhao ; Peder A. Olsen ; Ranveer Chandra
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a neural-network-based solution to recover pixels occluded by clouds in satellite images. We will release the processed dataset to facilitate future research.
39, TITLE: Unsupervised Person Re-identification Via Multi-Label Prediction and Classification Based on Graph-Structural Insight
AUTHORS: Jongmin Yu ; Hyeontaek Oh
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Unsupervised Person Re-identification Via Multi-Label Prediction and Classification Based on Graph-Structural Insight
40, TITLE: Unsupervised-learning-based Method for Chest MRI-CT Transformation Using Structure Constrained Unsupervised Generative Attention Networks
AUTHORS: Hidetoshi Matsuo et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: The results obtained in this study revealed the proposed U-GAT-IT + MIND approach to outperform all other competing approaches.
41, TITLE: CMF: Cascaded Multi-model Fusion for Referring Image Segmentation
AUTHORS: Jianhua Yang ; Yan Huang ; Zhanyu Ma ; Liang Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we address the task of referring image segmentation (RIS), which aims at predicting a segmentation mask for the object described by a natural language expression.
42, TITLE: Robustness of Object Detectors in Degrading Weather Conditions
AUTHORS: Muhammad Jehanzeb Mirza et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we address this issue and perform one of the most detailed evaluations of single- and dual-modality architectures on data captured in real weather conditions.
43, TITLE: FastAno: Fast Anomaly Detection Via Spatio-temporal Patch Transformation
AUTHORS: Chaewon Park ; MyeongAh Cho ; Minhyeok Lee ; Sangyoun Lee
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address these shortcomings, we propose spatial rotation transformation (SRT) and temporal mixing transformation (TMT) to generate irregular patch cuboids within normal frame cuboids in order to enhance the learning of normal features.
44, TITLE: X-MAN: Explaining Multiple Sources of Anomalies in Video
AUTHORS: Stanislaw Szymanowicz ; James Charles ; Roberto Cipolla
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our objective is to detect anomalies in video while also automatically explaining the reason behind the detector's response.
45, TITLE: EdgeConv with Attention Module for Monocular Depth Estimation
AUTHORS: Minhyeok Lee ; Sangwon Hwang ; Chaewon Park ; Sangyoun Lee
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel Patch-Wise EdgeConv Module (PEM) and EdgeConv Attention Module (EAM) to solve the difficulty of monocular depth estimation.
46, TITLE: Over-and-Under Complete Convolutional RNN for MRI Reconstruction
AUTHORS: Pengfei Guo et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN).
47, TITLE: Multi-Resolution Continuous Normalizing Flows
AUTHORS: Vikram Voleti ; Chris Finlay ; Adam Oberman ; Christopher Pal
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: In this work we introduce a Multi-Resolution variant of such models (MRCNF), by characterizing the conditional distribution over the additional information required to generate a fine image that is consistent with the coarse image.
48, TITLE: Contrastive Learning with Continuous Proxy Meta-Data for 3D MRI Classification
AUTHORS: Benoit Dufumier et al.
CATEGORY: cs.CV [cs.CV, stat.ML]
HIGHLIGHT: Here, we propose to leverage continuous proxy metadata, in the contrastive learning framework, by introducing a new loss called y-Aware InfoNCE loss.
49, TITLE: Explaining Decision of Model from Its Prediction
AUTHORS: Dipesh Tamboli
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This document summarizes different visual explanations methods such as CAM, Grad-CAM, Localization using Multiple Instance Learning - Saliency-based methods, Saliency-driven Class-Impressions, Muting pixels in input image - Adversarial methods and Activation visualization, Convolution filter visualization - Feature-based methods.
50, TITLE: JRDB-Act: A Large-scale Multi-modal Dataset for Spatio-temporal Action, Social Group and Activity Detection
AUTHORS: Mahsa Ehsanpour ; Fatemeh Saleh ; Silvio Savarese ; Ian Reid ; Hamid Rezatofighi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce JRDB-Act, a multi-modal dataset, as an extension of the existing JRDB, which is captured by a social mobile manipulator and reflects a real distribution of human daily life actions in a university campus environment.
51, TITLE: SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking
AUTHORS: Ziang Cao ; Changhong Fu ; Junjie Ye ; Bowen Li ; Yiming Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this concern, in this paper, a novel attentional Siamese tracker (SiamAPN++) is proposed for real-time UAV tracking.
52, TITLE: Metamorphic Image Registration Using A Semi-Lagrangian Scheme
AUTHORS: Anton François ; Pietro Gori ; Joan Glaunès
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an implementation of both Large Deformation Diffeomorphic Metric Mapping (LDDMM) and Metamorphosis image registration using a semi-Lagrangian scheme for geodesic shooting.
53, TITLE: Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence
AUTHORS: Jacky Cao et al.
CATEGORY: cs.HC [cs.HC, cs.AI, cs.CV, cs.CY, cs.LG]
HIGHLIGHT: This survey aims to benefit both researchers and MAR system developers alike.
54, TITLE: $C^3$: Compositional Counterfactual Contrastive Learning for Video-grounded Dialogues
AUTHORS: Hung Le ; Nancy F. Chen ; Steven C. H. Hoi
CATEGORY: cs.LG [cs.LG, cs.CL, cs.CV]
HIGHLIGHT: In this paper, we propose a novel approach of Compositional Counterfactual Contrastive Learning ($C^3$) to develop contrastive training between factual and counterfactual samples in video-grounded dialogues.
55, TITLE: Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch
AUTHORS: Hossein Souri ; Micah Goldblum ; Liam Fowl ; Rama Chellappa ; Tom Goldstein
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: We develop a new hidden trigger attack, Sleeper Agent, which employs gradient matching, data selection, and target model re-training during the crafting process.
56, TITLE: Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
AUTHORS: Haoxiang Wang ; Han Zhao ; Bo Li
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: In this paper, we take one important step further to understand the close connection between these two learning paradigms, through both theoretical analysis and empirical investigation.
57, TITLE: An Unifying Point of View on Expressive Power of GNNs
AUTHORS: Giuseppe Alessio D'Inverno ; Monica Bianchini ; Maria Lucia Sampoli ; Franco Scarselli
CATEGORY: cs.LG [cs.LG, cs.CV, F.2.2; G.2.2; I.2.6]
HIGHLIGHT: In this paper, we prove that the Weisfeiler-Lehman test induces an equivalence relationship on the graph nodes that exactly corresponds to the unfolding equivalence, defined on the original GNN model.
58, TITLE: ParticleAugment: Sampling-Based Data Augmentation
AUTHORS: Alexander Tsaregorodtsev ; Vasileios Belagiannis
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We present an automated data augmentation approach for image classification.
59, TITLE: Machine Learning-based Analysis of Hyperspectral Images for Automated Sepsis Diagnosis
AUTHORS: Maximilian Dietrich et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, I.2.10; I.4; I.5; J.3]
HIGHLIGHT: We conclude that further prospective studies, carefully designed with respect to these confounders, are necessary to confirm the preliminary results obtained in this study.
60, TITLE: GKNet: Grasp Keypoint Network for Grasp Candidates Detection
AUTHORS: Ruinian Xu ; Fu-Jen Chu ; Patricio A. Vela
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: This paper presents a different approach to grasp detection by treating it as keypoint detection.
61, TITLE: GelSight Wedge: Measuring High-Resolution 3D Contact Geometry with A Compact Robot Finger
AUTHORS: Shaoxiong Wang ; Yu She ; Branden Romero ; Edward Adelson
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: In this work, we present the GelSight Wedge sensor, which is optimized to have a compact shape for robot fingers, while achieving high-resolution 3D reconstruction.
62, TITLE: A Multi-Layered Approach for Measuring The Simulation-to-Reality Gap of Radar Perception for Autonomous Driving
AUTHORS: Anthony Ngo ; Max Paul Bauer ; Michael Resch
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG, eess.SP]
HIGHLIGHT: We address this problem by introducing a multi-layered evaluation approach, which consists of a combination of an explicit and an implicit sensor model evaluation.
63, TITLE: A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods
AUTHORS: Gullal S. Cheema ; Sherzod Hakimov ; Eric Müller-Budack ; Ralph Ewerth
CATEGORY: cs.SI [cs.SI, cs.CL, cs.CV]
HIGHLIGHT: In this paper, we present a comprehensive experimental evaluation and comparison with six state-of-the-art methods, one of which we have re-implemented.
64, TITLE: Silent Speech and Emotion Recognition from Vocal Tract Shape Dynamics in Real-Time MRI
AUTHORS: Laxmi Pandey ; Ahmed Sabbir Arif
CATEGORY: eess.IV [eess.IV, cs.CV, cs.HC, cs.LG, cs.SD, eess.AS, I.4.9; I.2.10]
HIGHLIGHT: We propose a novel deep neural network-based learning framework that understands acoustic information in the variable-length sequence of vocal tract shaping during speech production, captured by real-time magnetic resonance imaging (rtMRI), and translate it into text.
65, TITLE: Improved CNN-based Learning of Interpolation Filters for Low-Complexity Inter Prediction in Video Coding
AUTHORS: Luka Murn ; Saverio Blasi ; Alan F. Smeaton ; Marta Mrak
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, cs.MM]
HIGHLIGHT: This paper introduces a novel explainable neural network-based inter-prediction scheme, to improve the interpolation of reference samples needed for fractional precision motion compensation.