What's new on arXiv

GPU-based Commonsense Paradigms Reasoning for Real-Time Query Answering and Multimodal Analysis
We utilize commonsense knowledge bases to address the problem of real-time multimodal analysis. In particular, we focus on the problem of multimodal sentiment analysis, which consists in the simultaneous analysis of different modalities, e.g., speech and video, for emotion and polarity detection. Our approach takes advantage of the massively parallel processing power of modern GPUs to enhance the performance of feature extraction from different modalities. In addition, in order to extract important textual features from multimodal sources, we generate domain-specific graphs based on commonsense knowledge and apply GPU-based graph traversal for fast feature detection. Then, powerful ELM classifiers are applied to build the sentiment analysis model based on the extracted features. We conduct our experiments on the YouTube dataset and achieve an accuracy of 78%, which outperforms all previous systems. In terms of processing speed, our method shows improvements of several orders of magnitude for feature extraction compared to CPU-based counterparts.
Talent Flow Analytics in Online Professional Network
Analyzing job hopping behavior is important for the understanding of job preference and career progression of working individuals. When analyzed at the workforce population level, job hop analysis helps to gain insights into talent flow among different jobs and organizations. Traditionally, surveys are conducted on job seekers and employers to study job hop behavior. Beyond surveys, job hop behavior can also be studied in a highly scalable and timely manner using a data-driven approach in response to a fast-changing job landscape. Fortunately, the advent of online professional networks (OPNs) has made it possible to perform a large-scale analysis of talent flow. In this paper, we present a new data analytics framework to analyze the talent flow patterns of close to 1 million working professionals from three different countries/regions using their publicly-accessible profiles in an established OPN. As OPN data are originally generated for professional networking applications, our proposed framework re-purposes the same data for a different analytics task. Prior to performing job hop analysis, we devise a job title normalization procedure to mitigate the amount of noise in the OPN data. We then devise several metrics to measure the amount of work experience required to take up a job, to determine the existence duration of the job (also known as the job age), and the correlation between the above metrics and propensity of hopping. We also study how job hop behavior is related to job promotion/demotion. Lastly, we perform connectivity analysis at job and organization levels to derive insights on talent flow as well as job and organizational competitiveness.
Recurrent Neural Networks for Long and Short-Term Sequential Recommendation
Recommender system objectives can be broadly characterized as modeling user preferences over short- or long-term time horizons. A large body of previous research studied long-term recommendation through dimensionality reduction techniques applied to the historical user-item interactions. A recently introduced session-based recommendation setting highlighted the importance of modeling short-term user preferences. In this task, Recurrent Neural Networks (RNN) have been shown to be successful at capturing the nuances of a user’s interactions within a short time window. In this paper, we evaluate RNN-based models on both short-term and long-term recommendation tasks. Our experimental results suggest that RNNs are capable of predicting immediate as well as distant user interactions. We also find the best performing configuration to be a stacked RNN with layer normalization and tied item embeddings.
Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets
Selecting the best algorithm for a new problem is an expensive and difficult task. However, there are automatic solutions to address this problem: using Metalearning, which takes advantage of problem characteristics (i.e. metafeatures), one is able to predict the relative performance of algorithms. In the Collaborative Filtering scope, recent works have proposed diverse metafeatures describing several dimensions of this problem. Despite interesting and effective findings, it is still unknown whether these are the most effective metafeatures. Hence, this work proposes a new set of graph metafeatures, which approach the Collaborative Filtering problem from a Graph Theory perspective. Furthermore, in order to understand whether metafeatures from multiple dimensions are a better fit, we investigate the effects of comprehensive metafeatures. These metafeatures are a selection of the best metafeatures from all existing Collaborative Filtering metafeatures. The impact of the most representative metafeatures is investigated in a controlled experimental setup. Another contribution we present is the use of a Pareto-Efficient ranking procedure to create multicriteria metatargets. These new rankings of algorithms, which take into account multiple evaluation measures, allow one to explore the algorithm selection problem in a fairer and more detailed way. According to the experimental results, the graph metafeatures are a good alternative to related work metafeatures. However, the results have shown that the feature selection procedure used to create the comprehensive metafeatures is not effective, since there is no gain in predictive performance. Finally, an extensive metaknowledge analysis was conducted to identify the most influential metafeatures.
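The abstract does not spell out its Pareto-Efficient ranking procedure, so the following is only a minimal sketch of one common way to rank algorithms on several evaluation measures at once: repeatedly peel off the set of non-dominated algorithms. The function names, the example scores, and the assumption that higher is better on every measure are all illustrative, not taken from the paper.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every measure and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_layers(scores: Dict[str, Sequence[float]]) -> List[List[str]]:
    """Group algorithms into successive Pareto fronts (layer 0 is the best)."""
    remaining = dict(scores)
    layers: List[List[str]] = []
    while remaining:
        # An algorithm is on the current front if no other remaining one dominates it.
        front = sorted(
            name
            for name, s in remaining.items()
            if not any(dominates(t, s) for other, t in remaining.items() if other != name)
        )
        layers.append(front)
        for name in front:
            del remaining[name]
    return layers


# Hypothetical (accuracy-like, ranking-quality-like) scores for four algorithms.
example = {"A": (0.9, 0.5), "B": (0.5, 0.9), "C": (0.4, 0.4), "D": (0.3, 0.2)}
print(pareto_layers(example))  # prints [['A', 'B'], ['C'], ['D']]
```

Algorithms in the same layer are treated as tied, which is what lets a multicriteria metatarget avoid forcing an arbitrary order between, say, A and B above.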
RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data
With the improvement of medical data capturing, vast amounts of continuous patient monitoring data, e.g., electrocardiogram (ECG), real-time vital signs and medications, have become available for clinical decision support at intensive care units (ICUs). However, it becomes increasingly challenging to model such data, due to the high density of the monitoring data, heterogeneous data types and the requirement for interpretable models. Integration of these high-density monitoring data with the discrete clinical events (including diagnosis, medications, labs) is challenging but potentially rewarding, since richness and granularity in such multimodal data increase the possibilities for accurate detection of complex problems and predicting outcomes (e.g., length of stay and mortality). We propose the Recurrent Attentive and Intensive Model (RAIM) for jointly analyzing continuous monitoring data and discrete clinical events. RAIM introduces an efficient attention mechanism for continuous monitoring data (e.g., ECG), which is guided by discrete clinical events (e.g., medication usage). We apply RAIM to predicting physiological decompensation and length of stay for critically ill patients in the ICU. With evaluations on the MIMIC-III Waveform Database Matched Subset, we obtain an AUC-ROC score of 90.18% for predicting decompensation and an accuracy of 86.82% for forecasting length of stay with our final model, which outperforms our six baseline models.
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
There is a trend towards using very large deep neural networks (DNN) to improve the accuracy of complex machine learning tasks. However, the size of DNN models that can be explored today is limited by the amount of GPU device memory. This paper presents Tofu, a system for partitioning very large DNN models across multiple GPU devices. Tofu is designed for a tensor-based dataflow system: for each operator in the dataflow graph, it partitions its input/output tensors and parallelizes its execution across workers. Tofu can automatically discover how each operator can be partitioned by analyzing its semantics expressed in a simple specification language. Tofu uses a search algorithm based on dynamic programming to determine the best partition strategy for each operator in the entire dataflow graph. Our experiments on an 8-GPU machine show that Tofu enables the training of very large CNN and RNN models. It also achieves better performance than alternative approaches to train very large models on multiple GPUs.
A Structured Perspective of Volumes on Active Learning
Active Learning (AL) is a learning task that requires learners to interactively query the labels of the sampled unlabeled instances to minimize the training outputs with human supervision. In theoretical studies, learners approximate the version space, which covers all possible classification hypotheses, into a bounded convex body and try to shrink its volume into a half-space by a given cut size. However, only the hypersphere with finite VC dimensions has obtained formal approximation guarantees that hold when the classes of Euclidean space are separable with a margin. In this paper, we approximate the version space to a structured hypersphere that covers most of the hypotheses, and then divide the available AL sampling approaches into two kinds of strategies: Outer Volume Sampling and Inner Volume Sampling. After providing provable guarantees for the performance of AL in version space, we aggregate the two kinds of volumes to eliminate their sampling biases via finding the optimal inscribed hyperspheres in the enclosing space of outer volume. To touch the version space from Euclidean space, we propose a theoretical bridge called Volume-based Model that increases the ‘sampling target-independent’. In non-linear feature space, spanned by kernel, we use sequential optimization to globally optimize the original space to a sparse space by halving the size of the kernel space. Then, the EM (Expectation Maximization) model which returns the local center helps us to find a local representation. To describe this process, we propose an easy-to-implement algorithm called Volume-based AL (VAL).
Anomaly Detection of Complex Networks Based on Intuitionistic Fuzzy Set Ensemble
Ensemble learning for anomaly detection of data structured into complex networks has been barely studied due to the inconsistent performance of complex network characteristics and the lack of an inherent objective function. In this paper, we propose IFSAD, a new two-phase ensemble method for anomaly detection based on intuitionistic fuzzy sets, and apply it to the abnormal behavior detection problem in temporal complex networks. First, it constructs the intuitionistic fuzzy set of a single network characteristic, which quantifies the degree of membership, non-membership and hesitation of each network characteristic to the defined linguistic variables, so that unhelpful or noisy characteristics become part of the detection. To build an objective intuitionistic fuzzy relationship, we propose a Gaussian distribution-based membership function which gives a variable hesitation degree. Then, for the fuzzification of multiple network characteristics, the intuitionistic fuzzy weighted geometric operator is adopted to fuse multiple IFSs and to avoid inconsistency among multiple characteristics. Finally, the score function and precision function are used to sort the fused IFS. We carried out extensive experiments on several complex network datasets for anomaly detection, and the results demonstrate the superiority of our method to state-of-the-art approaches, validating the effectiveness of our method.
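To make the fusion step more concrete, here is a small sketch of the intuitionistic fuzzy weighted geometric (IFWG) operator in its standard textbook form, followed by ranking with a score function. It assumes that the paper uses this standard definition and that its "precision function" corresponds to the usual accuracy function (membership plus non-membership); neither is confirmed by the abstract, and all names and values below are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple

# An intuitionistic fuzzy number: (membership, non-membership), with their sum <= 1.
IFN = Tuple[float, float]


def ifwg(values: Sequence[IFN], weights: Sequence[float]) -> IFN:
    """Standard IFWG operator: weights are assumed non-negative and summing to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mu = math.prod(m ** w for (m, _), w in zip(values, weights))
    nu = 1.0 - math.prod((1.0 - n) ** w for (_, n), w in zip(values, weights))
    return mu, nu


def score(x: IFN) -> float:
    """Score function: membership minus non-membership."""
    return x[0] - x[1]


def accuracy(x: IFN) -> float:
    """Accuracy function (assumed to be the abstract's 'precision function')."""
    return x[0] + x[1]


def rank(fused: Dict[str, IFN]) -> List[str]:
    """Sort items by score, breaking ties by accuracy, highest first."""
    return sorted(fused, key=lambda k: (score(fused[k]), accuracy(fused[k])), reverse=True)


# Two hypothetical network characteristics for one node, equally weighted.
node_ifn = ifwg([(0.8, 0.1), (0.2, 0.6)], [0.5, 0.5])
print(node_ifn)
```

The hesitation degree of a fused value is whatever remains, `1 - mu - nu`, so characteristics with high hesitation pull the fused score toward zero rather than being discarded outright.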
Anomaly detection in static networks using egonets
Network data has rapidly emerged as an important and active area of statistical methodology. In this paper we consider the problem of anomaly detection in networks. Given a large background network, we seek to detect whether there is a small anomalous subgraph present in the network, and if such a subgraph is present, which nodes constitute the subgraph. We propose an inferential tool based on egonets to answer this question. The proposed method is computationally efficient and naturally amenable to parallel computing, and easily extends to a wide variety of network models. We demonstrate through simulation studies that the egonet method works well under a wide variety of network models. We obtain some fascinating empirical results by applying the egonet method on several well-studied benchmark datasets.
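For readers unfamiliar with the term, an egonet of a node is the subgraph induced by that node and its immediate neighbours. The abstract does not say which egonet statistic the authors test, so the sketch below only shows how egonets can be extracted from an adjacency map and summarized by a simple edge count; the edge count is a hypothetical stand-in for whatever statistic the paper actually uses.

```python
from typing import Dict, Set


def egonet_members(adj: Dict[int, Set[int]], node: int) -> Set[int]:
    """The node itself plus its direct neighbours."""
    return {node} | adj[node]


def egonet_edge_count(adj: Dict[int, Set[int]], node: int) -> int:
    """Number of undirected edges inside the egonet of `node`."""
    members = egonet_members(adj, node)
    # Count each undirected edge once by only keeping pairs with u < v.
    return sum(1 for u in members for v in adj[u] if v in members and u < v)


# Toy undirected graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
for v in graph:
    print(v, egonet_edge_count(graph, v))
```

Because each egonet is computed from one node's neighbourhood alone, the loop over nodes can be split across workers, which is consistent with the abstract's remark that the method is amenable to parallel computing.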
A Note on Clustering Aggregation
Asymptotically Optimal Quickest Change Detection In Multistream Data – Part 1: General Stochastic Models
Assume that there are multiple data streams (channels, sensors) and in each stream the process of interest produces generally dependent and non-identically distributed observations. When the process is in a normal mode (in-control), the (pre-change) distribution is known, but when the process becomes abnormal there is a parametric uncertainty, i.e., the post-change (out-of-control) distribution is known only partially up to a parameter. Both the change point and the post-change parameter are unknown. Moreover, the change affects an unknown subset of streams, so that the number of affected streams and their location are unknown in advance. A good changepoint detection procedure should detect the change as soon as possible after its occurrence while controlling the risk of false alarms. We consider a Bayesian setup with a given prior distribution of the change point and propose two sequential mixture-based change detection rules: one mixes a Shiryaev-type statistic over both the unknown subset of affected streams and the unknown post-change parameter, and the other mixes a Shiryaev-Roberts-type statistic. These rules generalize the mixture detection procedures studied by Tartakovsky (2018) in a single-stream case. We provide sufficient conditions under which the proposed multistream change detection procedures are first-order asymptotically optimal with respect to moments of the delay to detection as the probability of false alarm approaches zero.
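As background for the Shiryaev-Roberts-type statistic mentioned above, the sketch below implements the classical single-stream recursion R_n = (1 + R_{n-1}) * LR_n with R_0 = 0, raising an alarm once R_n crosses a threshold. It is deliberately much simpler than the paper's setting: it assumes one stream, independent Gaussian observations with unit variance, and a known post-change mean `theta`, with no mixing over streams or parameters. The function name and the numbers are illustrative.

```python
import math
from typing import Iterable, Optional


def shiryaev_roberts_alarm(xs: Iterable[float], theta: float, threshold: float) -> Optional[int]:
    """Return the 1-based index of the first alarm, or None if no alarm is raised.

    Assumes N(0, 1) observations before the change and N(theta, 1) after it.
    """
    r = 0.0
    for n, x in enumerate(xs, start=1):
        # Likelihood ratio of N(theta, 1) against N(0, 1) for a single observation.
        lr = math.exp(theta * x - 0.5 * theta * theta)
        r = (1.0 + r) * lr
        if r >= threshold:
            return n
    return None


# Hypothetical noiseless stream: the mean jumps from 0 to 2 after the fifth observation.
stream = [0.0] * 5 + [2.0] * 10
print(shiryaev_roberts_alarm(stream, theta=1.0, threshold=20.0))
```

A mixture rule in the spirit of the abstract would replace the single likelihood ratio with a weighted average over candidate subsets of streams and candidate parameter values; that extension is not attempted here.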
Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Collective Matrix Completion
Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommender system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then, we relax the assumption of exponential family distribution for the noise and we investigate the distribution-free case. In this setting, we do not assume any specific model for the observations. The estimation procedures are based on minimizing the sum of a goodness-of-fit term and the nuclear norm penalization of the whole collective matrix. We prove that the proposed estimators achieve fast rates of convergence under the two considered settings and we corroborate our results with numerical experiments.
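The estimation procedure described in the abstract has the general shape of a penalized estimator. Written out, with notation that is assumed here rather than taken from the paper (M for the collective matrix, Y for the observed entries, a generic goodness-of-fit term, and a tuning parameter lambda > 0), it reads roughly as:

```latex
\hat{M} \in \operatorname*{arg\,min}_{M} \; \mathcal{L}(M; Y) + \lambda \, \lVert M \rVert_{*},
\qquad
\lVert M \rVert_{*} = \sum_{k} \sigma_{k}(M)
```

Here the sigma_k(M) are the singular values of M, so the nuclear norm penalty favours low-rank solutions across all sources jointly. The specific form of the goodness-of-fit term in each of the two settings is not given in the abstract.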
Uncertainty Modelling in Deep Networks: Forecasting Short and Noisy Series
Deep Learning is a consolidated, state-of-the-art Machine Learning tool to fit a function when provided with large data sets of examples. However, in regression tasks, the straightforward application of Deep Learning models provides a point estimate of the target. In addition, the model does not take into account the uncertainty of a prediction. This represents a great limitation for tasks where communicating an erroneous prediction carries a risk. In this paper we tackle a real-world problem of forecasting impending financial expenses and incomings of customers, while displaying predictable monetary amounts on a mobile app. In this context, we investigate if we would obtain an advantage by applying Deep Learning models with a Heteroscedastic model of the variance of a network’s output. Experimentally, we achieve a higher accuracy than non-trivial baselines. More importantly, we introduce a mechanism to discard low-confidence predictions, which means that they will not be visible to users. This should help enhance the user experience of our product.
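The abstract does not give the loss function or the rejection rule, so the following is only a sketch of one widely used way to combine the two ideas: train the network to output a mean and a log-variance per prediction under a Gaussian negative log-likelihood, then hide predictions whose predicted standard deviation exceeds a cut-off. The function names and the cut-off value are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def gaussian_nll(y: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    return 0.5 * (log_var + (y - mean) ** 2 / math.exp(log_var) + math.log(2.0 * math.pi))


def keep_confident(predictions: List[Tuple[float, float]], max_std: float) -> List[Tuple[float, float]]:
    """Keep (mean, log_var) pairs whose predicted standard deviation is at most max_std."""
    return [(m, lv) for m, lv in predictions if math.exp(0.5 * lv) <= max_std]


# Hypothetical network outputs: two forecasts with predicted std 1.0 and 3.0.
outputs = [(10.0, 0.0), (5.0, math.log(9.0))]
print(keep_confident(outputs, max_std=2.0))
```

Predicting the log-variance rather than the variance itself keeps the variance positive without any constraint on the network output, which is why this parameterization is a common choice.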
• Constraint-Based Visual Generation
• Every square can be tiled with T-tetrominos and no more than 5 monominos
• The Power of One Clean Qubit in Communication Complexity
• $S_{12}$ and $P_{12}$-colorings of cubic graphs
• On the Geodetic Hull Number of Complementary Prisms
• Finite Time Adaptive Stabilization of LQ Systems
• Mitigation of Human RF Exposure in 5G Downlink
• Sleep Staging by Modeling Sleep Stage Transitions using Deep CRF
• Human peripheral blur is optimal for object recognition
• Necessary and Sufficient Topological Conditions for Identifiability of Dynamical Networks
• A Cognitive Sub-Nyquist MIMO Radar Prototype
• Clearing noisy annotations for computed tomography imaging
• Identity Preserving Face Completion for Large Ocular Region Occlusion
• Maximum rank-distance codes with maximum left and right idealisers
• Peeking Behind Objects: Layered Depth Prediction from a Single Image
• Two Algorithms to Find Primes in Patterns
• Fast Vessel Segmentation and Tracking in Ultra High-Frequency Ultrasound Images
• A Study on the Strong Duality of Conic Relaxation of AC Optimal Power Flow in Radial Networks
• Dynamics of Langton’s ant allowed to periodically go straight
• PCNNA: A Photonic Convolutional Neural Network Accelerator
• Theta-vexillary signed permutations
• Runoff on rooted trees
• Time and place of the maximum for one-dimensional diffusion bridges and meanders
• A Faster Deterministic Distributed Algorithm for Weighted APSP Through Pipelining
• Hierarchical Classification using Binary Data
• Stable Multiple Time Step Simulation/Prediction from Lagged Dynamic Network Regression Models
• Fisher Information and Logarithmic Sobolev Inequality for Matrix Valued Functions
• Characterizing health informatics journals by subject-level dependencies: a citation network analysis
• Lesion segmentation using U-Net network
• The $g$-good neighbor conditional diagnosability of locally exchanged twisted cubes
• Complex self-sustained oscillation patterns in modular excitable networks
• Exact solution of some quarter plane walks with interacting boundaries
• Weak in the NEES?: Auto-tuning Kalman Filters with Bayesian Optimization
• Toward a language-theoretic foundation for planning and filtering
• StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
• One for All, All for One: A Heterogeneous Data Plane for Flexible P4 Processing
• On the number of simultaneous core partitions with $d$-distinct parts
• Proximal Averages for Minimization of Entropy Functionals
• Top-Down Feedback for Crowd Counting Convolutional Neural Network
• Sublinear Algorithms for $(\Delta+1)$ Vertex Coloring
• An Efficient System for Subgraph Discovery
• Skin Lesion Segmentation Using Atrous Convolution via DeepLab v3
• ClusterNet: Instance Segmentation in RGB-D Images
• State-space analysis of an Ising model reveals contributions of pairwise interactions to sparseness, fluctuation, and stimulus coding of monkey V1 neurons
• Self-produced Guidance for Weakly-supervised Object Localization
• Traffic-Aware Backscatter Communications in Wireless-Powered Heterogeneous Networks
• Pilot Spoofing Attack by Multiple Eavesdroppers
• Meta-Learning Priors for Efficient Online Bayesian Regression
• Variation of a Signal in Schwarzschild Spacetime
• Panchromatic Sharpening of Remote Sensing Images Using a Multi-scale Approach
• The Variational Homoencoder: Learning to learn high capacity generative models from few examples
• Competitive Inner-Imaging Squeeze and Excitation for Residual Network
• A decision theoretic approach to model evaluation in computational drug discovery
• Bivariate network meta-analysis for surrogate endpoint evaluation
• CReaM: Condensed Real-time Models for Depth Prediction using Convolutional Neural Networks
• Remark on Barnette’s Conjecture
• SAAGs: Biased Stochastic Variance Reduction Methods
• Combining Heterogeneously Labeled Datasets For Training Segmentation Networks
• Semiparametric Slepian-Bangs Formula for Complex Elliptically Symmetric Distributions
• On Brownian exit times from some non-convex domains
• A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior
• Example Mining for Incremental Learning in Medical Imaging
• Hyperspectral Images Classification Using Energy Profiles of Spatial and Spectral Features
• Otem: Over- and Under-Translation Evaluation Metric for NMT
• Dermoscopic Image Analysis for ISIC Challenge 2018
• Weak input-to-state stability: characterizations and counterexamples
• Convex computation of extremal invariant measures of nonlinear dynamical systems and Markov processes
• The Double Sphere Camera Model
• Space-Time Extension of the MEM Approach for Electromagnetic Neuroimaging
• Note on the zero-free region of the hard-core model
• Contrast function estimation for the drift parameter of ergodic jump diffusion process
• Computational speedups using small quantum devices
• Statistical Characterization of Second Order Scattering Fading Channels
• Asymptotic Optimality of Mixture Rules for Detecting Changes in General Stochastic Models
• On the equality of the induced matching number and the uniquely restricted matching number for subcubic graphs
• Utility maximization for Lévy switching models
• Strong convergence rates of modified truncated EM methods for neutral stochastic differential delay equations
• Composite likelihood estimation for a Gaussian process under fixed domain asymptotics
• Deep-CLASS at ISIC Machine Learning Challenge 2018
• Spatial growth processes with long range dispersion: microscopics, mesoscopics, and discrepancy in spread rate
• Stabilization of an unstable wave equation using an infinite dimensional dynamic controller
• Speakers account for asymmetries in visual perspective so listeners don’t have to
• Rule Based Metadata Extraction Framework from Academic Articles
• Optimal control of resources for species survival
• The periodic Schur process and free fermions at finite temperature
• Exploring Tehran with excitable medium
• On critical and maximal digraphs
• Bounding the Number of Minimal Transversals in Tripartite 3-Uniform Hypergraphs
• Behavior of the empirical Wasserstein distance in $\mathbb{R}^d$ under moment conditions
• Connected greedy coloring $H$-free graphs
• Dynamic Optimization of Thermodynamically Rigorous Models of Multiphase Flow in Porous Subsurface Oil Reservoirs
• Likelihood-based meta-analysis with few studies: Empirical and simulation studies
• Are RLL Codes Suitable for Simultaneous Energy and Information Transfer?
• Mean asymptotics for a Poisson-Voronoi cell on a Riemannian manifold
• Shortfall-Minimising Dispatch of Heterogeneous Stores and Application to Adequacy Studies
• Transient Performance of Electric Power Networks under Colored Noise
• On complexity of post-processing in analyzing GATE-driven X-ray spectrum
• CaricatureShop: Personalized and Photorealistic Caricature Sketching
• Height and contour processes of Crump-Mode-Jagers forests (II): The Bellman-Harris universality class
• Remarks on the transcendence of certain infinite products
• Feature Fusion through Multitask CNN for Large-scale Remote Sensing Image Segmentation
• Learning Human Poses from Actions
• On consistency and inconsistency of nonparametric tests
• Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations
• An entropy minimization approach to second-order variational mean-field games
• ISIC 2017 Skin Lesion Segmentation Using Deep Encoder-Decoder Network
• The Möbius function of $PSU(3, 2^{2^n})$
• Decision Variance in Online Learning
• Revisiting the Challenges of MaxClique
• Symplectic Isometries of Stabilizer Codes
• Noncoherent Multi-User MIMO Communications using Covariance CSIT
• Chromosome Painting
• Moderate deviations for a stochastic Burgers equation
• Cameron-Liebler line classes of $PG(3,q)$ admitting $PGL(2,q)$
• Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition
• Zeros of Holant problems: locations and algorithms
• Projected Stochastic Gradients for Convex Constrained Problems in Hilbert Spaces
• Minimum supports of functions on the Hamming graphs with spectral constrains
• Symmetries in left-invariant optimal control problems
• Inexact Variable Metric Stochastic Block-Coordinate Descent for Regularized Optimization
• Collaborative double robustness using the $e$-score
• The realization problem for discrete Morse functions on trees
• Residual Network based Aggregation Model for Skin Lesion Classification
• Coagulation-transport equations and the nested coalescents
• QUEST: Quadriletral Senary bit Pattern for Facial Expression Recognition
• The Soft Multivariate Truncated Normal Distribution
• Robust Group Comparison Using Non-Parametric Block-Based Statistics
• An argument in favor of strong scaling for deep neural networks with small datasets
• Partial Person Re-identification with Alignment and Hallucination
• Skin disease identification from dermoscopy images using deep convolutional neural network
• Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation
• Robot Imitation through Vision, Kinesthetic and Force Features with Online Adaptation to Changing Environments
• Strong randomness criticality in the scratched-XY model
• PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation
• Multicolumn Networks for Face Recognition
• Chromatic transitions in the emergence of syntax networks
• A convex formulation for Discrete Tomography
• Self-Paced Learning with Adaptive Deep Visual Embeddings
• Theoretical Perspective of Convergence Complexity of Evolutionary Algorithms Adopting Optimal Mixing
• Face Mask Extraction in Video Sequence
• Deterministic Fitting of Multiple Structures using Iterative MaxFS with Inlier Scale Estimation and Subset Updating
• Hardware-In-The-Loop Vulnerability Analysis of a Single-Machine Infinite-Bus Power System
• Multi-Class Lesion Diagnosis with Pixel-wise Classification Network
• Combinatorics of the Deodhar decomposition of the Grassmannian
• Deep Learning on Retina Images as Screening Tool for Diagnostic Decision Support
• Improving pairwise comparison models using Empirical Bayes shrinkage
• Hierarchical infinite factor model for improving the prediction of surgical complications for geriatric patients
• Markov semi-groups associated with the complex unimodular group $Sl(2,\mathbb{C})$
• Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks
• Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks
• Likely equilibria of the stochastic Rivlin cube
• GANimation: Anatomically-aware Facial Animation from a Single Image
• Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision
• Time Correlation Exponents in Last Passage Percolation
