https://cmsworkshops.com/ICASSP2020/TechnicalProgram.asp
by
Yonina Eldar, Vince Poor and Nir Shlezinger
尤尼娜·艾尔达、文斯·普尔和尼尔·什莱辛格
Weizmann Institute of Science, Princeton University, Weizmann Institute of Science
威斯曼科学研究所,普林斯顿大学,威斯曼科学研究所。
Mobile communications and machine learning are two of the most exciting and rapidly developing technological fields of our time. In the past few years, these two fields have begun to merge in two fundamental ways. First, while mobile communications has developed largely as a model-driven field, the complexities of many emerging communication scenarios raise the need to introduce data-driven methods into the design and analysis of mobile networks. Second, many machine learning problems are by their nature distributed due to either physical limitations or privacy concerns. This distributed nature can be exploited by using mobile networks as part of the learning mechanisms, i.e., as platforms for machine learning.
移动通信和机器学习是当今最令人兴奋和发展最快的两个技术领域。在过去的几年里,这两个领域已经开始以两种基本的方式融合。首先,虽然移动通信在很大程度上已经发展成为一个模型驱动的领域,但许多新兴通信场景的复杂性使得需要在移动网络的设计和分析中引入数据驱动方法。其次,许多机器学习问题由于物理限制或隐私问题的性质而分布。利用移动网络作为学习机制的一部分,即作为机器学习的平台,可以利用这种分布式性质。
In this tutorial we will illuminate these two perspectives, presenting a representative set of relevant problems which have been addressed in the recent literature, and discussing the multitude of exciting research directions which arise from the combination of machine learning and wireless communications. We will begin with the application of machine learning methods for optimizing wireless networks. Here, we will first survey some of the challenges in communication networks which can be treated using machine learning tools. Then, we will focus on one of the fundamental problems in digital communications: receiver design. We will review different designs of data-driven receivers, and discuss how they can be related to conventional and emerging approaches for combining machine learning and model-based algorithms. We will conclude this part of the tutorial with a set of communication-related problems which can be tackled in a data-driven manner.
在本教程中,我们将阐述这两个观点,介绍一组在最近的文献中已经解决的具有代表性的相关问题,并讨论由机器学习和无线通信相结合而产生的众多令人兴奋的研究方向。我们将从应用机器学习方法优化无线网络开始:在这里,我们将首先调查通信网络中的一些挑战,这些挑战可以使用机器学习工具来处理。然后,我们将集中讨论数字通信中的一个基本问题——接收机设计。我们将回顾数据驱动接收器的不同设计,并讨论它们如何与用于结合机器学习和基于模型的算法的传统和新兴方法相关。我们将以一组与通信相关的问题结束本教程的这一部分,这些问题可以用数据驱动的方式解决。
The second part of the tutorial will be dedicated to wireless networks as a platform for machine learning. We will discuss communication issues arising in distributed learning problems such as federated learning and collaborative learning. We will explain how established communications and coding methods can contribute to the development of these emerging distributed learning technologies, illustrating these ideas through examples from recent research in the field. We will conclude with a set of open machine learning problems which we believe can be tackled using established communications and signal processing techniques.
本教程的第二部分将专门介绍作为机器学习平台的无线网络:我们将讨论分布式学习问题中出现的通信问题,如联合学习和协作学习。我们将解释已建立的通信和编码方法如何有助于这些新兴的分布式学习技术的发展,并通过该领域最新研究的实例说明这些想法。最后,我们将提出一系列与开放式机器学习相关的问题,我们相信这些问题可以通过现有的通信和信号处理技术来解决。
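As a rough illustration of the federated learning setting mentioned above, the sketch below runs a few rounds of federated averaging on a toy one-parameter linear model. It is not taken from the tutorial: the data, the learning rate, and the function names `local_update` and `federated_average` are assumptions made for illustration. The point is only that each device sends its locally updated weight to the server, never its raw data.

```python
# Toy federated averaging (FedAvg) on the scalar model y = w * x.
# All data and hyperparameters are made up for illustration.

def local_update(w, data, lr=0.05, steps=5):
    """Run a few gradient-descent steps on one device's local data."""
    for _ in range(steps):
        # Gradient of the mean squared error (1/n) * sum((w*x - y)^2)
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(w_global, clients, rounds=20):
    """Server loop: broadcast w, collect local updates, average by data size."""
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        local_weights = [local_update(w_global, data) for data in clients]
        w_global = sum(len(d) * w for d, w in zip(clients, local_weights)) / total
    return w_global

# Three devices hold disjoint samples generated by y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (0.2, 0.6)],
]
w_final = federated_average(0.0, clients)
print(f"learned weight: {w_final:.4f}")
```

In a wireless deployment, the exchange of `local_weights` in every round is exactly the traffic that communication and coding methods would need to carry efficiently.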
by
Wojciech Samek and Felix Sattler
萨梅克和萨特勒
Fraunhofer Heinrich Hertz Institute
弗劳恩霍夫海因里希赫兹研究所
Deep neural networks have recently demonstrated their incredible ability to solve complex tasks. Today’s models are trained on millions of examples using powerful GPU cards and are able to reliably annotate images, translate text, understand spoken language or play strategic games such as chess or Go. Furthermore, deep learning will also be an integral part of many future technologies, e.g., autonomous driving, the Internet of Things (IoT) or 5G networks. Especially with the advent of IoT, the number of intelligent devices has grown rapidly in the last couple of years. Many of these devices are equipped with sensors that allow them to collect and process data at unprecedented scales. This opens unique opportunities for deep learning methods.
深度神经网络最近已经证明了它们解决复杂任务的不可思议的能力。今天的模型使用强大的GPU卡在数百万个例子上进行训练,能够可靠地注释图像、翻译文本、理解口语或玩象棋或围棋等战略游戏。此外,深度学习也将是许多未来技术的组成部分,例如,自主驾驶、物联网(IoT)或5G网络。特别是随着物联网的出现,智能设备的数量在过去几年里迅速增长。这些设备中的许多都配备了传感器,使它们能够以前所未有的规模收集和处理数据。这为深度学习方法提供了独特的机会。
However, these new applications come with several additional constraints and requirements, which limit the out-of-the-box use of current models.
然而,这些新的应用程序附带了一些额外的约束和要求,这些约束和要求限制了当前模型的开箱即用。
Embedded devices, IoT gadgets and smartphones have limited memory & storage capacities and restricted energy resources. Deep neural networks such as VGG-16 require over 500 MB for storing the parameters and up to 15 giga-operations for performing a single forward pass. Such models in their current (uncompressed) form cannot be used on-device.
嵌入式设备、物联网设备和智能手机的内存和存储容量有限,能源资源有限。VGG-16这样的深层神经网络需要500兆以上的内存来存储参数,执行一次前向传递需要高达15千兆的操作。这些当前(未压缩)形式的模型不能在设备上使用。
Training data is often distributed over devices and cannot simply be collected at a central server due to privacy issues or limited resources (bandwidth). Since a local training of the model with only few data points is often not promising, new collaborative training schemes are needed to bring the power of deep learning to these distributed applications.
训练数据通常分布在设备上,由于隐私问题或资源(带宽)有限,不能简单地在中央服务器上收集。由于只有很少数据点的模型的局部训练往往是不可能的,因此需要新的协作训练方案来为这些分布式应用带来深度学习的能力。
This tutorial will discuss recently proposed techniques to tackle these two problems.
本教程将讨论最近提出的解决这两个问题的技术。
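The VGG-16 figures quoted above can be checked with a short back-of-the-envelope calculation. The sketch below assumes the standard VGG-16 layout (thirteen 3x3 convolutions, five 2x2 poolings, three fully connected layers, 224x224 RGB input, 1000 classes), 32-bit floating-point weights, and counts one multiply-accumulate as one operation; these conventions are assumptions, since the text does not state how the numbers were obtained.

```python
# Rough size of VGG-16: parameter count and multiply-accumulates (MACs)
# for one forward pass on a single 224x224 RGB image.

# 3x3 convolution output channels; "M" marks a 2x2 max-pooling layer.
CONV_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
            512, 512, 512, "M", 512, 512, 512, "M"]
FC_CFG = [4096, 4096, 1000]  # fully connected layer widths

def vgg16_cost(side=224, in_ch=3):
    """Return (parameter count, MACs per forward pass)."""
    params = macs = 0
    for layer in CONV_CFG:
        if layer == "M":
            side //= 2                     # pooling halves the resolution
            continue
        weights = 3 * 3 * in_ch * layer
        params += weights + layer          # kernel weights plus biases
        macs += weights * side * side      # one kernel application per pixel
        in_ch = layer
    features = in_ch * side * side         # flattened 512 x 7 x 7 feature map
    for width in FC_CFG:
        params += features * width + width
        macs += features * width
        features = width
    return params, macs

params, macs = vgg16_cost()
print(f"parameters: {params:,}")
print(f"float32 storage: {params * 4 / 1e6:.0f} MB")
print(f"MACs per forward pass: {macs / 1e9:.1f} G")
```

Most of the parameters sit in the first fully connected layer, while most of the operations are spent in the convolutions, which is one reason why compression methods often treat the two layer types differently.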
by
Geert Leus, Elvin Isufi and Mario Coutino
格尔特·勒乌斯、埃尔文·伊斯菲和马里奥·库蒂诺
TU Delft
荷兰代尔夫特理工大学
Although processing and analyzing audio, images and video is still of great importance in current society, more and more data originates from networks with an irregular structure, e.g., social networks, brain networks, sensor networks, and communication networks, to name a few. To handle such signals, graph signal processing has recently emerged as a suitable tool set. In graph signal processing, the irregular structure of the network is captured by means of a graph, and the data is viewed as a signal on top of this graph, i.e., a graph signal. Graph signal processing extends concepts and tools from classical signal processing to the field of graph signals, e.g., the Fourier transform, filtering, sampling, stationarity, etc. Since many researchers and engineers now work in the field of network data processing, this tutorial is attractive, timely and critical. Further, most existing tutorials in this field focus on the basics of graph signal processing. Hence, it is urgent to go one step beyond and discuss the latest advances in graph signal processing as well as connections to the exciting fields of distributed optimization and neural networks, both of which draw inspiration from fundamental signal processing techniques.
尽管音频、图像和视频的处理和分析在当今社会仍然具有重要的意义,但越来越多的数据来自于结构不规则的网络,例如社交网络、大脑网络、传感器网络和通信网络等等。为了处理这些信号,图形信号处理最近被创造成一个合适的工具集。在图形信号处理中,网络的不规则结构是通过一个图形来捕获的,数据被看作是这个图形上的一个信号,即一个图形信号。图形信号处理将概念和工具从经典的信号处理扩展到图形信号领域,如傅立叶变换、滤波、采样、平稳性等。由于目前许多研究人员和工程师在网络数据处理领域工作,本教程具有吸引力、及时性和批判性。此外,这一领域的大多数现有教程都侧重于图形信号处理的基础知识。因此,迫切需要超越和讨论图形信号处理的最新进展,以及与分布式优化和神经网络这两个激动人心的领域的联系,这两个领域都从基本的信号处理技术中得到启发。
More specifically, in this tutorial, we will emphasize the concept of graph filtering, one of the cornerstones of the field of graph signal processing. Graph filters are direct analogues of time-domain filters but intended for signals defined on graphs. They find applications in image denoising, network data interpolation, signal and link prediction, learning of graph signals and building recommender systems. More recently, connections to distributed optimization as well as neural networks have been established. These last two applications rely heavily on core signal processing techniques such as iterative inversion algorithms and linear time-invariant filters. Graph filters extend these concepts to graphs, leading to key developments in distributed optimization and neural networks.
更具体地说,在本教程中,我们将强调图形滤波的概念,图形信号处理领域的基石之一。图滤波器是时域滤波器的直接类似物,但用于图上定义的信号。它们在图像去噪、网络数据插值、信号和链路预测、图形信号学习和建立推荐系统等方面有着广泛的应用。最近,分布式优化和神经网络已经建立了联系。最后两个应用严重依赖于核心信号处理技术,如迭代反演算法和线性时不变滤波器。图过滤器将这些概念扩展到图,导致了分布式优化和神经网络的关键发展。
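The analogy between graph filters and time-domain filters can be made concrete with a small sketch. It assumes the common polynomial form of a graph filter, y = h0·x + h1·Sx + h2·S²x + …, where S is a graph shift operator such as the adjacency matrix; the function names and the example graph are illustrative and not taken from the tutorial.

```python
# Graph filter as a polynomial in the graph shift operator S.
# Plain Python lists are used so the sketch needs no external packages.

def shift(S, x):
    """Apply the shift operator once: (S x)[i] = sum_j S[i][j] * x[j]."""
    return [sum(s_ij * x_j for s_ij, x_j in zip(row, x)) for row in S]

def graph_filter(S, h, x):
    """Apply the polynomial graph filter with taps h to the graph signal x."""
    y = [0.0] * len(x)
    z = list(x)                      # z holds S^k x, starting with k = 0
    for tap in h:
        y = [y_i + tap * z_i for y_i, z_i in zip(y, z)]
        z = shift(S, z)              # each extra power is one neighbor exchange
    return y

# Directed cycle on 4 nodes: node i receives from node i - 1 (mod 4).
n = 4
S_cycle = [[1.0 if j == (i - 1) % n else 0.0 for j in range(n)]
           for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.5]                       # two-tap moving average
y = graph_filter(S_cycle, h, x)

# On the cycle graph the result equals a circular time-domain FIR filter.
fir = [h[0] * x[i] + h[1] * x[(i - 1) % n] for i in range(n)]
print(y, fir)
```

Replacing `S_cycle` with the adjacency matrix of an irregular graph leaves `graph_filter` unchanged, and because every application of `shift` only combines values from neighboring nodes, the same recursion can be run in a distributed fashion.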
by
Ingrid Daubechies, Pier Luigi Dragotti, Nathan Daly, Catherine Higgitt and Miguel Rodrigues
英格丽·多贝基斯,皮尔·路易吉·德拉戈蒂,内森·戴利,凯瑟琳·希吉特和米格尔·罗德里格斯
Duke University, National Gallery, University College London
杜克大学,国家美术馆,伦敦大学学院
The cultural heritage sector is experiencing a digital revolution driven by the growing adoption of non-invasive, non-destructive spectroscopic imaging approaches generating multi-dimensional data from entire artworks. Such approaches include ‘macro X-ray fluorescence’ (MA-XRF) scanning or hyper-spectral imaging (HSI) in the visible and infrared ranges and are highly complementary to more traditional broad band digital imaging techniques such as X-ray radiography (XRR) or infrared reflectography (IRR).
文化遗产部门正在经历一场数字革命,其驱动力是越来越多地采用非侵入性、非破坏性的光谱成像方法,从整个艺术品中生成多维数据。这些方法包括可见光和红外范围内的“宏观X射线荧光”(MA-XRF)扫描或高光谱成像(HSI),与更传统的宽带数字成像技术(如X射线照相术(XRR)或红外反射成像(IRR)高度互补。
This data, spanning both the spatial and spectral domains, holds information not only about materials at the surface of an artwork but also about sub-surface layers or features of interest otherwise invisible to the naked eye. The ability to interrogate the wealth of data yielded by these techniques can potentially provide insights into an artist’s materials, techniques and creative process; reveal the changing condition of an artwork over time and its restoration history; help inform strategies for the conservation and preservation of artworks; and, importantly, offer means by which to present artwork to the public in new ways.
这些数据横跨空间域和光谱域,包含了艺术品表面两种材料的信息,也包含了亚表面层或肉眼看不见的感兴趣特征的信息。对这些技术所产生的大量数据进行查询的能力,可以潜在地洞察艺术家的材料、技术和创作过程;揭示艺术作品随时间变化的状况及其修复历史;有助于为保护和保存艺术作品的策略提供信息;而且,重要的是,以新的方式向公众展示艺术品的方式。
However, doing this successfully calls for new, sophisticated signal and image processing tools capable of addressing various challenges associated with the analysis, interrogation, and processing of such massive multi-dimensional datasets. These challenges derive from the fact that paintings are very complex objects where:
然而,要成功地做到这一点,还需要新的复杂的信号和图像处理工具,能够应对与分析、查询和处理这种大规模多维数据集相关的各种挑战。这些挑战源于绘画是非常复杂的物体
Materials are often present in intimate mixtures applied over multi-layered systems so signals deriving from spectroscopic imaging techniques are highly nonlinearly mixed.
材料通常以多层系统上的紧密混合物存在,因此光谱成像技术产生的信号是高度非线性混合的。
Materials also age/degrade over time, so signals collected from spectroscopic imaging techniques often cannot be compared to signals present in reference libraries.
材料也会随着时间老化/降解,因此从光谱成像技术收集的信号不能经常与参考库中的信号进行比较。
A ‘ground-truth’ is often unavailable or limited because each painting is unique, with original materials often unknown, and materials’ aging process also unknown.
因为每幅作品都是独一无二的,原材料往往是未知的,材料的老化过程也未知,所以一个“基本真相”往往是不可能或有限的。
In addition, different spectroscopic imaging techniques reveal different details about artwork, so there is also a need to develop new signal and image processing tools combining different datasets in order to understand artwork.
此外,不同的光谱成像技术揭示了艺术品的不同细节,因此还需要开发结合不同数据集的新的信号和图像处理工具来理解艺术品。
This tutorial, offered by experts in applied mathematics, signal & image processing, machine learning, and heritage science, (a) reviews the state of the art in signal and image processing for art investigation, (b) reviews signal and image processing challenges arising in the examination of datasets acquired on artwork, and (c) gives an overview of emerging directions in signal processing for art investigation.
本教程由应用数学、信号与图像处理、机器学习和遗产科学等领域的专家提供:(a)回顾艺术调查的信号和图像处理的最新技术,(b)回顾检查艺术品上获得的数据集时出现的信号和图像处理挑战,以及(c)概述艺术调查信号处理的新方向。
by
Lajos Hanzo, Angela Sera Cacciapuoti and Marcello Caleffi
拉乔斯·汉佐、安吉拉·塞拉·卡恰普奥蒂和马塞洛·卡列菲
University of Southampton; University of Naples Federico II
南安普敦大学;那不勒斯大学费德里科二世
Moore’s law has indeed prevailed since Gordon Moore outlined his empirical rule of thumb in 1965, but based on this trend the scale of integration is set to depart from classical physics, entering nano-scale integration, where the postulates of quantum physics have to be obeyed. The quest for quantum-domain communication and processing solutions was inspired by Feynman’s revolutionary idea in 1985: particles such as photons or electrons might be relied upon for encoding, processing and delivering information. Hence, in the light of these trends, it is extremely timely to build an interdisciplinary momentum in the area of quantum signal processing and communications, where there is an abundance of open problems for a broad community to solve collaboratively. In this workshop-style interactive presentation we will address the following issues:
摩尔定律自1965年概述其经验经验经验法则以来,确实盛行,但基于这一趋势,积分的尺度设置偏离经典物理,进入纳米级积分,在那里必须遵守量子物理的假设。对量子域通信和处理解决方案的探索灵感来自费曼1985年的革命性想法:光子或电子等粒子可能被用来编码、处理和传递信息。因此,鉴于这些趋势,在量子信号处理和通信领域建立一种跨学科的势头是非常及时的,在这一领域,有大量的开放问题需要广大社区共同解决。在本次研讨会式互动演示中,我们将讨论以下问题:
We commence by highlighting the nature of the quantum channel, followed by techniques of mitigating the effects of quantum decoherence using quantum codes.
我们首先强调量子信道的性质,然后是利用量子码来减轻量子退相干效应的技术。
Then we bridge the subject areas of large-scale search problems in wireless communications and exploit the benefits of quantum search algorithms in multi-user detection, in joint-channel estimation and data detection, localization and in routing problems of networking, for example.
然后,我们将无线通信中大规模搜索问题的主题领域联系起来,并利用量子搜索算法在多用户检测、联合信道估计和数据检测、定位和网络路由问题等方面的优势。
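To give a feel for the quantum search algorithms mentioned above, the following sketch simulates Grover's algorithm classically on a small state vector. Grover's algorithm is used here as the textbook example of quantum search; whether the tutorial relies on it or on one of its variants is an assumption, and the function name and problem size are illustrative.

```python
import math

def grover_search(n_items, marked):
    """Simulate Grover's algorithm on a real-valued state vector."""
    # Start in the uniform superposition over all candidate indices.
    amp = [1.0 / math.sqrt(n_items)] * n_items
    iterations = int(math.pi / 4 * math.sqrt(n_items))
    for _ in range(iterations):
        amp[marked] = -amp[marked]              # oracle: flip the marked sign
        mean = sum(amp) / n_items
        amp = [2.0 * mean - a for a in amp]     # diffusion: invert about the mean
    return iterations, [a * a for a in amp]     # measurement probabilities

iterations, probs = grover_search(n_items=8, marked=5)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"{iterations} oracle calls, most likely index {best}, "
      f"probability {probs[best]:.3f}")
```

The number of oracle calls grows with the square root of the search-space size, whereas an unstructured classical search needs a number of evaluations proportional to the size itself. This is what makes the approach attractive for large search problems such as multi-user detection.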
by
Michael Fauß, Michael Muma and Abdelhak M. Zoubir
迈克尔·福尔、迈克尔·穆马和阿卜杜勒哈克·M·祖比尔
Princeton University, TU Darmstadt
普林斯顿大学,达姆施塔特工业大学
With rapid developments in signal processing and data analytics, driven by technological advances towards a more intelligent networked world, there is an ever-increasing need for reliable and robust information extraction and processing. Robust statistical methods account for the fact that the postulated models for the data are fulfilled only approximately and not exactly. In contrast to classical parametric procedures, robust methods are not significantly affected by small changes in the data, such as outliers or minor model departures. In practice, many engineering applications involve measurements that are not Gaussian and that may contain corrupted measurements and outliers, which cause the data distributions to be heavy-tailed. This leads to a breakdown in performance of traditional signal processing techniques that are based on Gaussian models.
随着信号处理和数据分析技术的迅速发展,在技术进步的推动下,向更智能的网络世界发展,对可靠和可靠的信息提取和处理的需求日益增加。稳健的统计方法解释了这样一个事实:数据的假设模型只得到了近似的实现,而不是精确的实现。与经典的参数化方法相比,稳健方法不受数据的微小变化(如离群值或微小模型偏离)的显著影响。在实际应用中,许多工程应用涉及非高斯的测量,这些测量可能包含损坏的测量和异常值,从而导致数据分布是重尾的。这导致基于高斯模型的传统信号处理技术的性能崩溃。
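A minimal sketch of this breakdown, using made-up measurements: the sample mean (optimal under a Gaussian model) is pulled far away by two outliers, while the median and a Huber M-estimate of location barely move. The Huber estimator, the tuning constant 1.345, and the MAD-based scale are standard choices from the robust statistics literature, picked here for illustration rather than taken from the tutorial.

```python
import statistics

def huber_location(samples, c=1.345, iters=50):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    mu = statistics.median(samples)
    # Robust scale: median absolute deviation, rescaled for Gaussian data.
    mad = statistics.median([abs(x - mu) for x in samples])
    scale = 1.4826 * mad
    if scale == 0.0:
        return mu
    for _ in range(iters):
        # Residuals within c * scale keep full weight; larger ones are
        # down-weighted in proportion to their distance from mu.
        weights = [1.0 if abs(x - mu) <= c * scale else c * scale / abs(x - mu)
                   for x in samples]
        mu = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return mu

clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
corrupted = clean + [55.0, 60.0]      # two gross outliers

mean_est = statistics.mean(corrupted)
median_est = statistics.median(corrupted)
huber_est = huber_location(corrupted)
print("mean  :", mean_est)
print("median:", median_est)
print("huber :", huber_est)
```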
The focus of this tutorial is on recent advances in the related areas of robust detection and robust cluster analysis for unsupervised learning. This tutorial is organized into two parts. In the first part, we discuss robust detection for a given number of hypotheses. In the second part, we move to robust cluster analysis with a focus on recent advances in robust cluster enumeration.
本教程的重点是无监督学习的鲁棒检测和鲁棒聚类分析相关领域的最新进展。本教程分为两部分。在第一部分中,我们讨论了给定假设数下的鲁棒检测。在第二部分中,我们将讨论稳健聚类分析,重点是稳健聚类枚举的最新进展。
Alexa Conversations is a new deep learning-based approach for creating more natural voice experiences on Alexa. As an AI-driven dialog manager, Alexa Conversations provides implicit support for context carry-over, slot over- and under-filling, and user-driven corrections with less training data and back-end code than traditional skill development techniques. In this session you will learn about the science behind Alexa Conversations in a 30-minute technical deep dive, followed by a 60-minute hands-on workshop. In the workshop you will participate in a guided walk-through to create an Alexa skill, learn how to train the conversational AI to handle increasingly complex user interactions, and enhance Alexa’s spoken and visual responses to add personality. At the end of the session you will be able to speak to your skill on your Alexa-enabled devices, and have the knowledge to build on what you have learned for new and innovative use cases.
Alexa对话是一种新的基于深度学习的方法,可以在Alexa上创建更自然的语音体验。作为一个人工智能驱动的对话管理器,Alexa Conversations提供了对上下文转换、时隙转换和不足填充以及用户驱动的更正的隐式支持,与传统的技能开发技术相比,它的训练数据和后端代码更少。在本课程中,你将在30分钟的技术深度潜水中学习Alexa对话背后的科学知识,然后是60分钟的实践研讨会。在研讨会中,你将参与一个引导性的走查,以创造Alexa技能,学习如何训练对话型人工智能来处理日益复杂的用户交互,并增强Alexa的口语和视觉反应以增加个性。在课程结束时,您将能够在支持Alexa的设备上与您的技能对话,并拥有构建新的和创新的用例所学知识的知识。
Alexa users in the US can try an experience created with Alexa Conversations by saying “Alexa, open Night Out”.
美国的Alexa用户可以通过说“Alexa,open Night Out”来尝试用Alexa对话创建的体验。
Presenter Bio: Maryam Fazel-Zarandi is currently a Senior Applied Scientist at Amazon working on conversational AI for Alexa. Before joining Amazon in 2017, she was a Senior Research Scientist at Nuance Communications. Maryam received her Ph.D. in Computer Science from University of Toronto in 2013. Her research interests are in the areas of machine learning and knowledge representation and reasoning with a focus on natural language understanding and dialogue management.
主持人简介:Maryam Fazel Zarandi目前是亚马逊的高级应用科学家,为Alexa开发对话人工智能。2017年加入亚马逊之前,她是Nuance Communications的资深研究科学家。2013年,Maryam在多伦多大学获得计算机科学博士学位。她的研究兴趣是机器学习、知识表示和推理,重点是自然语言理解和对话管理。
Presenter Bio: Adam Hasham is a Senior Product Manager for Alexa Skills Kit (ASK) at Amazon. Before that, Adam founded a full-stack on-demand delivery technology platform called Hurrier in 2013, and sold it within 2 years to a Delivery Hero subsidiary (foodora) for its logistics technology and market footprint – generating a 2x+ return for investors (Delivery Hero subsequently IPO’d in 2017 on FSE). He holds a degree in Computer Engineering from University of Waterloo (2004), MBA from Queen’s University (2007) and CFA (2010) and has a broad range of experience from working in technology as a microchip designer, working in finance in Singapore as a Corporate Banker, heading product at foodora and wearing many hats as an entrepreneur.
演示者简介:亚当·哈沙姆是亚马逊Alexa Skills Kit(ASK)的高级产品经理。在此之前,Adam于2013年创建了一个名为Hurrier的全栈按需交付技术平台,并在2年内将其出售给delivery Hero子公司(foodora),以获取物流技术和市场足迹——为投资者带来2倍以上的回报(delivery Hero随后于2017年在FSE上市)。他拥有滑铁卢大学(2004年)计算机工程学位、皇后大学(2007年)工商管理硕士学位和CFA(2010年)学位,拥有广泛的经验,包括作为微芯片设计师从事技术工作、作为公司银行家在新加坡从事金融工作、在foodora领导产品以及作为企业家戴着许多帽子。
Presenter Bio: Josey Sandoval is a Senior Product Manager for Alexa AI at Amazon, based in Seattle. He has spent the last three years with Alexa focused on dialog management capabilities and enabling more natural conversational experiences. Before joining Amazon in 2016, Josey led product and engineering teams for a B2B human resources technology startup. He holds a Master of Science in Electrical Engineering from the University of Washington with a focus on signal processing and signal path.
演示者简介:Josey Sandoval是位于西雅图的Amazon的Alexa AI的高级产品经理。在过去的三年里,他一直与Alexa一起致力于对话管理能力和更自然的对话体验。在2016年加入亚马逊之前,Josey领导一家B2B人力资源技术初创公司的产品和工程团队,并拥有华盛顿大学电气工程科学硕士学位,重点是信号处理和信号路径。
AI x Audio: From Research to Production
AI x音频:从研究到生产
AI has changed the way we process and create audio, especially music. This opens new possibilities and enables new products that could not be envisioned some years ago. In this industry session, we want to give an overview of Sony’s activities in this field.
人工智能改变了我们处理和创造音频的方式,特别是音乐。这开辟了新的可能性,并使一些年前无法预见的新产品成为可能。在本次行业会议上,我们想概述一下索尼在这一领域的活动。
We start this session with an introduction to music source separation. Sony has been active in AI-based source separation since 2013 and our systems have repeatedly won international evaluation campaigns. In recent years, we have successfully integrated this technology into a number of products, which we will introduce as well.
我们从介绍音乐源分离开始这节课。索尼自2013年以来一直积极开展基于人工智能的源代码分离,我们的系统多次赢得国际评估活动。在过去的几年里,我们可以将这项技术成功地集成到许多产品中,我们也将引进这些产品。
Recently, INRIA released, in collaboration with Sony, open-unmix, an open-source implementation of our music source separation. open-unmix is available for NNabla as well as PyTorch.
最近,INRIA与索尼合作发布了open-unmix,这是我们音乐源分离的开源实现。open-unmix可用于NNabla和PyTorch。
Finally, in this first part, we will briefly introduce the NNabla open-source project. NNabla is Sony’s Deep Learning Library, which we are actively developing worldwide. We will give a brief overview of its main features and compare it to other popular DL frameworks. We will highlight its focus on network compression and speed, making it a good choice for audio and music product development and prototyping.
最后,在第一部分中,我们将简要介绍NNabla开源项目。NNabla是索尼的深度学习图书馆,我们正在全球范围内积极开发。我们将简要概述它的主要特性,并将其与其他流行的DL框架进行比较。我们将重点关注网络压缩和速度,使之成为音频和音乐产品开发和原型制作的良好选择。
In the second part of the session, we will present our activities on music creation where we envision technologies that could drive music for the years to come. Through deep learning-based approaches, we develop tools that enhance a composer’s creativity and augment his capabilities. In our talk, we briefly present our research activities, including details about the underlying machine learning models. For these tools to be relevant, we rely on close collaboration with artists from Sony Music Entertainment, which can sometimes be tricky. Indeed, we are often experiencing a gap that exists between scientific research and the music industry on many levels, such as timeliness or profitability. Hence, the presentation will also address our efforts to bridge that gap.
在会议的第二部分,我们将介绍我们在音乐创作方面的活动,我们设想在未来几年里可以推动音乐发展的技术。通过基于深度学习的方法,我们开发了增强作曲家创造力和增强其能力的工具。在我们的演讲中,我们简要介绍了我们的研究活动,包括关于底层机器学习模型的细节。为了使这些工具具有相关性,我们依赖于与索尼音乐娱乐公司(Sony Music Entertainment)的艺术家们的密切合作,这有时是很棘手的。事实上,我们经常遇到科学研究和音乐产业在许多层面上的差距,例如及时性或盈利能力。因此,本报告还将讨论我们为弥合这一差距所作的努力。
Presenter bio: Mototsugu Abe is a Senior General Manager and Chief Distinguished Researcher at the R&D Center of Sony Corporation. As a researcher, he specializes in audio signal processing, intelligent sensing and pattern recognition. As a manager, he supervises fundamental technology R&D in the information technology field, including video, image, audio, speech, natural language, communication, RF, robotics, sensing and machine learning technologies. He received a Ph.D. in engineering from the University of Tokyo in 1999 and has been with Sony Corporation since then. From 2003 to 2004, he was a visiting scholar at Stanford University, where he worked with Prof. Julius O. Smith III.
主持人简介:Mototsugu Abe是索尼公司研发中心高级总经理兼首席杰出研究员。作为一名研究人员,他专门研究音频信号处理、智能传感和模式识别。作为经理,他监督信息技术领域的基础技术研发,包括视频、图像、音频、语音、自然语言、通信、射频、机器人、传感和机器学习技术。他于1999年在东京大学获得工程学博士学位,此后一直在索尼公司工作。从2003年到2004年,他是斯坦福大学的访问学者,与朱利叶斯密斯三世教授合作。
Presenter bio: Marc Ferras received the B.S. degree in computer science, the M.S. degree in telecommunications, and the European Master in Language and Speech from the Universitat Politecnica de Catalunya (UPC), Spain, in 1999 and 2005, respectively. He received his Ph.D. degree from Université Paris-Sud XI, France, in 2009, researching the use of automatic speech recognition in speaker recognition tasks. Since then, he has held two post-doc positions, one at Tokyo Institute of Technology, Japan (2009-2011) and one at the Idiap Research Institute, Switzerland (2011-2016), both focused on automatic speech and speaker recognition. He is currently working at SONY’s Stuttgart Technology Center as a Senior Engineer working on speech recognition technology.
主持人简介:Marc Ferras分别于1999年和2005年在西班牙加泰罗尼亚理工大学(UPC)获得计算机科学学士学位、电信硕士学位和欧洲语言和演讲硕士学位。他获得了博士学位。2009年毕业于法国巴黎第十一大学,研究自动语音识别在说话人识别任务中的应用。此后,他先后担任过两个博士后职位,一个在日本东京理工学院(2009-2011)工作,一个在瑞士Idiap研究院(2011-2016)工作,均专注于自动语音和说话人识别。他目前在索尼斯图加特技术中心工作,是一名从事语音识别技术的高级工程师。
Presenter bio: Stefan Lattner is a research associate at Sony CSL Paris, where he works on transformation and invariance learning with artificial neural networks. Using this paradigm, he targets rhythm generation (i.e., DrumNet) and is also involved in music information retrieval, audio generation, and recommendation. He obtained his doctorate in the area of music structure modeling from the Johannes Kepler University in Linz, Austria.
主持人简介:Stefan Lattner是Sony CSL Paris的研究助理,他致力于用人工神经网络进行变换和不变性学习。使用这种范式,他以节奏生成(即鼓声)为目标,还参与音乐信息检索、音频生成和推荐。他在奥地利林茨的约翰内斯开普勒大学获得了音乐结构建模领域的博士学位。
Presenter bio: Cyran Aouameur is an assistant researcher at Sony CSL. A graduate of the IRCAM-organized ATIAM Master’s program, he joined CSL two years ago. Passionate about urban music since he was a child, he has been focusing on developing AI-based solutions for artists to quickly design unique drum sounds and rhythms, which he considers to be elements of the utmost importance. He is now partly responsible for the communication with the artists, seeking to get the research and the music industry worlds to understand each other.
主持人简介:Cyran Aouameur是索尼CSL的助理研究员。毕业于Ircam组织的ATIAM硕士学位,两年前进入CSL。他从小就热衷于城市音乐,一直致力于开发基于人工智能的解决方案,让艺术家快速设计独特的鼓声和节奏,他认为这些是最重要的元素。他现在部分负责与艺术家的交流,寻求让研究界和音乐界相互了解。
Monday, May 4, 09:30 – 13:00
5月4日,星期一,09:30–13:00
Navigating social media as a scientist
作为一名科学家驾驭社交媒体
Facebook, Twitter and LinkedIn are, by now, no longer new. For scientists too, social media platforms have become an integral networking tool to connect globally, exchange research ideas and advance careers. But what is a proper way for scientists to make use of these platforms?
Facebook、Twitter或LinkedIn现在已经不是什么新鲜事了。同样对科学家来说,社交媒体平台已经成为连接全球、交流研究想法和促进职业发展的不可或缺的网络工具。但是,什么是科学家利用这些平台的正确方法呢?
In this part of the workshop, you will gain a better understanding of the current state of digital science communication. In detail, you will learn how scientists may integrate social media into their activities in a helpful and productive way. The workshop advocates a reflective use of media that keeps a close eye on how and when it is recommended for you to “go digital”. This workshop provides…
在这一部分中,您将对数字科学传播的现状有一个更好的了解。具体来说,你将学习到科学家如何将社交媒体融入到他们的活动中——以一种有帮助和有成效的方式。研讨会提倡一种反映媒体使用情况的方法,密切关注如何以及何时建议您“数字化”。这个研讨会提供…
Professional assistance in clarifying your objectives for engaging with social media. Why should I consider social media usage? What are my goals?
提供专业帮助,帮助您明确参与社交媒体的目标。我为什么要考虑使用社交媒体?我的目标是什么?
Help figuring out which of the many media platforms is the right one for you.
帮助找出哪些媒体平台适合您。
Assistance on how social media may help you explore your career options (e.g. after a PhD or postdoc).
关于社交媒体如何帮助你探索职业选择的帮助(例如博士或博士后)。
Help taking first steps towards brushing up your personal professional online profiles.
帮助采取第一步刷你的个人专业在线档案。
Speaker: Peter Kronenberg from NaturalScience.Careers
演讲者:来自NaturalScience.Careers的彼得·克伦伯格
https://naturalscience.careers/
by
Emil Björnson and Jiayi Zhang
埃米尔·比约恩森和张嘉怡
Linköping University, Beijing Jiaotong University
林科平大学、北京交通大学
Signal processing is at the core of the 5G communication technology. The use of large arrays with 64 or more antennas is becoming mainstream and the commercial deployment started in 2019. This technology is known as Massive MIMO (multiple-input multiple-output) and was viewed as science fiction just ten years ago, but with the combination of advanced signal processing and innovative protocols, it is now a reality. Just as the seminal papers on Massive MIMO were published ten years ago, this is likely the time when the new technology components for 6G will be identified. In this tutorial, we will consider two such promising research directions, which might be utilized in the conventional cellular spectrum as well as in mmWave or sub-THz bands.
信号处理是5G通信技术的核心。使用64个或更多天线的大型阵列正在成为主流,商业部署从2019年开始。这项技术被称为大规模MIMO(multiple input multiple output,多输入多输出),十年前还被视为科幻小说,但随着先进信号处理和创新协议的结合,现在已经成为现实。正如10年前发表的关于大规模MIMO的开创性论文一样,现在很可能是确定6G的新技术组件的时候。在本教程中,我们将考虑两个这样有前途的研究方向,它们可以用于传统的蜂窝频谱以及毫米波或亚太赫兹波段。
The first new direction is Cell-free Massive MIMO, which refers to a large-scale distributed antenna system that is made practical by innovative signal processing and radio resource allocation algorithms. Different from cellular communications, each user is served by all or a user-unique subset of the antennas. The system is designed to achieve high spectral and energy efficiency, but under the unusual constraint of being scalable from a computational and cost perspective to enable large network deployments. The main goal is to achieve uniformly good and reliable service via excessive macro-diversity, as compared to the micro-diversity achieved by conventional Massive MIMO with large arrays. We will cover the basic theory as well as the recent algorithmic and implementation developments.
第一个新的方向是无小区大规模MIMO,它是指通过创新的信号处理和无线资源分配算法使大规模分布式天线系统实用化。与蜂窝通信不同,每个用户由天线的全部或用户唯一子集服务。该系统旨在实现高频谱和能源效率,但在不寻常的限制下,可从计算和成本角度扩展,以实现大型网络部署。其主要目标是通过过多的宏分集来实现一致的良好和可靠的服务,与传统的大阵列大规模MIMO相比。我们将介绍基本理论以及最近的算法和实现发展。
The second new direction is intelligent reflecting surfaces, which are also known as software-controlled meta-surfaces and reconfigurable intelligent surfaces. These are semi-passive surfaces consisting of an array of meta-atoms with reconfigurable properties that can be controlled to reflect an incoming wave in a controllable way. While only the transmitter and receiver can be optimized in conventional wireless communication systems, the addition of intelligent reflecting surfaces also enables optimization of the channels (i.e., the creation of smart radio environments). We will derive the propagation model from physics and tackle difficult issues such as channel estimation and real-time operation.
第二个新方向是智能反射面,也称为软件控制的元曲面和可重构的智能曲面。这些是半被动表面,由具有可重构特性的元原子阵列组成,这些元原子阵列可以控制以可控方式反射入射波。虽然在传统无线通信系统中只能优化发射机和接收机,但增加智能反射面也可以优化信道(即创建智能无线电环境)。我们将从物理学中导出传播模型,并解决诸如信道估计和实时操作等难题。
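The idea of optimizing the channel itself can be sketched with a deliberately simplified model. The code below assumes a narrowband single-antenna link in which the received signal is the direct path plus one reflected path per surface element, each with a freely adjustable phase shift; the random channel coefficients and all names are illustrative assumptions, not the physical propagation model derived in the tutorial.

```python
import cmath
import math
import random

random.seed(0)

def random_channel():
    """Draw a unit-variance complex Gaussian coefficient (Rayleigh fading)."""
    std = math.sqrt(0.5)
    return complex(random.gauss(0, std), random.gauss(0, std))

n_elements = 64
h_direct = random_channel()                             # transmitter -> receiver
h_in = [random_channel() for _ in range(n_elements)]    # transmitter -> surface
h_out = [random_channel() for _ in range(n_elements)]   # surface -> receiver

def received_gain(phases):
    """Magnitude of the end-to-end channel for given element phase shifts."""
    reflected = sum(a * cmath.exp(1j * p) * b
                    for a, p, b in zip(h_in, phases, h_out))
    return abs(h_direct + reflected)

# Co-phase every reflected path with the direct path.
aligned = [cmath.phase(h_direct) - cmath.phase(a * b)
           for a, b in zip(h_in, h_out)]
unconfigured = [0.0] * n_elements

gain_aligned = received_gain(aligned)
gain_unconfigured = received_gain(unconfigured)
upper_bound = abs(h_direct) + sum(abs(a * b) for a, b in zip(h_in, h_out))
print(gain_unconfigured, gain_aligned, upper_bound)
```

With aligned phases all paths add coherently, so the gain reaches the triangle-inequality upper bound. Computing `aligned` requires knowing every cascaded coefficient, which hints at why channel estimation is one of the difficult issues.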
Adversarial Robustness of Deep Learning Models: Attack, Defense and Verification
深度学习模型的对抗鲁棒性:攻击,防御和验证
by
Pin-Yu Chen 陈品玉
IBM Research, Yorktown Heights
IBM Research,约克敦高地
Despite achieving high standard accuracy in a variety of machine learning tasks, deep learning models built upon neural networks have recently been found to lack adversarial robustness. The decision making of well-trained deep learning models can be easily falsified and manipulated, resulting in ever-increasing concerns in safety-critical and security-sensitive applications requiring certified robustness and guaranteed reliability.
This tutorial will provide an overview of recent advances in the research of adversarial robustness, featuring both comprehensive research topics and technical depth. We will cover three fundamental pillars in adversarial robustness: attack, defense and verification. Attack refers to efficient generation of adversarial examples for robustness assessment under different attack assumptions (e.g., white-box or black-box attacks). Defense refers to adversary detection and robust training algorithms to enhance model robustness. Verification refers to attack-agnostic metrics and certification algorithms for proper evaluation of adversarial robustness and standardization. For each pillar, we will emphasize the tight connection between signal processing and the research in adversarial robustness, ranging from fundamental techniques such as first-order and zero-order optimization, minimax optimization, geometric analysis, model compression, data filtering and quantization, subspace analysis, active sampling, frequency component analysis to specific applications such as computer vision, automatic speech recognition, natural language processing and data regression.
This tutorial aims to serve as a short lecture for researchers and students to access the emerging field of adversarial robustness from the viewpoint of signal processing.
尽管在各种机器学习任务中都达到了很高的标准精度,但是最近发现基于神经网络的深度学习模型存在缺乏对抗性鲁棒性的问题。训练有素的深度学习模型的决策很容易被篡改和操纵,导致对安全性要求严格且对安全性要求很高的应用程序的担忧日益增加,这些应用程序需要经过认证的鲁棒性和保证的可靠性。
本教程将概述对抗性鲁棒性研究的最新进展,包括全面的研究主题和技术深度。我们将涵盖对抗性鲁棒性的三个基本支柱:攻击,防御和验证。攻击是指在不同攻击假设(例如白盒或黑盒攻击)下有效生成对抗性示例的鲁棒性评估。防御是指对手检测和鲁棒训练算法以增强模型的鲁棒性。验证是指与攻击无关的指标和认证算法,用于正确评估对抗性的鲁棒性和标准化。对于每个支柱,我们都将强调信号处理与对抗性鲁棒性研究之间的紧密联系,从一阶和零阶优化、极小极大优化、几何分析、模型压缩、数据滤波与量化、子空间分析、主动采样、频率分量分析等基本技术,到计算机视觉、自动语音识别、自然语言处理和数据回归等具体应用。
本教程旨在作为短期讲座,供研究人员和学生从信号处理的角度了解新兴的对抗鲁棒性领域。
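As a small illustration of the attack pillar, the sketch below applies the fast gradient sign method (FGSM), a well-known first-order white-box attack, to a linear classifier with logistic loss. The weights and the input are hypothetical values chosen for illustration, and a linear model stands in for a deep network only to keep the gradient available in closed form.

```python
def score(w, b, x):
    """Linear classifier score; the predicted label is its sign."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, x, y, eps):
    """One FGSM step for the logistic loss log(1 + exp(-y * score)).

    The loss gradient with respect to x is -y * sigmoid(-y * score) * w,
    so its sign is -y * sign(w) and every feature moves by exactly eps.
    """
    return [x_i - eps * y * sign(w_i) for x_i, w_i in zip(x, w)]

w, b = [2.0, -1.0, 0.5], 0.1      # hypothetical trained weights
x, y = [0.3, -0.2, 0.4], 1        # correctly classified input, label +1
x_adv = fgsm(w, x, y, eps=0.4)

print("clean score      :", score(w, b, x))
print("adversarial score:", score(w, b, x_adv))
```

The perturbation is bounded by `eps` in every coordinate, yet it shifts the score by `eps` times the sum of the absolute weights, which is enough here to flip the predicted label.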
by
Alejandro Ribeiro and Fernando Gama
University of Pennsylvania
由
亚历杭德罗·里贝罗和费尔南多·伽马
宾夕法尼亚大学
Neural networks have achieved resounding success in a variety of learning tasks. Although sometimes overlooked, success has not been uniform across all learning problems and it has not been achieved by generic architectures. The most remarkable accomplishments are in the processing of time signals and images and have been attained by Convolutional Neural Networks (CNNs). This is because convolutions successfully exploit the regular structure of Euclidean space and enable learning in high dimensional spaces.
In this tutorial we will develop the novel concept of Graph Neural Networks (GNNs), which intend to extend the success of CNNs to the processing of high dimensional signals in non-Euclidean domains. They do so by leveraging possibly irregular signal structures described by graphs. The following topics will be covered:
Graph Convolutions and GNN Architectures. The key concept enabling the definition of GNNs is the graph convolutional filter introduced in the graph signal processing (GSP) literature. GNN architectures compose graph filters with pointwise nonlinearities. Illustrative examples on authorship attribution and recommendation systems will be covered.
Fundamental Properties of GNNs. Graph filters and GNNs are suitable architectures to process signals on graphs because of their permutation equivariance. GNNs tend to work better than graph filters because they are Lipschitz stable to deformations of the graph that describes their structure. This is a property that regular graph filters cannot have.
Distributed Control of Multiagent Systems. An exciting application domain for GNNs is the distributed control of large scale multiagent systems. Applications to the control of robot swarms and wireless communication networks will be covered.
Attendees to this tutorial will be prepared to tackle research on the practice and theory of GNNs. Coding examples will be provided throughout.
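As a rough illustration of the first two topics above, the sketch below builds a polynomial graph filter, y = sum_k h[k] S^k x, follows it with a pointwise ReLU to form a single GNN layer, and checks permutation equivariance numerically. It is not the presenters' code: the function names, the shift operator S (here an adjacency matrix) and the filter taps h are all assumptions chosen for the example.

```python
def mat_vec(S, x):
    """Multiply a square matrix S (list of rows) by a vector x."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]


def graph_filter(S, x, h):
    """Polynomial graph filter: y = sum_k h[k] * S^k x."""
    y = [0.0] * len(x)
    z = list(x)  # z holds S^k x, starting with k = 0
    for k, hk in enumerate(h):
        if k > 0:
            z = mat_vec(S, z)
        y = [yi + hk * zi for yi, zi in zip(y, z)]
    return y


def gnn_layer(S, x, h):
    """One GNN layer: a graph filter followed by a pointwise ReLU."""
    return [max(0.0, v) for v in graph_filter(S, x, h)]


def relabel(S, x, perm):
    """Relabel the nodes: new node i is old node perm[i]."""
    n = len(x)
    S_p = [[S[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
    x_p = [x[perm[i]] for i in range(n)]
    return S_p, x_p


# Hypothetical example: a 4-node path graph and an arbitrary signal.
S = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, -2.0, 3.0, 0.5]
h = [0.5, 1.0, -0.25]  # assumed filter taps

y = gnn_layer(S, x, h)
perm = [2, 0, 3, 1]
S_p, x_p = relabel(S, x, perm)
y_p = gnn_layer(S_p, x_p, h)
```

Relabeling the nodes before the layer gives the same result as relabeling its output, which is what permutation equivariance means here.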
神经网络在各种学习任务中都取得了巨大的成功。尽管有时会被忽略,但在所有学习问题中,成功并不一致,并且通用架构也未实现成功。卷积神经网络(CNN)是在时间和图像信号处理方面取得的最杰出成就。这是因为卷积成功地利用了欧几里得空间的规则结构,并允许在高维空间进行学习。
在本教程中,我们将开发图形神经网络(GNN)的新颖概念,其意图是将CNN的成功扩展到非欧氏域中高维信号的处理。它们通过利用图形描述的可能不规则的信号结构来做到这一点。将涵盖以下主题:
图卷积和GNN架构。定义GNN的关键概念是在图信号处理(GSP)文献中引入的图卷积滤波器。GNN架构构成具有逐点非线性的图形滤波器。本文将介绍作者身份归属和推荐系统的示例。
GNN的基本属性。图形滤波器和GNN由于其置换等方差,因此是处理图形信号的合适体系结构。GNN倾向于比图形过滤器更好地工作,因为它们对于描述其结构的图形的变形是Lipschitz稳定的。这是常规图形过滤器无法拥有的属性。
多主体系统的分布式控制。GNN的一个激动人心的应用领域是大规模多主体系统的分布式控制。将介绍控制机器人群和无线通信网络的应用程序。
本教程的参与者将准备进行有关GNN的实践和理论的研究。全文将提供编码示例。
Biomedical Image Reconstruction: From Foundations to Deep Neural Networks
by
Michaël Unser and Pol del Aguila Pla
CIBM, EPFL
Biomedical imaging plays a key role in medicine and biology. Its range of applications and its impact in research and medical practice have increased steadily during the past 4 decades. Part of the astonishing improvements in image quality and resolution is due to the use of increasingly sophisticated signal-processing techniques. This, in itself, would justify the tutorial. Nonetheless, the field is now transitioning towards the deep-learning era, where disruptive improvements and lack of theoretical background go hand-in-hand. To harness the power of these new techniques without suffering from their pitfalls, a structured understanding of the field is fundamental.
We start the tutorial by presenting the building blocks of an image-reconstruction problem, from the underlying image that lives in function spaces to its observed discrete measurements. Most importantly, we detail the small collection of forward and sampling operators that allow one to model most biomedical imaging problems, including magnetic resonance imaging, bright-field microscopy, structured-illumination microscopy, x-ray computed tomography, and optical diffraction tomography. This leads up to our exposition of 1st-generation methods (e.g., filtered back-projection, Tikhonov regularization), the regimes in which they are most attractive, and how to implement them efficiently.
We then transition to 2nd-generation methods (non-quadratic regularization, sparsity, and compressive sensing) and show how advanced signal processing allows image reconstruction with shorter acquisition times, less invasive procedures, and lower radioactive and irradiation dosage. We expose the foundations of these methods (results in compressed-sensing recovery, representer theorems, infinitely divisible distributions) and the most useful algorithms in imaging (proximal operators, projected gradient descent, alternating direction method of multipliers), again exemplifying their efficient implementation.
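To make the role of proximal operators concrete, here is a minimal sketch of the iterative shrinkage-thresholding algorithm (ISTA) for the sparsity-regularized problem min_x 0.5*||Hx - y||^2 + lam*||x||_1. It is a generic textbook illustration, not the presenters' implementation; the function names, the tiny forward operator H and the step size are assumptions, and the step size must not exceed the reciprocal of the largest eigenvalue of H^T H for convergence.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(H, y, lam, step, iters):
    """ISTA for min_x 0.5 * ||H x - y||^2 + lam * ||x||_1.

    H is a list of rows; step is an assumed, user-supplied step size.
    """
    n_rows, n_cols = len(H), len(H[0])
    x = [0.0] * n_cols
    for _ in range(iters):
        # Residual r = H x - y.
        r = [sum(H[i][j] * x[j] for j in range(n_cols)) - y[i]
             for i in range(n_rows)]
        # Gradient of the quadratic data term: g = H^T r.
        g = [sum(H[i][j] * r[i] for i in range(n_rows))
             for j in range(n_cols)]
        # Gradient step followed by the proximal (shrinkage) step.
        x = [soft_threshold(x[j] - step * g[j], step * lam)
             for j in range(n_cols)]
    return x


# Hypothetical example: with an identity forward operator, ISTA reduces
# to soft-thresholding the measurements themselves.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
x_hat = ista(identity, [3.0, -0.5, 1.0], lam=1.0, step=1.0, iters=5)
```

Small measurements are set exactly to zero while large ones are shrunk, which is how the l1 penalty promotes sparse reconstructions.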
Finally, we present the state of the art in 3rd-generation methods (deep-learning reconstruction of images), categorizing them using the building-block terminology introduced throughout the tutorial. In this manner, we emphasize the links to 1st- and 2nd-generation methods in order to provide intuition and guidelines to devise and understand novel 3rd-generation methods. Furthermore, we state the benefits of each proposal and give cautionary examples of the dangers of overreliance on training data.
生物医学成像在医学和生物学中起着关键作用。在过去的40年中,其应用范围及其在研究和医学实践中的影响稳步增长。图像质量和分辨率的惊人提高的部分原因是由于使用了越来越复杂的信号处理技术。这本身就可以证明本教程的合理性。尽管如此,该领域现在正在过渡到深度学习时代,在那个时代,颠覆性的改进和缺乏理论背景齐头并进。为了利用这些新技术的力量而不遭受其陷阱,对这一领域的结构化理解至关重要。
我们通过介绍图像重建问题的构建块开始本教程,从生活在功能空间中的基础图像到观察到的离散测量值。最重要的是,我们详细介绍了少量的正向运算符和采样运算符,使他们可以对大多数生物医学成像问题进行建模,包括磁共振成像,明场显微镜,结构照明显微镜,X射线计算机断层扫描和光学衍射断层扫描。这导致我们对第一代方法(例如,滤波反投影,Tikhonov正则化)进行了阐述,介绍了它们最有吸引力的系统以及如何有效地实施它们。
然后,我们过渡到第二代方法(非二次正则化,稀疏性和压缩感测),并展示先进的信号处理如何以更少的采集时间,更少的侵入性程序以及更低的放射性和辐射剂量实现图像重建。我们介绍了这些方法的基础(压缩感知恢复的结果,表示定理,无限可分分布)和成像中最有用的算法(近距离算子,投影梯度下降,乘法器的交替方向方法),再次证明了它们的高效性实施。
最后,我们以第三代方法(图像的深度学习重建)展示了最新技术,并使用了整个教程中介绍的构建模块术语对它们进行分类。以这种方式,我们强调与第一代和第二代方法的联系,以提供设计和理解新颖的第三代方法的直觉和指导。此外,我们陈述了每个提案的好处,并给出了过度依赖培训数据的危险的警告示例。
Game Theoretic Learning and Its Applications to Spectrum Collaboration
by
Amir Leshem and Kobi Cohen
Bar Ilan University, Ben-Gurion University of the Negev
阿米尔Leshem和科恩Kobi
巴伊兰大学,内盖夫本-古里安大学
Recent years have shown significant advances in many signal processing tasks based on machine learning techniques. Deep learning as well as reinforcement learning techniques have shown tremendous value for classification, noise reduction and many other tasks. Recent advances in transferring the learning process to the edge of the network, in order to protect the privacy of users' data and to exploit the computational resources available at the mobile devices, have stimulated the development of techniques such as federated learning. In contrast, learning over networks of selfish agents is much less understood and holds the potential for the next leap in learning techniques. To allow distributed learning, in both the federated and distributed contexts, efficient communication techniques are required to save both energy and bandwidth. The tutorial will present recent results related to distributed learning under communication constraints. We will survey basic protocols which can be utilized to achieve efficient learning and then apply them to multiple examples of collaborative spectrum access as well as other resource sharing problems.
近年来,在基于机器学习技术的许多信号处理任务中显示出了显着的进步。深度学习以及强化学习技术已显示出对分类,降噪和许多其他任务的巨大价值。为了保护用户数据的私密性以及利用移动设备上可用的计算资源,将学习过程转移到网络边缘的最新进展刺激了诸如联合学习之类的技术的发展。相比之下,通过自私行为者网络进行学习的了解却很少,并且具有学习技术下一次飞跃的潜力。为了允许在联邦和分布式环境中进行分布式学习,需要有效的通信技术来节省能量和带宽。本教程将介绍与交流约束下的分布式学习相关的最新结果。我们将调查可用于实现有效学习的基本协议,然后将其应用于协作频谱访问以及其他资源共享问题的多个示例。
Information Extraction in Joint Millimeter-Wave State Sensing and Communications: Fundamentals to Applications
by
Kumar Vijay Mishra, Bhavani Shankar and Mari Kobayashi
US Army Research Laboratory; University of Luxembourg; TU Munich
库玛·维杰·米斯拉(Kumar Vijay Mishra),巴瓦尼·尚卡尔(Bhavani Shankar)和马里·小林(Mari Kobayashi)
美国陆军研究实验室;卢森堡大学
慕尼黑工业大学
Extreme crowding of the electromagnetic spectrum in recent years has led to the emergence of complex challenges in designing sensing and communications systems. The advent of novel technologies (such as drone-based customer services, autonomous driving, radio-frequency identification, and weather monitoring) implies that sensors like radars are now deployed in urban environments and operate in bands that were earlier reserved for communications services. Similarly, with the rapid surge in mobile network operators, there is a growing concern that the amount of mobile data traffic poses a formidable challenge toward realizing future wireless networks. Both radar and communications systems need wide bandwidth to provide a designated quality-of-service, thus resulting in competing interests in exploiting the spectrum. Hence, sharing spectral and hardware resources of communications and radar is imperative for efficient spectrum utilization.
Specifically, in the automotive sector, state sensing and communication are two major tasks enabling future high-mobility applications such as Vehicle-to-Everything (V2X), where a node must continuously track its dynamically changing environment and react accordingly by exchanging information with others. This field has, therefore, witnessed concerted and intense efforts towards realizing joint radar-communications (JRC) systems. Most modern automotive JRC systems are envisaged to operate at millimeter-wave (mm-Wave); this brings a new set of challenges and opportunities for system engineers when compared with centimeter-wave (cm-Wave) JRC. The mm-Wave band is characterized by severe penetration losses, short coherence times, and the availability of wide bandwidth. While wide bandwidth is useful in attaining high vehicular communications data rates and high-resolution automotive radar, the losses must be compensated by using a large number of antennas at the transmitter and receiver. There is, therefore, a recent surge in research on joint multiple-input multiple-output (MIMO)-Radar-MIMO-Communications (MRMC) systems, where the antenna positions of radar and communications are shared with each other. Both systems may share information with each other to benefit from an increased number of design degrees-of-freedom (DoFs).
This tutorial takes a focused view on mm-Wave JRC touching the entire spectrum of this field. After attending the tutorial, participants will be able to understand:
Current challenges and design criteria associated with mm-Wave JRC.
Information theoretic modeling and fundamental limits of joint sensing-communications.
Overview of communication and radar systems, including waveform design, data/target detection-estimation-tracking theoretic criteria, and receiver processing algorithms for mm-Wave JRC.
Hardware design aspects of example JRC designs.
Emerging research challenges and solutions in MRMC.
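The link between wide bandwidth and high-resolution radar mentioned in the abstract can be made concrete with the standard range-resolution relation, delta_R = c / (2B). The sketch below is only an illustration of that textbook formula; the function name and the example bandwidths are assumptions and are not taken from the tutorial.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_resolution_m(bandwidth_hz):
    """Radar range resolution delta_R = c / (2 * B), in meters."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return SPEED_OF_LIGHT_M_PER_S / (2.0 * bandwidth_hz)


# Hypothetical sweep bandwidths for comparison (values are assumptions).
narrow = range_resolution_m(150e6)  # 150 MHz
wide = range_resolution_m(4e9)      # 4 GHz
```

Under this relation a 4 GHz sweep resolves targets a few centimeters apart, while 150 MHz resolves them only at roughly one meter.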
近年来,电磁频谱的极端拥挤导致在设计,传感和通信系统中出现了复杂的挑战。基于无人机的客户服务,自动驾驶,射频识别和天气监控等新技术的问世,意味着雷达等传感器现在已部署在城市环境中,并在为通信服务保留的频段中运行。类似地,随着移动网络运营商的迅速增长,人们越来越担心移动数据业务量将对实现未来的无线网络提出巨大的挑战。雷达和通信系统都需要宽带宽来提供指定的服务质量,因此在利用频谱方面引起了竞争。因此,
具体而言,在汽车领域,状态检测和通信是实现未来的高移动性应用(如车到万物(V2X))的两个主要任务,其中节点必须不断跟踪其动态变化的环境并通过与他人交换信息来做出相应的反应。因此,在这一领域见证了为实现这些联合雷达通信(JRC)系统而进行的一致而艰苦的努力。大多数现代汽车JRC系统被设想为在毫米波(mm-Wave)下运行。与厘米波(JCW)相比,这给系统工程师带来了一系列新的挑战和机遇。该频带的特点是严重的穿透损耗,较短的相干时间和宽带宽的可用性。虽然宽带宽可用于获得高的车载通信数据速率和高分辨率的汽车雷达,但必须通过在发射器和接收器处使用大量天线来补偿损耗。因此,最近对联合多输入多输出(MIMO)-雷达-MIMO-通信(MRMC)系统的研究激增,其中雷达和通信的天线位置彼此共享。两个系统可能会彼此共享信息,以受益于更多的设计自由度(DoF)。
本教程重点介绍了毫米波JRC,涉及该领域的整个领域。参加本教程之后,参与者将能够理解:
与毫米波JRC相关的当前挑战和设计标准。
联合感测通信的信息理论模型和基本限制。
通信和雷达系统概述,包括波形设计和数据/目标检测-估计-跟踪理论标准,毫米波JRC的接收器处理算法。
示例JRC设计的硬件设计方面。
MRMC中新兴的研究挑战和解决方案
Huawei Workshop: Shaping the Vertical Industry in 2025
Monday, May 4, 14:30 – 18:00
Shaping the vertical industry in 2025
This is an era of digitalization in which 5G technologies are transforming industrial factories and relieving people of hard physical work. Emerging technologies include, but are not limited to, URLLC, V2X, high-accuracy positioning, intelligent sensing, massive IoT, and low-power IoT. It is predicted that 2025 will be a suitable time for wide commercialization in the vertical industries. However, challenges remain in reaching KPIs such as low latency, high reliability, high-accuracy positioning, and massive connections. This workshop will discuss promising topics in the vertical industries.
Presenter Bio: Dr. Peiying Zhu (Chair) is a Huawei Fellow. She is currently leading 5G wireless system research at Huawei. The focus of her research is advanced wireless access technologies, with more than 150 granted patents. She regularly gives talks and joins panel discussions on the 5G vision and enabling technologies. She served as guest editor for the IEEE Signal Processing Magazine special issue on the 5G revolution and co-chaired various 5G workshops. She is actively involved in IEEE 802 and 3GPP standards development. She is currently a WiFi Alliance Board member. Prior to joining Huawei in 2009, Peiying was a Nortel Fellow and Director of Advanced Wireless Access Technology in the Nortel Wireless Technology Lab. She led the team and pioneered research and prototyping on MIMO-OFDM and multi-hop relay. Many of the technologies developed by the team have been adopted into LTE standards and 4G products.
Presenter Bio: Dr. Thomas Haustein (Panelist) received the Dr.-Ing. (Ph.D.) degree in mobile communications from the University of Technology Berlin, Germany, in 2006. In 1997, he was with the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute (HHI), Berlin, where he was involved in wireless infrared systems and radio communications with multiple antennas and OFDM. He was also involved in real-time algorithms for baseband processing and advanced multiuser resource allocation. From 2006 to 2008, he was with Nokia Siemens Networks, where he conducted research on 4G. Since 2009, he has been the Head of the Wireless Communications Department, Fraunhofer HHI, and is currently involved in research on 5G and industrial wireless. He has led several national and European funded projects on the topics of Cognitive Radio and Millimeter Wave Technology. He has been active in several H2020 5GPPP projects. He has served as an Academic Advisor to NGMN since 2012 and has contributed to 3GPP standardization since 2015.
Presenter Bio: Prof. Joerg Widmer (panelist) is Research Professor and Research Director of IMDEA Networks in Madrid, Spain. Before, he held positions at DOCOMO Euro-Labs in Munich, Germany and EPFL, Switzerland. His research focuses on wireless networks, ranging from extremely high frequency millimeter-wave communication and MAC layer design to mobile network architectures. Joerg Widmer authored more than 150 conference and journal papers and three IETF RFCs, and holds 13 patents. He was awarded an ERC consolidator grant, the Friedrich Wilhelm Bessel Research Award of the Alexander von Humboldt Foundation, a Mercator Fellowship of the German Research Foundation, a Spanish Ramon y Cajal grant, as well as eight best paper awards. He is an IEEE Fellow and Distinguished Member of the ACM.
Presenter Bio: Prof. Petar Popovski (panelist) is a Professor in Connectivity at Aalborg University, Denmark. He received his Dipl.-Ing./ Mag.-Ing. in communication engineering from Sts. Cyril and Methodius University in Skopje, R. of Macedonia, and his Ph.D. from Aalborg University. He is a Fellow of IEEE and featured in the list of Highly Cited Researchers 2018, compiled by Web of Science. He received an ERC Consolidator Grant (2015), the Danish Elite Researcher award (2016), IEEE Fred W. Ellersick prize (2016) and IEEE Stephen O. Rice prize (2018). He is currently an Area Editor for IEEE Transactions on Wireless Communications and a Steering Board member of IEEE SmartGridComm. He served as a General Chair for IEEE SmartGridComm 2018 and General Chair for IEEE Communication Theory Workshop 2019. He co-founded RESEIWE A/S, a company delivering ultra-reliable wireless solutions. His research interests are in the area of communication theory, with focus on wireless communication and networks.
Presenter Bio: Eichinger Josef (Panelist) joined Huawei Technologies in 2013 to strengthen the 5G research team in Munich. He started his professional career as a technical expert in the field of industrial energy and electronic systems. After his studies, he joined Siemens AG in 1994, where he worked on the development of high-frequency radar systems and optical networks, and as a researcher on radio technologies such as HSPA and LTE. He moved to Nokia Siemens Networks in 2007 as LTE Product Manager and was head of the LTE-Advanced FastTrack Programs from 2010 to the end of 2012 to push LTE-A forward. He currently leads research on 5G-enabled industrial communication at the Huawei Munich Research Center. The focus is on 5G for Industry 4.0 and vehicle-to-vehicle communication. Complementary to the research and standardization work, he is also responsible for proving new concepts through trials and live experiments, e.g., Robot Control in the Cloud, Robot as a Service, and Tele-operated Driving. Since April 2018 he has also been a member of the 5G-ACIA steering board, leading the Huawei delegation.
Presenter Bio: Dr. Mehdi Bennis is an Associate Professor at the Centre for Wireless Communications, University of Oulu, Finland, an Academy of Finland Research Fellow and head of the intelligent connectivity and networks/systems group (ICON). His main research interests are in radio resource management, heterogeneous networks, game theory and machine learning in 5G networks and beyond. He has co-authored one book and published more than 200 research papers in international conferences, journals and book chapters. He has been the recipient of several prestigious awards, including the 2015 Fred W. Ellersick Prize from the IEEE Communications Society, the 2016 Best Tutorial Prize from the IEEE Communications Society, the 2017 EURASIP Best Paper Award for the Journal of Wireless Communications and Networks, and the all-University of Oulu award for research. In 2019, Dr. Bennis received the IEEE ComSoc Radio Communications Committee Early Achievement Award.
5月4日,星期一,14:30 – 18:00
塑造2025年的垂直产业
这是一个数字化和采用5G技术的工业工厂转型时代,可以使人们摆脱辛苦的工作。新兴的新技术包括但不限于URLLC,V2X,高精度定位,智能感应,大规模物联网和低功耗物联网。可以预见的是,2025年是使垂直行业广泛商业化的合适时机。但是,在达到KPI方面仍然存在挑战,例如低延迟,高可靠性,高精度定位和大规模连接。该研讨会将讨论垂直行业中有希望的问题。
主持人简历: 朱培英博士(主席)是华为研究员。她目前在华为领导5G无线系统研究。她的研究重点是拥有150多项授权专利的高级无线访问技术。她经常就5G愿景和支持技术进行演讲和小组讨论。她曾担任IEEE信号处理杂志关于5G革命的特刊的客座编辑,并共同主持了各种5G研讨会。她积极参与IEEE 802和3GPP标准的开发。她目前是WiFi联盟董事会成员。在2009年加入华为之前,裴英颖曾担任北电无线研究员和北电无线技术实验室高级无线接入技术总监。她领导团队,并开创了MIMO-OFDM和多跳中继的研究和原型设计。
主持人简历:Thomas Haustein博士(主持人)获得了Ing博士。2006年获得德国柏林工业大学移动通信博士学位(博士学位)。1997年,他在柏林海因里希·赫兹研究所(HHI)的弗劳恩霍夫电信研究所工作,在那里从事无线红外技术的研究。系统和带有多个天线和OFDM的无线电通信。他还参与了用于基带处理和高级多用户资源分配的实时算法。从2006年到2008年,他在诺基亚西门子通信公司任职,从事4G研究。自2009年以来,他一直担任Fraunhofer HHI无线通信部负责人,目前从事5G和工业无线领域的研究。他领导了多个国家和欧洲资助的项目,涉及认知无线电和毫米波技术。他曾经并且活跃于多个H2020 5GPPP项目。自2012年以来,他一直担任NGMN的学术顾问,并从2015年开始为3GPP标准化做出贡献。
主持人简介: Joerg Widmer教授(专家组成员)是西班牙马德里IMDEA Networks的研究教授兼研究总监。在此之前,他曾在德国慕尼黑的DOCOMO欧洲实验室和瑞士的EPFL任职。他的研究专注于无线网络,从极高频率的毫米波通信和MAC层设计到移动网络架构。约尔格·威德默(Joerg Widmer)撰写了150余篇会议和期刊论文以及3份IETF RFC,并拥有13项专利。他获得了ERC合并者资助,亚历山大·冯·洪堡基金会的弗里德里希·威廉·贝塞尔研究奖,德国研究基金会的墨卡托奖学金,西班牙的拉蒙·卡哈尔奖以及八个最佳论文奖。他是ACM的IEEE资深会员。
主持人简历:Petar Popovski教授(专家)是丹麦奥尔堡大学的连通性教授。他获得了他的Dipl.-Ing./ Mag.-Ing。Sts的通信工程专业。马其顿河斯科普里的西里尔和麦迪乌斯大学,及其博士学位。来自奥尔堡大学。他是IEEE的院士,并入选了Web of Science撰写的《 2018年高被引学者》。他获得了ERC合并者奖(2015),丹麦优秀研究人员奖(2016),IEEE弗雷德·W·埃勒西克奖(2016)和IEEE斯蒂芬·赖斯奖(2018)。他目前是IEEE无线通信事务的区域编辑,还是IEEE SmartGridComm的指导委员会成员。他曾担任2018年IEEE SmartGridComm大会主席和2019年IEEE通讯理论研讨会主席。他是RESEIWE A / S的共同创始人,一家提供超可靠无线解决方案的公司。他的研究兴趣是通信理论领域,重点是无线通信和网络。
主持人简历:艾辛格·约瑟夫(Eichinger Josef)(潘捷)于2013年加入华为,以加强慕尼黑的5G研究团队。他以工业能源和电子系统领域的技术专家的身份开始了自己的职业航母。研究结束后,他于1994年加入西门子股份公司,从事高频雷达系统,光网络的开发以及HSPA和LTE等无线电技术的研究。他改为诺基亚西门子通信2007年担任LTE产品经理,并于2010年至2012年底担任LTE-Advanced FastTrack计划负责人,以推动LTE-A的发展。目前,他在华为慕尼黑研究中心领导着基于5G的工业通信研究。重点是用于工业4.0和车对车通信的5G。
主持人简历:Mehdi Bennis博士是芬兰奥卢大学无线通信中心的副教授,芬兰科学院研究员,也是智能连接和网络/系统小组(ICON)的负责人。他的主要研究兴趣是无线电资源管理,异构网络,博弈论和5G网络及以后的机器学习。他与他人合着了一本书,并在国际会议,期刊和书籍章节中发表了200多篇研究论文。他曾获得多个著名的奖项,包括2015年IEEE通信协会的Fred W. Ellersick奖,IEEE通信协会的2016年最佳教程奖,2017年无线通信和网络杂志的EURASIP最佳论文奖,奥卢大学全研究奖
MathWorks Workshop: Signal Processing Meets Deep Learning in MATLAB – From Getting Started to Developing Real-World Applications
Monday, May 4, 14:30 – 18:00
Signal Processing Meets Deep Learning in MATLAB – From Getting Started to Developing Real-World Applications
The adoption of deep learning across a wide range of signal processing applications has been attracting an increasing level of attention over the last few years. Deep learning for real-world signal processing systems has accentuated the need for application-specific tools and expertise for creating, labelling, augmenting and processing the vast amounts of signal data required to train and evaluate the learning models.
Using MATLAB code and new features, we will start from the basics of designing and training a network. We will then move on to more advanced topics, including data annotation, advanced feature extraction, training acceleration on GPUs and GPU clouds, and real-time implementation of deep networks on embedded devices. While focusing on a practical speech-based example, we will also discuss applications to other types of signals, such as those in communications, radar, and medical devices.
Presenter Bio: Jihad Ibrahim is a principal software developer and product lead of Audio Toolbox at MathWorks. He joined MathWorks in 2006 and has contributed to the development of Signal Toolbox, DSP System Toolbox, Communications Toolbox, and Audio Toolbox. He received his PhD in Electrical Engineering from Virginia Tech.
Presenter Bio: Gabriele Bunkheila is a senior product manager at MathWorks, where he coordinates the strategy of MATLAB toolboxes for audio and DSP. After joining MathWorks in 2008, he worked as a signal processing application engineer for several years, supporting MATLAB and Simulink users across industries from algorithm design to real-time implementations. Before MathWorks, he held a number of research and development positions, and he was a lecturer of sound theory and technologies at the national film school of Rome. He has a master’s degree in physics and a Ph.D. in communications engineering.
5月4日,星期一,14:30 – 18:00
信号处理与MATLAB中的深度学习相结合-从入门到开发实际应用
在过去的几年中,深度学习在各种信号处理应用中的采用已引起越来越多的关注。现实世界中信号处理系统的深度学习突显了对特定于应用的工具和专业知识的需求,这些工具和专业知识用于创建,标记,扩充和处理训练和评估学习模型所需的大量信号数据。
使用MATLAB代码和新功能,我们将从设计和训练网络的基础开始。然后,我们将进入更高级的主题,包括数据注释,高级功能提取,GPU和GPU云上的训练加速以及嵌入式设备上的深度网络的实时实现。在重点讨论基于语音的实际示例时,我们还将讨论对其他类型信号的应用,例如通信,雷达和医疗设备。
Presenter Bio:Jihad Ibrahim是MathWorks的首席软件开发人员和Audio Toolbox的产品负责人。他于2006年加入MathWorks,为Signal Toolbox,DSP System Toolbox,Communication Toolbox和Audio Toolbox的开发做出了贡献。他获得了弗吉尼亚理工大学电气工程博士学位。
Presenter Bio:Gabriele Bunkheila是MathWorks的高级产品经理,负责协调用于音频和DSP的MATLAB工具箱的策略。在2008年加入MathWorks之后,他担任信号处理应用工程师已有数年,为从算法设计到实时实现的各个行业的MATLAB和Simulink用户提供支持。在加入MathWorks之前,他担任过多个研发职位,并且在罗马国家电影学院担任声音理论和技术讲师。他拥有物理学硕士学位和博士学位。在通信工程中。
Tuesday, 5 May
10:00 - 11:00
Deep Representation Learning
Abstract: A crucial ingredient of deep learning is that of learning representations, more specifically with the objective of discovering higher-level representations which capture and disentangle explanatory factors. This is a very ambitious goal, and current state-of-the-art techniques still fall short, often capturing mostly superficial features of the data, which leaves them vulnerable to adversarial attacks and with insufficient out-of-distribution robustness. This talk will review these original objectives, supervised and unsupervised approaches, and outline research ideas towards better representation learning.
Yoshua Bengio
Yoshua Bengio is recognized as one of the world's artificial intelligence leaders and a pioneer of deep learning. A professor at the Université de Montréal since 1993, he received the 2018 A.M. Turing Award, often regarded as the Nobel Prize of computing, together with Geoff Hinton and Yann LeCun. Holder of the Canada Research Chair in Statistical Learning Algorithms, he is also the founder and scientific director of Mila, the Quebec Institute of Artificial Intelligence, which is the world's largest university-based research group in deep learning. In 2018, he collected the largest number of new citations in the world for a computer scientist. He earned the prestigious Killam Prize from the Canada Council for the Arts and the Marie-Victorin Quebec Prize. Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.
深度表示学习
摘要:深度学习的关键要素是学习表示形式,更具体地说,其目的是发现能够捕获和解开解释因素的高级表示形式。这是一个非常雄心勃勃的目标,当前的最新技术仍然不足,通常会捕获大部分数据的表面特征,使它们容易受到对抗性攻击和分布失稳性不足。最初的目标,有监督和无监督的方法,并概述了旨在更好地进行表征学习的研究思路。
尤舒亚·本吉奥(Yoshua Bengio)
尤舒亚·本吉奥(Yoshua Bengio)被公认为世界人工智能领导者之一和深度学习的先驱。自1993年担任蒙特利尔大学教授以来,他与杰夫·欣顿(Geoff Hinton)和扬·勒昆(Yann LeCun)一起获得了2018年图灵奖,被视为诺贝尔计算机奖。他是统计学习算法的加拿大研究主席,还是魁北克人工智能研究所Mila的创始人和科学总监,魁北克人工智能研究所是世界上最大的以大学为基础的深度学习研究小组。2018年,他为计算机科学家收集了世界上数量最多的新引用。他获得了加拿大艺术理事会颁发的著名的基拉姆奖和玛丽·维克托·魁北克奖。关注AI的社会影响,
ISS 1.1: Accelerating the Industrial Internet of Things: How?
Internet of Things, connectivity, and data analytics are the fundamental enablers for industrial digitalisation. Deploying and operating an IoT system at scale is no trivial task as its success will depend on a well-oiled ecosystem, favourable rules and regulations, a solid business model, and the presence of a pervasive network that can meet the desired quality of service. This talk reflects on how mobile network operators are working towards accelerating the adoption of IoT in vertical industries by providing an overview on key standardization and open source activities, new industry bodies, e.g., the 5G Automotive Association and the 5G Alliance for Connected Industries and Automation, coverage and performance trade-offs, and key issues related to spectrum usage.
物联网、连接性和数据分析是实现工业数字化的根本动力。大规模部署和操作物联网系统并非易事,因为它的成功将取决于良好的生态系统、有利的规章制度、可靠的商业模式以及能够满足所需服务质量的普及网络的存在。本次演讲反映了移动网络运营商如何通过概述主要标准化和开源活动、新的行业机构(如5G汽车协会和5G互联产业和自动化联盟)来加速在垂直行业中采用物联网,覆盖范围和性能权衡,以及与频谱使用相关的关键问题。
Speaker: Dr Ilaria Thibault, IoT Strategy Manager, Vodafone Business
发言人:Ilaria Thibault博士,沃达丰业务物联网战略经理
ISS 1.2: DataNet - Doing for Data What the Internet Did for Communications
This talk presents a new approach to national infrastructure called DataNet. Building on successful examples such as X-Road, DataNet provides a unique approach to data as a national infrastructure. By providing Identity as a Utility, Data Exchange and a Data Sovereign Wealth Fund, DataNet reduces the required investment in a national infrastructure for data, provides a unique economic model for the delivery of such services, and protects end-user and citizen privacy. Designed to be led by the national government and delivered by a combination of the private and public sectors in an open architecture format, DataNet provides a flexible, modular and adaptable approach to data in the emerging digital era.
本次演讲提出了一种新的国家基础设施方法,称为数据网——在诸如X-Road等成功案例的基础上,数据网提供了一种独特的方法,将数据作为国家基础设施。通过提供作为公用事业、数据交换和数据主权财富基金的身份,数据网减少了对国家数据基础设施的必要投资,为提供此类服务提供了独特的经济模式,并保护最终用户和公民隐私。数据网由国家政府领导,由私营和公共部门以开放式架构的形式联合提供,为新兴数字时代的数据提供了灵活、模块化和适应性强的方法。
Speaker: Dr Cathy Mulligan, CTO, GovTech Labs
演讲人:Cathy Mulligan博士,GovTech实验室首席技术官
ISS 1.3: Enabling Industrial IoT with 5G and Beyond
5G is seen as a key enabler for achieving flexible implementation of industrial IoT (IIoT), where rigid wired connections can be replaced with low-latency and high-reliability wireless communications. Such flexible implementation paves the way for scalable operations, where the production can be more easily altered and scaled up, e.g., by adding new machines to the production cells of a production line. Nevertheless, the current industrial communications are supported by a diverse set of communication protocols addressing the associated application requirements. Moreover, new use cases, such as distributed machine controllers providing the needed flexibility, should be supported by the 5G system (5GS) design. In this talk, we will outline the key requirements for IIoT and the associated challenges for 5G Release 17 and beyond. We will further elaborate on the architectural framework for supporting distributed machine controllers.
5G被视为实现工业物联网(IIoT)灵活实施的关键促成因素,在工业物联网中,刚性有线连接可以被低延迟和高可靠性的无线通信所取代。这种灵活的实现为可扩展操作铺平了道路,在这种操作中,生产可以更容易地改变和扩展,例如,通过向生产线的生产单元添加新机器。然而,目前的工业通信由一组满足相关应用需求的不同通信协议支持。此外,5G系统(5GS)设计应支持新的用例,例如提供所需灵活性的分布式机器控制器。在本次演讲中,我们将概述IIoT的关键要求以及5G 17版及更高版本的相关挑战。我们将进一步阐述支持分布式机器控制器的体系结构框架。
Speaker: Dr. Malte Schellmann, Principal Researcher, Huawei Tech. GRC
演讲人:华为技术集团首席研究员马尔特·谢尔曼博士
ISS 3.1: Intelligent Ear-Level Devices for Hearing Enhancement and Health and Wellness Monitoring
With the resurgence of artificial intelligence (AI) and machine learning, sensor miniaturization and increased wireless connectivity, ear-level hearing devices are going through a major revolution, transforming themselves from traditional hearing aids into modern hearing enhancement and health and wellness monitoring devices. For the aging user population of hearing aids, sound quality and speech understanding in challenging listening environments remain unsatisfactory. To improve quality of life and reduce health care costs, it is highly desirable for the devices to provide effective health and wellness monitoring capability on a continuous basis in everyday life. Finally, as the device functionality becomes more complex and dexterity is a major challenge for our user population, easy and intuitive user interactions with the devices are becoming increasingly important. In this talk, we will present examples of such transformation in the areas of hearing enhancement, health and wellness monitoring and user experience. In the process, we will highlight how AI and machine learning, miniaturized sensors and wireless connectivity are enabling and accelerating the transformation. In addition, we will discuss practical challenges for the transformation in the areas of power consumption, non-volatile and volatile storage, audio latency and wireless reliability. Finally, we will provide an outlook on future directions and opportunities for intelligent ear-level devices.
随着人工智能(AI)和机器学习的兴起,传感器的小型化和无线连接的增强,耳廓听力设备正经历一场重大的革命,从传统的助听器向现代的听力增强和健康监测设备转变。随着助听器使用人口的老龄化,在充满挑战的听力环境中,其音质和语音理解能力仍不尽如人意。为了提高生活质量和降低医疗保健成本,如果这些设备能够在日常生活中持续提供有效的健康和健康监测能力,是非常理想的。最后,随着设备功能变得越来越复杂,灵巧性对我们的用户群来说是一个重大挑战,用户与设备的简单直观的交互变得越来越重要。在本次讲座中,我们将介绍在听力增强、健康和健康监测以及用户体验等领域的此类转变。在这个过程中,我们将强调人工智能和机器学习、微型传感器和无线连接是如何实现和加速这一转变的。此外,我们还将讨论在功耗、非易失性和易失性存储、音频延迟和无线可靠性方面的实际挑战。最后,我们将展望智能耳级设备的未来发展方向和机遇。
Speaker: Dr. Tao Zhang, Ph.D., Director of Algorithms, Starkey Hearing Technologies, USA
演讲人:Tao Zhang博士,美国Starkey Hearing Technologies公司算法总监
ISS 3.2: Digitizing Urban Sound in a Smart Nation
In this digital era, sensing and processing are being integrated into IoT devices that can be easily and economically deployed in our urban environment. In this talk, the speaker will describe some of the digital sound and active noise mitigation technologies that have been developed in the speaker's lab, along with some pilot deployments in our urban environment. In order to achieve a holistic understanding of our urban environment, we must rely on intelligent sound sensing that can operate 24/7 and be deployed widely under different environmental conditions. These intelligent sound sensors also serve as digital ears to complement and activate the digital eyes of the CCTV cameras. Having comprehensive, large-scale aural data allows public agencies to better formulate complete and accurate sound mitigation policies. Sound pressure level (SPL) readings have been the de facto standard in quantifying our noise environment; however, SPL alone cannot accurately indicate how humans actually perceive noise, or whether they like or dislike a sound, even at the same SPL. The latest ISO standards (i.e., ISO 12913-1:2014, ISO 12913-2:2018) have been moving towards measurement that is based on how humans perceive sound. With the advent of powerful and low-cost embedded processors, analog-to-digital converters, and acoustic sensors, we are now seeing widespread usage of digital active noise control (ANC) technologies in consumer products, such as hearables, and in automobiles. In this talk, the speaker will also showcase the lab's latest work in extending active noise control applications to a larger region of control, such as in open windows and openings of noise sources. Digitization plays a key role in advancing the art of ANC, incorporating artificial intelligence to select the most annoying noise to cancel and providing ways to further mitigate noise by perceptual-based sound augmentation.
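For readers unfamiliar with the SPL readings mentioned above, the sketch below computes sound pressure level from pressure samples using the standard definition L_p = 20 log10(p_rms / p_ref), with the usual in-air reference pressure of 20 micropascals. The function names and the sample values are assumptions made for illustration; this is not code from the speaker's lab.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference pressure in air


def rms(samples):
    """Root-mean-square value of a sequence of pressure samples (Pa)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spl_db(p_rms_pa, p_ref_pa=REFERENCE_PRESSURE_PA):
    """Sound pressure level in dB: 20 * log10(p_rms / p_ref)."""
    if p_rms_pa <= 0:
        raise ValueError("RMS pressure must be positive")
    return 20.0 * math.log10(p_rms_pa / p_ref_pa)


# Hypothetical pressure samples in pascals (values are assumptions).
samples = [0.02, -0.02, 0.02, -0.02]
level = spl_db(rms(samples))
```

Two sounds with the same level computed this way can still be perceived very differently, which is the limitation the abstract points to.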
在这个数字时代,传感和处理正被集成到物联网设备中,这些设备可以方便、经济地部署在我们的城市环境中。在本次演讲中,演讲者将介绍我的实验室开发的一些数字声音和主动降噪技术,以及在我们的城市环境中的一些试验性部署。为了全面了解我们的城市环境,我们必须依靠能够在不同环境条件下全天候运行和广泛部署的智能声音传感。这些智能声音传感器还充当数字耳朵,以补充和激活闭路电视摄像机的数字眼睛。通过拥有全面而庞大的声音数据,公共机构可以更好地制定完整而准确的声音缓解政策。声压级(SPL)读数已成为量化我们的噪声环境的事实标准;然而,单凭SPL无法准确地指出人类是如何实际感知噪声的;即使在相同的SPL下,他们是否喜欢或不喜欢声音。最新的ISO标准(即ISO 12913-1:2014、ISO 12913-2:2018)已经朝着基于人类感知声音的测量方向发展。随着功能强大、成本低廉的嵌入式处理器、模数转换器和声学传感器的出现,我们现在看到数字有源噪声控制(ANC)技术在消费品(如可听设备和汽车)中的广泛应用。在本次演讲中,演讲者还将展示我们在将有源噪声控制应用扩展到更大的控制区域方面的最新工作,例如在打开的窗口和噪声源的开口处。数字化在推进ANC技术中起着关键作用,它结合人工智能来选择最讨厌的噪声来抵消,并通过基于感知的声音增强提供进一步降低噪声的方法。
Speaker: Dr. Woon-Seng Gan, Director of Smart Nation Lab at Nanyang Technological University, Singapore
演讲人:新加坡南洋理工大学智能国家实验室主任吴森干博士
ISS 3.3: Mechanical Noise Suppression: Phase Signal Enhancement After 30 Years of Silence
ISS 3.3:机械噪声抑制:沉默30年后的相位信号增强
This talk presents challenges, solutions, and applications in commercial products of mechanical noise suppression. The topic has become more important with the spread of consumer products that process environmental signals in addition to human speech. Three typical types of mechanical noise signals, with small, medium, and large signal power, are covered; they are represented by feature phones and camcorders, digital cameras, and standard and tablet PCs, respectively. Mechanical noise suppression for small-power signals is performed by continuous spectral template subtraction with a noise template dictionary. Medium-power mechanical noise is suppressed in a similar manner, but only when the parent system, such as the digital camera, signals its presence. When the power is large, explicit detection of the mechanical noise based on phase information determines the suppression timing. In all three scenarios, the phase of the noisy input signal is randomized in frequency bins where noise is dominant, to make the residual noise inaudible. The phase had been left unaltered for 30 years after Lim; thus, these suppression algorithms opened the door to a new signal
本文介绍了机械噪声抑制在商业产品中的挑战、解决方案和应用。随着消费品的传播,这个话题变得越来越重要,这些消费品除了处理人类的语言外,还处理环境信号。介绍了三种典型的小、中、大信号功率机械噪声信号,分别以功能手机和摄像机、数码相机、标准和平板电脑为代表。采用带噪声模板字典的连续谱模板减法对小功率信号进行机械噪声抑制。只有当诸如数码相机的父系统通知其存在时,才以类似的方式抑制中等功率机械噪声。当功率较大时,基于相位信息的机械噪声的显式检测决定了抑制时序。在这三种情况下,输入噪声信号的相位信息是随机的,以使噪声占主导地位的频率箱中的残余噪声听不见。在Lim之后的30年里,相位没有改变,因此,这些抑制算法为新的信号打开了大门
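The combination of spectral template subtraction and phase randomization described in the abstract above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the speaker's actual algorithm: the function name `suppress_frame`, the `dominance` and `floor` parameters, and the rule used to decide that a bin is noise dominant are all hypothetical, and a single fixed template stands in for the noise template dictionary.

```python
import numpy as np


def suppress_frame(noisy_frame, noise_template, dominance=0.5, floor=0.05, rng=None):
    """Suppress a known noise magnitude template in one real-valued frame.

    noise_template is a magnitude spectrum with len(noisy_frame) // 2 + 1 bins.
    dominance and floor are hypothetical tuning parameters chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng

    spectrum = np.fft.rfft(noisy_frame)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Spectral template subtraction, with a spectral floor so that the
    # result never drops below a small fraction of the observed magnitude.
    cleaned = np.maximum(magnitude - noise_template, floor * magnitude)

    # Assumed rule: a bin is noise dominant when the template accounts for
    # at least `dominance` of the observed magnitude.
    noise_dominant = noise_template >= dominance * magnitude
    # Leave the first and last bins alone (DC, and Nyquist for even-length
    # frames, must stay real for a real-valued output).
    noise_dominant[0] = False
    noise_dominant[-1] = False

    # Randomize the phase only in the noise-dominant bins.
    phase[noise_dominant] = rng.uniform(-np.pi, np.pi, size=noise_dominant.sum())

    output = np.fft.irfft(cleaned * np.exp(1j * phase), n=len(noisy_frame))
    return output, noise_dominant


# Toy usage: a wanted tone in bin 5 plus a "mechanical" tone in bin 20.
n = 256
t = np.arange(n)
wanted = np.sin(2 * np.pi * 5 * t / n)
noise = np.sin(2 * np.pi * 20 * t / n)
template = np.abs(np.fft.rfft(noise))
enhanced, mask = suppress_frame(wanted + noise, template, rng=np.random.default_rng(0))
```

In this toy setup the wanted tone passes through unchanged, while the noisy bin is attenuated down to the spectral floor and its phase is replaced by a random value. A practical system would process overlapping windowed frames and pick the template from a dictionary, which this sketch leaves out.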
From Speech AI to Finance AI
从语音人工智能到金融人工智能
Tuesday, May 5, 16:30 – 17:30
5月5日,星期二,16:30–17:30
A brief review will first be provided of how deep learning has disrupted the speech recognition and language processing industries since 2009. Then connections will be drawn between the techniques (deep learning or otherwise) for modeling speech and language and those for modeling financial markets. Similarities and differences between these two fields will be explored. In particular, three technical challenges unique to financial investment are addressed: extremely low signal-to-noise ratio, extremely strong nonstationarity (with adversarial nature), and heterogeneous big data. Finally, how the potential solutions to these challenges can come back to benefit and further advance speech recognition and language processing technology will be discussed.
首先简要回顾一下自2009年以来,深度学习对语音识别和语言处理行业的影响。然后,将在语言和语言建模的技术(深度学习或其他)与金融市场的技术之间建立联系。将探讨这两个领域的异同。特别是金融投资面临的三个独特的技术挑战:极低的信噪比、极强的非平稳性(具有对抗性)和异构大数据。最后,我们将讨论如何使这些挑战的潜在解决方案重新受益并进一步提高语音识别和语言处理技术。
Speaker: Li Deng, IEEE Fellow, Chief AI Officer, Citadel, USA
演讲人:邓立,IEEE研究员,美国Citadel首席人工智能官