Distributed Stochastic Gradient Descent with Event-Triggered Communication



Abstract

We develop a Distributed Event-Triggered Stochastic GRAdient Descent (DETSGRAD) algorithm for solving non-convex optimization problems typically encountered in distributed deep learning. We propose a novel communication triggering mechanism that allows the networked agents to update their model parameters aperiodically, and we provide sufficient conditions on the algorithm step-sizes that guarantee asymptotic mean-square convergence. The algorithm is applied to a distributed supervised-learning problem, in which a set of networked agents collaboratively train their individual neural networks to perform image classification, while aperiodically sharing the model parameters with their one-hop neighbors. Results indicate that all agents report similar performance that is also comparable to the performance of a centrally trained neural network, while the event-triggered communication provides a significant reduction in inter-agent communication. Results also show that the proposed algorithm allows the individual agents to classify the images even though the training data corresponding to all the classes are not locally available to each agent.

Introduction

With the advent of smart devices, there has been an exponential growth in the amount of data collected and stored locally on individual devices. Applying machine learning to extract value from such massive data to provide data-driven insights, decisions, and predictions has been a popular research topic as well as the focus of numerous businesses. However, porting these vast amounts of data to a data center to conduct traditional machine learning has raised two main issues: (i) the communication challenge associated with transferring vast amounts of data from a large number of devices to a central location and (ii) the privacy issues associated with sharing raw data. Distributed machine learning techniques based on the server-client architecture (Li et al. 2014a; 2014b; Zhang, Alqahtani, and Demirbas 2017) have been proposed as solutions to this problem. On one extreme end of this architecture, we have the parameter server approach, where a server or group of servers initiate distributed learning by pushing the current model to a set of client nodes that host the data. The client nodes compute the local gradients or parameter updates and communicate them to the server nodes. Server nodes aggregate these values and update the current model (Zhang et al. 2018; Li et al. 2014b). On the other extreme, we have federated learning, where each client node obtains a local solution to the learning problem and the server node computes a global model by averaging the local models (Konečný et al. 2016; McMahan et al. 2017). Besides the server-client architecture, a shared-memory (multicore/multi-GPU) architecture, where different processors independently compute the gradients and update the global model parameter using a shared memory, has also been proposed as a solution to the distributed machine learning problem (Recht et al. 2011; De Sa et al. 2015; Chaturapruek, Duchi, and Ré 2015; Feyzmahdavian, Aytekin, and Johansson 2016). However, none of the above-mentioned learning techniques are truly distributed since they follow a master-slave architecture and do not involve any peer-to-peer communication. Furthermore, these techniques are not always robust and they are rendered useless if the master/server node or the shared memory fails. Therefore, we aim to develop a fully distributed machine learning architecture enabled by client-to-client interaction.

For large-scale machine learning, stochastic gradient descent (SGD) methods are often preferred over batch gradient methods (Bottou, Curtis, and Nocedal 2018) because (i) in many large-scale problems, there is a good deal of redundancy in data and therefore it is inefficient to use all the data in every optimization iteration, (ii) the computational cost involved in computing the batch gradient is much higher than that of the stochastic gradient, and (iii) stochastic methods are more suitable for online learning where data are arriving sequentially. Since most machine learning problems are non-convex, there is a need for distributed stochastic gradient methods for non-convex problems. Therefore, here we present a communication efficient, distributed stochastic gradient algorithm for non-convex problems and demonstrate its utility for distributed machine learning.
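To make the per-iteration cost difference concrete, the sketch below (ours, not from the paper; the model, loss function, and data-loader objects are placeholders) contrasts a single stochastic-gradient update, which touches only one mini-batch, with a full batch-gradient update, which must sweep the entire data set before the parameters move once.

```python
import torch

def sgd_step(model, loss_fn, batch, lr=0.01):
    """One stochastic gradient step: the gradient comes from a single mini-batch."""
    inputs, targets = batch
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()                          # cost scales with the mini-batch size
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= lr * p.grad
    return loss.item()

def batch_gradient_step(model, loss_fn, full_loader, lr=0.01):
    """One batch gradient step: the gradient is accumulated over the whole data set."""
    model.zero_grad()
    n_batches = 0
    for inputs, targets in full_loader:      # cost scales with the data set size
        loss = loss_fn(model(inputs), targets)
        loss.backward()                      # gradients accumulate in p.grad
        n_batches += 1
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= lr * p.grad / n_batches
```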


Related work

Distributed Non-Convex Optimization: A few early examples of (non-stochastic or deterministic) distributed non-convex optimization algorithms include the Distributed Approximate Dual Subgradient (DADS) algorithm (Zhu and Martínez 2013), the NonconvEx primal-dual SpliTTing (NESTT) algorithm (Hajinezhad et al. 2016), and the Proximal Primal-Dual Algorithm (Prox-PDA) (Hong, Hajinezhad, and Zhao 2017). More recently, a non-convex version of the accelerated distributed augmented Lagrangians (ADAL) algorithm is presented in Chatzipanagiotis and Zavlanos (2017), and successive convex approximation (SCA)-based algorithms such as the iNner cOnVex Approximation (NOVA) and in-Network succEssive conveX approximaTion (NEXT) algorithms are given in Scutari, Facchinei, and Lampariello (2017) and Lorenzo and Scutari (2016), respectively. References (Hong 2018; Guo, Hug, and Tonguz 2017; Hong, Luo, and Razaviyayn 2016) provide several distributed alternating direction method of multipliers (ADMM) based non-convex optimization algorithms. Non-convex versions of Decentralized Gradient Descent (DGD) and Proximal Decentralized Gradient Descent (Prox-DGD) are given in Zeng and Yin (2018). Finally, Zeroth-Order NonconvEx (ZONE) optimization algorithms for mesh networks (ZONE-M) and star networks (ZONE-S) are presented in Hajinezhad, Hong, and Garcia (2019). However, almost all of the aforementioned consensus optimization algorithms focus on non-stochastic problems and are extremely communication heavy because they require constant communication among the agents.


Distributed Convex SGD: Within the consensus optimization literature, there exist several works on distributed stochastic gradient methods, but mainly for strongly convex optimization problems. These include the stochastic subgradient-push method for distributed optimization over time-varying directed graphs given in Nedić and Olshevsky (2016), distributed stochastic optimization over random networks given in Jakovetic et al. (2018), the Stochastic Unbiased Curvature-aided Gradient (SUCAG) method given in Wai et al. (2018), and the distributed stochastic gradient tracking methods of Pu and Nedić (2018). There are very few works on distributed stochastic gradient methods for non-convex optimization (Tatarenko and Touri 2017; Bianchi and Jakubowicz 2013); however, the push-sum algorithm given in Tatarenko and Touri (2017) assumes there are no saddle points, and it often requires up to 3 times as many internal variables as the proposed algorithm. Compared to Bianchi and Jakubowicz (2013) and Tatarenko and Touri (2017), the proposed algorithm provides an explicit consensus rate and allows the parallel execution of the consensus communication and gradient computation steps.

Parallel SGD: There exist numerous asynchronous SGD algorithms aimed at parallelizing data-intensive machine learning tasks. The two popular asynchronous parallel implementations of SGD are the computer network implementation originally proposed in Agarwal and Duchi (2011) and the shared memory implementation introduced in Recht et al. (2011). The computer network implementation follows the master-slave architecture, and Agarwal and Duchi (2011) showed that for smooth convex problems, the delays due to asynchrony are asymptotically negligible. Feyzmahdavian, Aytekin, and Johansson (2016) extend the results in Agarwal and Duchi (2011) to regularized SGD. Extensions of the computer network implementation of asynchronous SGD with variance reduction and polynomially growing delays are given in Huo and Huang (2016) and Zhou et al. (2018), respectively. Recht et al. (2011) proposed a lock-free asynchronous parallel implementation of SGD on a shared memory system and proved a sublinear convergence rate for strongly convex smooth objectives. The lock-free algorithm, HOGWILD!, proposed in Recht et al. (2011) has been applied to PageRank approximation (Mitliagkas et al. 2015), deep learning (Noel and Osindero 2014), and recommender systems (Yu et al. 2012).

In Duchi, Jordan, and McMahan (2013), the authors extended the HOGWILD! algorithm to a dual averaging algorithm that works for non-smooth, non-strongly convex problems with sparse gradients. An extension of HOGWILD! called BUCKWILD! is introduced in De Sa et al. (2015) to account for quantization errors introduced by fixed-point arithmetic. In Chaturapruek, Duchi, and Ré (2015), the authors show that because of the noise inherent to the sampling process within SGD, the errors introduced by asynchrony in the shared memory implementation are asymptotically negligible. Recently, several parallel SGD works focus on adjusting the worker-server interaction period or frequency as a way to decrease the communication overhead. For example, (Yu, Jin, and Yang 2019) and (Yu, Yang, and Zhu 2019) used a fixed period, while (Yu and Jin 2019) and (Lin, Stich, and Jaggi 2018) propose an increasing period as a way to reduce communication. A detailed comparison of both the computer network and shared memory implementations is given in Lian et al. (2015). Again, the aforementioned asynchronous algorithms are not distributed since they rely on a shared memory or central coordinator.


Decentralized SGD:

Recently, numerous decentralized SGD algorithms for non-convex optimization have been proposed as a solution to the communication bottleneck often encountered in the server-client architecture (Lian et al. 2017; Jiang et al. 2017; Tang et al. 2018; Lian et al. 2018; Wang and Joshi 2018; Haddadpour et al. 2019; Assran et al. 2019; Wang et al. 2019). However, almost all of these works primarily focus on the performance of the algorithm during a fixed time interval, and the constant algorithm step-size, which often depends on the final time, is selected to speed up the convergence rate. These SGD algorithms with constant step-size can only guarantee convergence to some ε-ball around a stationary point. Furthermore, most of the aforementioned decentralized SGD algorithms provide convergence rates in terms of the average of all local estimates of the global minimizer without ever proving a similar or faster consensus rate. In fact, most decentralized SGD algorithms can only provide bounded consensus, and they require a centralized averaging step after running the algorithm until the final time (Lian et al. 2017; Tang et al. 2018; Lian et al. 2018; Haddadpour et al. 2019; Wang et al. 2019). Finally, most applications of decentralized SGD focus on distributed learning scenarios where the data are distributed identically across all agents.


Contribution:

Currently, there exists no distributed SGD algorithm for non-convex problems that does not require constant or periodic communication among the agents. In fact, the algorithms in (Lian et al. 2017; Jiang et al. 2017; Tang et al. 2018; Lian et al. 2018; Wang and Joshi 2018; Haddadpour et al. 2019; Assran et al. 2019; Wang et al. 2019) all rely on periodic communication even when the local model has not changed from the previously communicated model. This is a waste of resources, especially in wireless settings, and therefore we propose an approach that allows a node to transmit only if its local model has significantly changed from the previously communicated model. The contributions of this paper are three-fold: (i) we propose a fully distributed machine learning architecture; (ii) we present a distributed SGD algorithm built on a novel communication triggering mechanism, and provide sufficient conditions on step-sizes such that the algorithm is mean-square convergent; and (iii) we demonstrate the efficacy of the proposed event-triggered SGD algorithm for distributed supervised learning with i.i.d. and, more importantly, non-i.i.d. data.
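The following is a minimal sketch of the send-on-change idea described above; it is our illustration, not the paper's triggering condition (which is given in equation (10)). The threshold, the choice of norm, and the `send_fn` callback are placeholders.

```python
import torch

def maybe_broadcast(w_current, w_last_broadcast, threshold, send_fn):
    """Broadcast only if the local model changed enough since the last broadcast.

    w_current / w_last_broadcast: flattened parameter tensors.
    threshold: trigger threshold at the current iteration (in the paper this is
               a decaying sequence tied to the step-size; here it is just a number).
    send_fn:   callback that transmits the parameters to the one-hop neighbors.
    """
    drift = torch.norm(w_current - w_last_broadcast, p=1)  # deviation from last sent model
    if drift >= threshold:
        send_fn(w_current)
        return w_current.clone(), True    # neighbors now hold w_current
    return w_last_broadcast, False        # stay silent; neighbors keep the old copy
```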


Distributed machine learning

Our problem formulation closely follows the centralized machine learning problem discussed in Bottou, Curtis, and Nocedal (2018). Consider a networked set of n agents, each with access only to its locally stored data.

The total expected risk across all networked agents is then the sum of the agents' local expected risks.

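Since the formulation equations appear only as images in the source, the following is a hedged reconstruction of the standard form such a setup takes (notation ours): each agent holds a local expected risk defined over its own data, and the network minimizes the sum.

```latex
% Standard distributed risk-minimization form (our reconstruction, notation ours):
% agent i has a local loss f_i(w; \xi_i) with data sample \xi_i drawn from its own distribution.
\begin{align}
  F_i(w) &= \mathbb{E}_{\xi_i}\!\left[ f_i(w;\xi_i) \right]
  && \text{(local expected risk of agent } i\text{)} \\
  \min_{w \in \mathbb{R}^d} \; F(w) &= \sum_{i=1}^{n} F_i(w)
  && \text{(total expected risk across all } n \text{ agents)}
\end{align}
```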

Distributed event-triggered SGD

[The DETSGRAD update rule, the event-triggering condition, and the step-size conditions, equations (6)-(10) referenced below, were rendered as images in the source and are not reproduced here.]
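Based only on the textual description of the algorithm (a mixing step using the neighbors' last-broadcast parameters, a local stochastic gradient step, and the event-triggered broadcast rule), one agent's iteration might look like the schematic sketch below. This is our sketch, not a reproduction of equations (6)-(10); the mixing weights `a`, step-size `alpha_k`, threshold `eps_k`, and the `agent.stochastic_gradient()` / `agent.broadcast()` helpers are all assumed placeholders.

```python
import torch

def detsgrad_iteration(agent, neighbor_models, a, alpha_k, eps_k):
    """One schematic DETSGRAD-style iteration for a single agent (illustrative only).

    agent.w:          current local parameters (flat tensor)
    agent.w_hat:      parameters this agent last broadcast
    neighbor_models:  dict {j: w_hat_j} of the one-hop neighbors' last-broadcast models
    a:                mixing weights over the agent and its neighbors (placeholder)
    alpha_k, eps_k:   step-size and trigger threshold at iteration k (placeholders)
    """
    # 1) Mixing step: combine the agent's own model with the models its
    #    neighbors most recently *broadcast* (not their true current models).
    mixed = a[agent.id] * agent.w
    for j, w_hat_j in neighbor_models.items():
        mixed = mixed + a[j] * w_hat_j

    # 2) Local stochastic gradient step on a freshly sampled mini-batch.
    g = agent.stochastic_gradient()       # flat tensor, same shape as agent.w
    agent.w = mixed - alpha_k * g

    # 3) Event-triggered broadcast: transmit only if the local model has
    #    drifted far enough from the last model this agent sent out.
    if torch.norm(agent.w - agent.w_hat, p=1) >= eps_k:
        agent.broadcast(agent.w)          # one-hop neighbors refresh their copy
        agent.w_hat = agent.w.clone()
```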

Convergence analysis

Our strategy for proving the convergence of the proposed distributed event-triggered SGD algorithm to a critical point is as follows. First, we show that the consensus error among the agents diminishes at the rate established in Theorem 1. Asymptotic convergence of the algorithm is then proved in Theorem 3. Theorem 4 then establishes that the weighted expected average gradient norm is a summable sequence. The convergence rate of the algorithm in the typical weak sense is given in Theorem 5. Finally, Theorem 6 proves the asymptotic mean-square convergence of the algorithm to a critical point.

Theorem 1. Consider the event-triggered SGD algorithm (6) under Assumptions 1-7. Then, there holds:

[The bound of Theorem 1 was rendered as an image in the source and is not reproduced here.]
Theorem 4 establishes results about the weighted sum of the expected average gradient norms, and the key takeaway from this result is that, for the distributed SGD in (8) or (6) with appropriate step-sizes, the expected average gradient norms cannot stay bounded away from zero (see Theorem 9 of Bottou, Curtis, and Nocedal (2018)).

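The statement of Theorem 4 is only available as an image in the source, but the standard form of this kind of result (following Theorem 9 of Bottou, Curtis, and Nocedal 2018) is a weighted summability bound of the type sketched below; the notation is ours and the statement is a paraphrase, not the paper's exact theorem.

```latex
% Generic form of a weighted-gradient-norm result (our paraphrase, not the paper's statement):
% with step-sizes \alpha_k satisfying \sum_k \alpha_k = \infty,
\begin{align}
  \sum_{k=0}^{\infty} \alpha_k \, \mathbb{E}\!\left[ \big\| \nabla F(\bar{w}_k) \big\|^2 \right] < \infty
  \quad \Longrightarrow \quad
  \liminf_{k \to \infty} \mathbb{E}\!\left[ \big\| \nabla F(\bar{w}_k) \big\|^2 \right] = 0 ,
\end{align}
% i.e., the expected gradient norms cannot remain bounded away from zero.
```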

Finally, we present the following result to illustrate that stronger convergence results follow from the continuity assumption on the Hessian, which has not been utilized in our analysis so far.

[The statement of this result was rendered as an image in the source and is not reproduced here.]
Similar to the centralized SGD (Bottou, Curtis, and Nocedal 2018), the analysis given here shows the mean-square convergence of the distributed algorithm to a critical point, which includes saddle points. Though SGD has been shown to escape saddle points efficiently (Lee et al. 2017; Fang, Lin, and Zhang 2019; Jin et al. 2019), extensions of such results to distributed SGD are currently nonexistent and are a topic for future research.


Application to distributed supervised learning

We apply the proposed algorithm to distributedly train neural network agents for an image classification task. We present extensive results on two different datasets: MNIST and CIFAR-10.

MNIST

The MNIST data set is a handwritten digit recognition data set containing 60000 grayscale images of 10 digits (0-9) for training, and 10000 images are used for testing. We distributedly train 10 agents that are connected in an undirected, unweighted ring topology. The 10-node ring was selected only because it is one of the least connected networks (besides the path) and MNIST contains 10 classes; the proposed algorithm works for any undirected graph as long as it is connected.
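The consensus weights for the ring are not spelled out in this section (they appear in the omitted equations), but a common choice for an undirected, unweighted ring, given here purely as an illustrative assumption, is a doubly stochastic mixing matrix such as the Metropolis weights below.

```python
import numpy as np

def ring_metropolis_weights(n):
    """Doubly stochastic mixing matrix for an undirected n-agent ring (illustrative choice).

    Every agent on a ring has degree 2, so the Metropolis rule assigns weight
    1/3 to each neighbor and keeps weight 1/3 for the agent itself.
    """
    W = np.zeros((n, n))
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n   # one-hop neighbors on the ring
        W[i, left] = W[i, right] = 1.0 / 3.0
        W[i, i] = 1.0 / 3.0
    return W

W = ring_metropolis_weights(10)
assert np.allclose(W.sum(axis=0), 1.0) and np.allclose(W.sum(axis=1), 1.0)  # doubly stochastic
```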
Each agent aims to train its own neural network, which is a randomly initialized LeNet-5 (LeCun et al. 1998). During training, each agent broadcasts its weights to its neighbors at every iteration or aperiodically, as described in the proposed algorithm. Here we conduct the following five experiments: (i) Centralized SGD, where a centralized version of the SGD is implemented by a central node having access to all 60000 training images from all classes; (ii) Distributed SGD-r, where all the agents broadcast their respective weights at every iteration, and each agent has access to 6000 training images, randomly sampled from the entire training set, which forms the i.i.d. case; (iii) Distributed SGD-s, where all the agents broadcast their weights at every iteration, and each agent has access to the images corresponding to a single class, which forms the non-i.i.d. case; (iv) DETSGRAD-r, where the agents aperiodically broadcast their weights using the triggering mechanism in (10), and each agent has access to 6000 training images, randomly sampled from the entire training set, i.e., the i.i.d. case; (v) DETSGRAD-s, where the agents aperiodically broadcast their weights using the triggering mechanism in (10), and each agent has access to the images corresponding to a single class, i.e., the non-i.i.d. case. In the single-class case, for ease of programming, we set the number of training images available for each agent to 5421 (the minimum number of training images available in a single class, which is digit 5 in the MNIST data set). Here we select the algorithm step-sizes and event-triggering thresholds so that they satisfy the conditions required by the convergence analysis.

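A sketch of how the i.i.d. and non-i.i.d. (single-class) data assignments described above could be produced with torchvision is given below; this is our illustration, not the paper's data-loading code. In the single-class case each of the 10 agents is capped at 5421 images, as stated above.

```python
import torch
from torchvision import datasets, transforms

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

def iid_split(dataset, n_agents=10, per_agent=6000, seed=0):
    """Each agent gets `per_agent` images sampled at random from the full training set."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(dataset), generator=g)
    return [torch.utils.data.Subset(dataset, perm[i * per_agent:(i + 1) * per_agent].tolist())
            for i in range(n_agents)]

def single_class_split(dataset, n_agents=10, per_agent=5421):
    """Agent i only sees images of digit i (non-i.i.d. case), capped at `per_agent` images."""
    labels = dataset.targets
    subsets = []
    for digit in range(n_agents):
        idx = (labels == digit).nonzero(as_tuple=True)[0][:per_agent]
        subsets.append(torch.utils.data.Subset(dataset, idx.tolist()))
    return subsets
```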
The plots of the empirical risk vs. the iterations (parameter update steps), illustrated in Figure 1, show the convergence of the proposed algorithm. The final test accuracies of the 10 agents after 40 training epochs using the different algorithms and training settings are shown in Table 1. The results obtained here indicate that regardless of how the data are distributed (random or single class), the agents are able to train their networks, and the distributedly trained networks yield performance similar to that of a centrally trained network. More importantly, in the single-class case, agents were able to recognize images from all 10 classes even though they had access to data corresponding only to a single class during the training phase. This result has numerous implications for the machine learning community, specifically for federated multi-task learning under information flow constraints.


The total numbers of event-triggered parameter broadcast events for the 10 agents using the DETSGRAD algorithm are shown in Table 2. In the random sampling case, by employing the broadcast event-triggering mechanism, we are able to reduce the inter-agent communications from 240000 to an average of 61702 over 40 epochs, leading to a reduction of 74.2% in network communications. In the single-class case, the agents broadcast the parameters continuously for the first 4 epochs, after which the event-trigger mechanism is started. Here, we are able to reduce the parameter broadcasts for each agent from 216840 to an average of 71933 over 40 epochs, leading to a reduction of 66.8% in network communications. Yet, as can be seen in Table 1, DETSGRAD gives classification performance similar to that of distributed SGD with continuous parameter sharing, with a significant reduction in network communications.

The fractions of the broadcast events for the 10 agents over 40 epochs are presented in Figure 2. As expected, the number of broadcast events reduces with the increase in epoch number as the agents converge to a critical point of the empirical risk function.
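As a quick consistency check on the communication savings reported above, using only the broadcast counts quoted in the text, the reduction percentages follow directly from the counts:

```python
# Reduction percentages implied by the MNIST broadcast counts reported above (40 epochs).
cases = {
    "DETSGRAD-r (random sampling)": (240000, 61702),   # per-iteration baseline vs. event-triggered average
    "DETSGRAD-s (single class)":    (216840, 71933),
}
for name, (baseline, triggered) in cases.items():
    reduction = 100.0 * (1.0 - triggered / baseline)
    # Prints roughly 74.3% and 66.8%, matching the figures reported above up to rounding.
    print(f"{name}: {reduction:.1f}% fewer broadcasts")
```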

CIFAR-10

The CIFAR-10 data set is an image classification data set containing 50000 color images of 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck) for training, and 10000 images are used for testing. We distributedly train 8 agents that are connected in an undirected, unweighted ring topology. Each agent trains its own neural network, which is a randomly initialized ResNet-20 (He et al. 2016; implementation from https://github.com/akamaster/pytorch_resnet_cifar10). We conducted the following three experiments: (i) Centralized SGD, where a centralized version of the SGD is implemented by a central node having access to all 50000 training images from all classes; (ii) Distributed SGD-r, where all the agents broadcast their respective weights at every iteration, and each agent has access to 6250 training images, randomly sampled from the entire training set; (iii) DETSGRAD-r, where the agents aperiodically broadcast their weights using the triggering mechanism in (10), and each agent has access to 6250 training images, randomly sampled from the entire training set.

The plots of the empirical risk vs. epochs, illustrated in Figure 3, show the convergence of the proposed algorithm. The final test accuracies of the 8 agents after 200 training epochs using the two different algorithms are shown in Table 3. Similar to the previous case, the results obtained here indicate that the distributedly trained networks are able to yield performance similar to that of a centrally trained network. The total numbers of event-triggered parameter broadcast events for the 8 agents using the DETSGRAD algorithm are shown in Table 4. By employing the broadcast event-triggering mechanism, we are able to reduce the inter-agent communications from 9800 to an average of 5482 over 200 epochs, leading to a reduction of 44.1% in network communications. Yet, as can be seen in Table 3, DETSGRAD gives classification performance similar to that of distributed SGD with continuous parameter sharing, with a significant reduction in network communications.


Conclusion

This paper presented the development of a distributed stochastic gradient descent algorithm with an event-triggered communication mechanism for solving non-convex optimization problems. We presented a novel communication triggering mechanism, which allowed the agents to markedly reduce the communication overhead by communicating only when the local model has significantly changed from the previously communicated model. We presented sufficient conditions on the algorithm step-sizes to guarantee asymptotic mean-square convergence of the proposed algorithm to a critical point and provided the convergence rate of the proposed algorithm. We applied the developed algorithm to a distributed supervised learning problem, in which a set of networked agents collaboratively train their individual neural networks to perform image classification. Results indicate that the distributedly trained networks are able to yield performance similar to that of a centrally trained network. Numerical results also show that the proposed event-triggered communication mechanism significantly reduced the inter-agent communication while yielding performance similar to that of a distributedly trained network with constant communication.


References

Agarwal, A., and Duchi, J. C. 2011. Distributed delayed stochastic optimization. In NIPS, 873–881.
Assran, M.; Loizou, N.; Ballas, N.; and Rabbat, M. 2019. Stochastic gradient push for distributed deep learning. In ICML, 344–353.
Bianchi, P., and Jakubowicz, J. 2013. Convergence of a multi-agent projected stochastic gradient algorithm for non-convex optimization. IEEE TAC 58(2):391–405.
Bottou, L.; Curtis, F.; and Nocedal, J. 2018. Optimization methods for large-scale machine learning. SIAM Review 60(2):223–311.
Chaturapruek, S.; Duchi, J. C.; and Ré, C. 2015. Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care. In NIPS, 1531–1539.
Chatzipanagiotis, N., and Zavlanos, M. M. 2017. On the convergence of a distributed augmented Lagrangian method for nonconvex optimization. IEEE TAC 62(9):4405–4420.
De Sa, C. M.; Zhang, C.; Olukotun, K.; and Ré, C. 2015. Taming the wild: A unified analysis of Hogwild-style algorithms. In NIPS, 2674–2682.
Duchi, J.; Jordan, M. I.; and McMahan, B. 2013. Estimation, optimization, and parallelism when data is sparse. In NIPS, 2832–2840.
Fang, C.; Lin, Z.; and Zhang, T. 2019. Sharp analysis for nonconvex SGD escaping from saddle points. arXiv:1902.00247.
Feyzmahdavian, H. R.; Aytekin, A.; and Johansson, M. 2016. An asynchronous mini-batch algorithm for regularized stochastic optimization. IEEE TAC 61(12):3740–3754.
George, J., and Gurram, P. 2019. Distributed deep learning with event-triggered communication. arXiv:1909.05020.
Guo, J.; Hug, G.; and Tonguz, O. K. 2017. A case for nonconvex distributed optimization in large-scale power systems. IEEE TPS 32(5):3842–3851.
Haddadpour, F.; Kamani, M. M.; Mahdavi, M.; and Cadambe, V. 2019. Trading redundancy for communication: Speeding up distributed SGD for non-convex optimization. In ICML, 2545–2554.
Hajinezhad, D.; Hong, M.; Zhao, T.; and Wang, Z. 2016. NESTT: A nonconvex primal-dual splitting method for distributed and stochastic optimization. In NIPS, 3215–3223.
Hajinezhad, D.; Hong, M.; and Garcia, A. 2019. ZONE: Zeroth-order nonconvex multi-agent optimization over networks. IEEE TAC.
He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep residual learning for image recognition. In IEEE CVPR, 770–778.
Hong, M.; Hajinezhad, D.; and Zhao, M.-M. 2017. Prox-PDA: The proximal primal-dual algorithm for fast distributed nonconvex optimization and learning over networks. In ICML, 1529–1538.
Hong, M.; Luo, Z.; and Razaviyayn, M. 2016. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM JO 26(1):337–364.
Hong, M. 2018. A distributed, asynchronous, and incremental algorithm for nonconvex optimization: An ADMM approach. IEEE TCNS 5(3):935–945.
Huo, Z., and Huang, H. 2016. Asynchronous stochastic gradient descent with variance reduction for non-convex optimization. arXiv:1604.03584.
Jakovetic, D.; Bajovic, D.; Sahu, A. K.; and Kar, S. 2018. Convergence rates for distributed stochastic optimization over random networks. In IEEE CDC, 4238–4245.
Jiang, Z.; Balu, A.; Hegde, C.; and Sarkar, S. 2017. Collaborative deep learning in fixed topology networks. In NIPS, 5904–5914.
Jin, C.; Netrapalli, P.; Ge, R.; Kakade, S. M.; and Jordan, M. I. 2019. Stochastic gradient descent escapes saddle points efficiently. arXiv:1902.04811.
Konečný, J.; McMahan, H. B.; Yu, F. X.; Richtárik, P.; Suresh, A. T.; and Bacon, D. 2016. Federated learning: Strategies for improving communication efficiency. In NIPSW.
LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P.; et al. 1998. Gradient-based learning applied to document recognition. Proc. of the IEEE 86(11):2278–2324.
Lee, J. D.; Panageas, I.; Piliouras, G.; Simchowitz, M.; Jordan, M. I.; and Recht, B. 2017. First-order methods almost always avoid saddle points. arXiv:1710.07406.
Li, M.; Andersen, D. G.; Park, J. W.; Smola, A. J.; Ahmed, A.; Josifovski, V.; Long, J.; Shekita, E. J.; and Su, B.-Y. 2014a. Scaling distributed machine learning with the parameter server. In USENIX OSDI, 583–598.
Li, M.; Andersen, D. G.; Smola, A. J.; and Yu, K. 2014b. Communication efficient distributed machine learning with the parameter server. In NIPS, 19–27.
Lian, X.; Huang, Y.; Li, Y.; and Liu, J. 2015. Asynchronous parallel stochastic gradient for nonconvex optimization. arXiv:1506.08272.
Lian, X.; Zhang, C.; Zhang, H.; Hsieh, C.-J.; Zhang, W.; and Liu, J. 2017. Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. In NIPS, 5330–5340.
Lian, X.; Zhang, W.; Zhang, C.; and Liu, J. 2018. Asynchronous decentralized parallel stochastic gradient descent. In ICML, 3043–3052.
Lin, T.; Stich, S. U.; and Jaggi, M. 2018. Don't use large mini-batches, use local SGD. arXiv:1808.07217.
Lorenzo, P. D., and Scutari, G. 2016. NEXT: In-network nonconvex optimization. IEEE TSIPN 2(2):120–136.
McMahan, H. B.; Moore, E.; Ramage, D.; Hampson, S.; and y Arcas, B. A. 2017. Communication-efficient learning of deep networks from decentralized data. In AISTATS.
Mitliagkas, I.; Borokhovich, M.; Dimakis, A. G.; and Caramanis, C. 2015. FrogWild!: Fast PageRank approximations on graph engines. Proc. VLDB Endow. 8(8):874–885.
Nedić, A., and Olshevsky, A. 2016. Stochastic gradient-push for strongly convex functions on time-varying directed graphs. IEEE TAC 61(12):3936–3947.
Noel, C., and Osindero, S. 2014. Dogwild!: Distributed Hogwild for CPU & GPU. In NIPSW.
Pu, S., and Nedić, A. 2018. Distributed stochastic gradient tracking methods. arXiv:1805.11454.
Recht, B.; Ré, C.; Wright, S.; and Niu, F. 2011. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, 693–701.
Scutari, G.; Facchinei, F.; and Lampariello, L. 2017. Parallel and distributed methods for constrained nonconvex optimization. Part I: Theory. IEEE TSP 65(8):1929–1944.
Tang, H.; Lian, X.; Yan, M.; Zhang, C.; and Liu, J. 2018. D2: Decentralized training over decentralized data. In ICML, 4848–4856.
Tatarenko, T., and Touri, B. 2017. Non-convex distributed optimization. IEEE TAC 62(8):3744–3757.
Wai, H.; Freris, N. M.; Nedić, A.; and Scaglione, A. 2018. SUCAG: Stochastic unbiased curvature-aided gradient method for distributed optimization. In IEEE CDC, 1751–1756.
Wang, J., and Joshi, G. 2018. Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms. arXiv:1808.07576.
Wang, J.; Sahu, A. K.; Yang, Z.; Joshi, G.; and Kar, S. 2019. MATCHA: Speeding up decentralized SGD via matching decomposition sampling. arXiv:1905.09435.
Yu, H., and Jin, R. 2019. On the computation and communication complexity of parallel SGD with dynamic batch sizes for stochastic non-convex optimization. In ICML, 7174–7183.
Yu, H.; Hsieh, C.; Si, S.; and Dhillon, I. 2012. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In IEEE ICDM, 765–774.
Yu, H.; Jin, R.; and Yang, S. 2019. On the linear speedup analysis of communication efficient momentum SGD for distributed non-convex optimization. In ICML, 7184–7193.
Yu, H.; Yang, S.; and Zhu, S. 2019. Parallel restarted SGD with faster convergence and less communication: Demystifying why model averaging works for deep learning. In AAAI-19, 5693–5700.
Zeng, J., and Yin, W. 2018. On nonconvex decentralized gradient descent. IEEE TSP 66(11):2834–2848.
Zhang, K.; Alqahtani, S.; and Demirbas, M. 2017. A comparison of distributed machine learning platforms. In ICCCN, 1–9.
Zhang, J.; Tu, H.; Ren, Y.; Wan, J.; Zhou, L.; Li, M.; and Wang, J. 2018. An adaptive synchronous parallel strategy for distributed machine learning. IEEE Access 6:19222–19230.
Zhou, Z.; Mertikopoulos, P.; Bambos, N.; Glynn, P.; Ye, Y.; Li, L.-J.; and Fei-Fei, L. 2018. Distributed asynchronous optimization with unbounded delays: How slow can you go? In ICML, 5970–5979.
Zhu, M., and Martínez, S. 2013. An approximate dual subgradient algorithm for multi-agent non-convex optimization. IEEE TAC 58(6):1534–1539.
