The goal of this article is to demonstrate the idea of distributed computing in the context of training large-scale Deep Learning (DL) models. In particular, the article first presents the basic concepts of distributed computing and how they fit into the idea of deep learning. It then lists the standard requirements (hardware and software) for setting up an environment capable of handling distributed applications. Finally, to provide hands-on experience, it demonstrates a specific distributed algorithm (namely Synchronous SGD) for training DL models from both a theoretical and an implementation perspective.
Distributed computing refers to a way of writing a program that makes use of several distinct components connected over a network. Typically, large-scale computation is achieved by such an arrangement of computers capable of handling high-density numeric computations in parallel. In distributed-computing terminology, these computers are often referred to as nodes, and a collection of such nodes forms a cluster over the network. These nodes are usually connected via Ethernet, but other high-bandwidth networks are also used to take full advantage of the distributed architecture.
Although Neural Networks, the main workhorse of DL, have been in the literature for quite a while, nobody could utilise their full potential until recently. One of the primary reasons for the sudden boost in their popularity has something to do with massive computational power, the very idea we are trying to address in this article. Deep learning requires training Deep Neural Networks (DNNs) with a massive number of parameters on a huge amount of data. Distributed computing is a perfect tool to exploit modern hardware to its fullest. Here is the core idea:
A properly crafted distributed algorithm can distribute the computation (the forward and backward passes of a DL model) along with the data across multiple nodes, and it can establish effective synchronization among the nodes so that they all remain consistent with one another.
One more piece of terminology you have to get used to is the Message Passing Interface (MPI). MPI is the workhorse of almost all of distributed computing. MPI is an open standard that defines a set of rules on how the nodes will talk to each other over the network, as well as a programming model/API. MPI is not a piece of software or a tool; it is a specification. A group of individuals and organizations from academia and industry came forward in the summer of 1991, which eventually led to the creation of the MPI Forum. The forum, by consensus, crafted a syntactic and semantic specification of a library that serves as a guideline for hardware vendors to come up with portable/flexible/optimized implementations. Several vendors maintain their own implementation of MPI: OpenMPI, MPICH, MVAPICH, Intel MPI and many more.
In this tutorial, we are going to use Intel MPI as it is very performant and also optimized for Intel platforms. The original Intel MPI is a C library and is very low-level in nature.
Proper setup of a distributed system is very important. Without proper hardware and network arrangements, it is pretty much useless even if one has a conceptual understanding of its programming model. The key arrangements are a common user account on every node, password-less SSH access between the nodes, a shared filesystem (such as NFS) visible to all of them so that code and data live in one place, and an MPI implementation installed on each node.
Model parallelism refers to a model being logically split into several parts (i.e., some layers in one part and some in another) and then placing each part on a different device. Although placing the parts on different devices does have benefits in terms of execution time (asynchronous processing of data), it is usually employed to avoid memory constraints. Models with a very large number of parameters, which are difficult to fit into a single device due to their high memory footprint, benefit from this type of strategy.
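To make this concrete, below is a minimal, hypothetical PyTorch sketch of model parallelism: a toy two-part network split across two GPUs. The layer sizes and the device names 'cuda:0'/'cuda:1' are assumptions made purely for illustration and are not part of the setup used later in this article.

import torch
import torch.nn as nn

class TwoPartModel(nn.Module):
    """A toy network whose two halves live on different devices."""
    def __init__(self):
        super().__init__()
        # First part of the network is placed on the first GPU
        self.part1 = nn.Sequential(nn.Linear(1024, 512), nn.ReLU()).to('cuda:0')
        # Second part is placed on the second GPU
        self.part2 = nn.Linear(512, 10).to('cuda:1')

    def forward(self, x):
        # Activations are explicitly moved between devices at the split point
        h = self.part1(x.to('cuda:0'))
        return self.part2(h.to('cuda:1'))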
Data parallelism, on the other hand, refers to processing multiple pieces (technically, batches) of data through multiple replicas of the same network located on different hardware/devices. Unlike model parallelism, each replica is an entire network and not just a part of it. This strategy, as you might have guessed, scales well with an increasing amount of data. But, as the entire network has to reside on a single device, it cannot help models with a high memory footprint. The illustration below should make it clear.
Practically, data parallelism is more popular and is frequently employed in large organizations for executing production-quality DL training algorithms. So, in this tutorial, we will focus on data parallelism.
PyTorch offers a very elegant and easy-to-use API as an interface to the underlying MPI library written in C. For the MPI backend, PyTorch needs to be compiled from source and linked against the Intel MPI installed on the system. We will now see the basic usage of torch.distributed and how to execute it.
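Before running the examples, it is worth verifying that the PyTorch build in use actually exposes the MPI backend (a build that was not linked against an MPI library will not). A quick sanity check:

import torch.distributed as dist

# Prints True only if this PyTorch build was compiled with MPI support
print(dist.is_mpi_available())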
# filename 'ptdist.py'
import torch
import torch.distributed as dist

def main(rank, world):
    if rank == 0:
        x = torch.tensor([1., -1.])  # Tensor of interest
        dist.send(x, dst=1)          # Blocking send to rank 1
        print('Rank-0 has sent the following tensor to Rank-1')
        print(x)
    else:
        z = torch.tensor([0., 0.])   # A placeholder for receiving the tensor
        dist.recv(z, src=0)          # Blocking receive from rank 0
        print('Rank-1 has received the following tensor from Rank-0')
        print(z)

if __name__ == '__main__':
    dist.init_process_group(backend='mpi')  # Initialize the distributed runtime over MPI
    main(dist.get_rank(), dist.get_world_size())
cluster@miriad2a:~/nfs$ mpiexec -n 2 -ppn 1 -hosts miriad2a,miriad2b python ptdist.py
Rank-0 has sent the following tensor to Rank-1
tensor([ 1., -1.])
Rank-1 has received the following tensor from Rank-0
tensor([ 1., -1.])
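A quick note on the launch command: with the Hydra launcher shipped with Intel MPI, -n 2 requests two processes in total, -ppn 1 places one process per node, and -hosts lists the machines to launch on. Each spawned process receives a unique rank (0 and 1 here), which is exactly what the if rank == 0 branch in ptdist.py keys on.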
What we saw in the last section is an example of peer-to-peer communication, where rank(s) send data to specific rank(s) in a given context. Although this is useful, as it provides the user with granular control over the communication, there exist other standard and frequently used patterns of communication called collectives. Below is the description of one particular collective (known as all-reduce), which is of interest to us in the context of the Synchronous SGD algorithm.
All-reduce is a form of synchronized communication in which a given reduction operation is applied to values held by all the ranks and the reduced result is made available to all of them. The figure below illustrates the idea (using summation as the reduction operation).
def main(rank, world):
    if rank == 0:
        x = torch.tensor([1.])
    elif rank == 1:
        x = torch.tensor([2.])
    elif rank == 2:
        x = torch.tensor([-3.])

    dist.all_reduce(x, op=dist.reduce_op.SUM)  # Sum x across all ranks, in place
    print('Rank {} has {}'.format(rank, x))

if __name__ == '__main__':
    dist.init_process_group(backend='mpi')
    main(dist.get_rank(), dist.get_world_size())
cluster@miriad2a:~/nfs$ mpiexec -n 3 -ppn 1 -hosts miriad2a,miriad2b,miriad2c python ptdist.py
Rank 1 has tensor([0.])
Rank 0 has tensor([0.])
Rank 2 has tensor([0.])
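Note that 1 + 2 + (-3) = 0, which is why every rank ends up printing tensor([0.]): the summed result is written back in place and is identical on all three processes.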
It is assumed that the reader is familiar with the standard Stochastic Gradient Descent (SGD) algorithm which is often used to train deep learning models. We will now see a variant of SGD (called Synchronous SGD) that makes use of the All-reduce collective to scale up. To lay the foundation, let’s start with the mathematical formulation of standard SGD.
$$\theta_{new} = \theta_{old} - \lambda \nabla_\theta \sum_{D} Loss(X, y)$$
where D is a set (mini-batch) of samples, θ is the set of all parameters, λ is the learning rate and Loss(X, y) is the loss computed for a sample (X, y) in D.
The core trick that Synchronous SGD relies on is splitting the summation in the update rule into smaller sums over subsets of the mini-batch. D is split into R subsets D₁, D₂, … (preferably with the same number of samples in each) such that
$$D = \bigcup_{r=1}^{R} D_r$$
Splitting the summation in the standard SGD update formula leads to
$$\theta_{new} = \theta_{old} - \lambda \nabla_\theta \left[ \sum_{D_1} Loss(X, y) + \sum_{D_2} Loss(X, y) + \dots + \sum_{D_R} Loss(X, y) \right]$$
Now, as the gradient operator is distributive over the summation operator, we get
$$\theta_{new} = \theta_{old} - \lambda \left[ \nabla_\theta \sum_{D_1} Loss(X, y) + \nabla_\theta \sum_{D_2} Loss(X, y) + \dots + \nabla_\theta \sum_{D_R} Loss(X, y) \right]$$
Have a look at those individual gradient terms (inside the square brackets) in the above equation. They can now be computed independently and summed up to recover the original gradient without any loss or approximation. This is where data parallelism comes into the picture. Here is the whole story:

1. Split the entire dataset into R (roughly equal) chunks D₁, …, D_R and hand one chunk to each rank.

2. Keep an identical replica of the model on every rank; each rank computes, on its own chunk, the gradient

$$\nabla_\theta \sum_{(X, y) \in D_r} Loss(X, y)$$

3. Sum up the gradients of all R ranks and make the resulting sum available to every one of them.
The last point is exactly the all-reduce algorithm. So, all-reduce must be executed every time all ranks have computed one gradient (on a mini-batch of size B) on their own portion of the dataset. A subtle point to note here is that summing up the gradients (on mini-batches of size B) from all R ranks leads to an effective batch size of
$$B_{effective} = R \times B$$
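For example, with R = 3 replicas each computing gradients on mini-batches of B = 64 samples, one synchronized update is mathematically equivalent to a standard SGD update on a batch of 3 × 64 = 192 samples.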
The following are the crucial parts of the implementation (the boilerplate code is not shown here; a possible shape of it is sketched at the end of this section).
model = LeNet()
# first synchronization of initial weights
sync_initial_weights(model, rank, world_size)

optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.85)

model.train()
for epoch in range(1, epochs + 1):
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        # The all-reduce on gradients
        sync_gradients(model, rank, world_size)
        optimizer.step()
def sync_initial_weights(model, rank, world_size):
    for param in model.parameters():
        if rank == 0:
            # Rank 0 sends its own weights
            # to all its siblings (1 to world_size - 1)
            for sibling in range(1, world_size):
                dist.send(param.data, dst=sibling)
        else:
            # Siblings must receive the parameters
            dist.recv(param.data, src=0)
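As a side note, the same initial synchronization could arguably be written more compactly with the broadcast collective, which copies a tensor from one source rank to every other rank in the group. A sketch with the same signature, kept only for comparison:

def sync_initial_weights(model, rank, world_size):
    for param in model.parameters():
        # Every rank calls broadcast; rank 0's data overwrites everyone else's
        dist.broadcast(param.data, src=0)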
def sync_gradients(model, rank, world_size):
    for param in model.parameters():
        # Sum this parameter's gradient across all ranks, in place
        dist.all_reduce(param.grad.data, op=dist.reduce_op.SUM)
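For completeness, here is one possible shape of the boilerplate that the listing above leaves out. The dataset (MNIST via torchvision), the batch size, the number of epochs and the use of torch.utils.data.distributed.DistributedSampler are assumptions made for illustration; the sampler simply ensures that every rank iterates over its own disjoint shard of the training data.

import torch.distributed as dist
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler
from torchvision import datasets, transforms

if __name__ == '__main__':
    dist.init_process_group(backend='mpi')
    rank, world_size = dist.get_rank(), dist.get_world_size()

    # Each rank sees only its own portion of the dataset (assumed: MNIST)
    dataset = datasets.MNIST('./data', train=True, download=True,
                             transform=transforms.ToTensor())
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    train_loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    epochs = 10  # assumed value
    # ... the training loop and the two sync_* helpers shown above go here ...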
Now a question might arise, “How do we ensure that independent updates will remain in sync?”.
If we take a look at the update equation for the first update
$$\theta_{first\ update} \leftarrow \theta_{initial} - \lambda \nabla_\theta \sum_{D} Loss(X, y)$$
The explicit synchronization of the initial weights (sync_initial_weights) and of the gradients after every backward pass (sync_gradients) ensures that the two terms on the right-hand side are identical on every rank. For obvious reasons, a linear combination of them will also be in sync (λ is a constant). A similar argument holds for all consecutive updates.
The biggest bottleneck for any distributed algorithm is synchronization. Distributed algorithms are beneficial only if the synchronization time is significantly less than the computation time. Let's make a simple comparison between standard and Synchronous SGD to see when the latter is beneficial.
Definitions: let's assume the size of the entire dataset is N, and that processing one mini-batch of size B through the network takes time Tcomp. In the distributed case, the time taken for one all-reduce synchronization is Tsync, and there are R replicas. For the standard (non-distributed) setting, the time taken for one epoch is
$$T_{epoch} = (\text{no. of mini-batches}) \times (\text{time per mini-batch}) = \frac{N}{B} \times T_{comp}$$
For the distributed setting with R replicas, each rank processes N/(R × B) mini-batches per epoch and pays the synchronization cost on each of them:

$$T_{epoch} = \frac{N}{R \times B} \times (T_{comp} + T_{sync})$$
So, for the distributed setting to be significantly beneficial over the non-distributed one, we need to have
$$\frac{N}{R B}(T_{comp} + T_{sync}) < \frac{N}{B} T_{comp}$$
or, equivalently,
$$\frac{T_{sync}}{T_{comp}} < (R - 1)$$
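As a concrete reading of this inequality: with R = 4 replicas, the distributed epoch is faster as long as Tsync / Tcomp < 3, i.e., one all-reduce must cost less than three times the computation of a single mini-batch.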
The three factors in the above inequality (the synchronization time, the computation time per mini-batch and the number of replicas) can be tweaked to extract more and more benefit out of the distributed algorithm.
Hopefully, this article was clear enough to convey the central idea of distributed computing in the context of deep learning. Although Synchronous SGD is quite popular, other distributed algorithms are also used quite frequently (like Asynchronous SGD and its variants). What is more important is being able to think about deep learning methods in a parallel manner. Please realize that not all algorithms can be parallelized out of the box; some require approximations to be made, which break the theoretical guarantees given by the original algorithms. It is up to the algorithm designer/implementer to handle these approximations efficiently.