Zhen Zhu , Mengde Xu , Song Bai , Tengteng Huang , Xiang Bai
Huazhong University of Science and Technology, University of Oxford
[GitHub]: https://github.com/MendelXu/ANN
[paper]: https://arxiv.org/pdf/1908.07678.pdf
The writing of this paper is excellent.
[Non-Local Attention Series]
Non-local neural networks
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond [my CSDN]
Asymmetric Non-local Neural Networks for Semantic Segmentation [my CSDN]
Efficient Attention: Attention with Linear Complexities [my CSDN]
CCNet: Criss-Cross Attention for Semantic Segmentation [my CSDN]
Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining [my CSDN]
Image Restoration via Residual Non-local Attention Networks [my CSDN]
Table of Contents
Asymmetric Non-local Neural Networks for Semantic Segmentation
Abstract
Introduction
Related Work
Asymmetric Non-local Neural Network
Revisiting Non-local Block
Asymmetric Pyramid Non-local Block
Asymmetric Fusion Non-local Block
Network Architecture
Experiments
Datasets and Evaluation Metrics
Implementation Details
Comparisons with Other Methods
Conclusion
The non-local module is a particularly useful technique for semantic segmentation, yet it is often criticized for its prohibitive computation and GPU memory occupation.
[The paper's motivation, i.e., the problem it sets out to solve]
In this paper, we present the Asymmetric Non-local Neural Network for semantic segmentation, which has two prominent components: the Asymmetric Pyramid Non-local Block (APNB) and the Asymmetric Fusion Non-local Block (AFNB). APNB embeds a pyramid sampling module into the non-local block to largely reduce the computation and memory consumption without sacrificing the performance. AFNB is adapted from APNB to fuse the features of different levels under a sufficient consideration of long range dependencies and thus considerably improves the performance.
[Introduction of the network architecture: concise and clear]
Extensive experiments on semantic segmentation benchmarks demonstrate the effectiveness and efficiency of our work. In particular, we report the state-of-the-art performance of 81.3 mIoU on the Cityscapes test set. For a 256 × 128 input, APNB is around 6 times faster than a non-local block on GPU while 28 times smaller in GPU running memory occupation.
[Finally, the experimental results echo the motivation, showing that the proposed method indeed solves the problem the motivation raised]
Some recent studies [20, 33, 47] indicate that performance can be improved by making sufficient use of long range dependencies. However, models that rely solely on convolutions exhibit limited ability in capturing these long range dependencies. A possible reason is that the receptive field of a single convolutional layer is inadequate to cover correlated areas. Choosing a big kernel or composing a very deep network can enlarge the receptive field, but such strategies require extensive computation and parameters, and are thus very inefficient [44]. Consequently, several works [33, 47] resort to global operations like non-local means [2] and spatial pyramid pooling [12, 16].
Conventional CNNs cannot efficiently capture long range dependencies; non-local operations and spatial pyramid pooling can effectively address this.
In [33], Wang et al. combined CNNs and the traditional non-local means [2] into a network module named the non-local block, in order to leverage features from all locations in an image. This module improves the performance of existing methods [33]. However, its prohibitive computational cost and vast GPU memory occupation hinder its usage in many real applications. The architecture of a common non-local block [33] is depicted in Fig. 1(a). The block first calculates the similarities of all locations between each other, requiring a matrix multiplication of computational complexity $O(\hat{C}N^2)$, given an input feature map with $N = H \times W$ spatial locations and $\hat{C}$ embedded channels. It then requires another matrix multiplication of computational complexity $O(\hat{C}N^2)$ to gather the influence of all locations to themselves. Given the high complexity brought by these matrix multiplications, we investigate in this work whether there are efficient ways to solve this without sacrificing performance.
Describes the structure and complexity of the non-local block.
These two paragraphs state the problem.
Figure 1: Architecture of a standard non-local block (a) and the asymmetric non-local block (b). $N = H \times W$ while $S \ll N$.
We notice that as long as the outputs of the key branch and the value branch hold the same size, the output size of the non-local block remains unchanged. Considering this, if we could sample only a few representative points from the key branch and the value branch, the time complexity might be significantly decreased without sacrificing the performance. This motivation is illustrated in Fig. 1, where the large value N in the key and value branches is changed to a much smaller value S (from (a) to (b)).
The paper's observation and motivation: non-local computes the correlation between every point and all other points (i.e., every pixel gets its own full attention map), whereas this paper argues that is unnecessary: each pixel only needs the attention between itself and a few pooled local regions.
In this paper, we propose a simple yet effective non-local module called the Asymmetric Pyramid Non-local Block (APNB) to decrease the computation and GPU memory consumption of the standard non-local module [33], with applications to semantic segmentation. Motivated by the spatial pyramid pooling strategy [12, 16, 47], we propose to embed a pyramid sampling module into non-local blocks, which largely reduces the computational overhead of the matrix multiplications yet still provides substantial semantic feature statistics. This spirit is also related to the sub-sampling tricks of [33] (e.g., max pooling). Our experiments suggest that APNB yields much better performance than those sub-sampling tricks while still decently decreasing computation. To better illustrate the boosted efficiency, we compare the GPU times of APNB and a standard non-local block in Fig. 2, averaging the running time of 10 different runs with the same configuration. APNB largely reduces the time spent on matrix multiplications, being nearly 6 times faster than a non-local block.
Describes how APNB is constructed (non-local + spatial pyramid pooling): low complexity, high performance.
Besides, we also adapt APNB to fuse the features of different stages of a deep network, which brings a considerable improvement over the baseline model. We call the adapted block the Asymmetric Fusion Non-local Block (AFNB). AFNB calculates the correlations between every pixel of the low-level and high-level feature maps, yielding a fused feature with long range interactions. Our network is built on a standard ResNet-FCN model by integrating APNB and AFNB.
Describes AFNB: it fuses low-level and high-level features, producing fused features with long range interactions.
Recent advances focus on exploring the context information and can be roughly categorized into five directions:
To mine context information, existing methods fall roughly into five research directions:
Encoder-Decoder: U-Net
Conditional Random Field: Deeplab
Different Convolutions: Dilated Conv.
Spatial Pyramid Pooling: PSPNet, Atrous Spatial Pyramid Pooling layer (ASPP)
Non-local Network: GCNet (CVPR 2019), NLNet
Different from these works, our network uniquely incorporates pyramid sampling strategies into non-local blocks to capture the semantic statistics of different scales with only a minor computational budget, while maintaining the excellent performance of the original non-local modules.
While APNB aims to decrease the computational overhead of non-local blocks, AFNB improves the learning capacity of non-local blocks thereby improving the segmentation performance.
APNB aims to reduce the computational overhead of non-local blocks, while AFNB improves their learning capacity and thereby the segmentation performance.
I won't go over the principle of non-local blocks again; here are just the formulas needed later. Flattening the input feature map to $X \in \mathbb{R}^{C \times N}$ with $N = H \times W$, 1×1 convolutions produce the query, key, and value embeddings $\phi = W_\phi(X)$, $\theta = W_\theta(X)$, $\gamma = W_\gamma(X)$, each in $\mathbb{R}^{\hat{C} \times N}$. The similarity matrix and the attention output are

$$V = \phi^\top \theta \qquad (2)$$

$$O = \vec{V}\, \gamma^\top, \quad \text{with } \vec{V} = f(V) \qquad (4)$$

The normalizing function $f$ can take the form of softmax, rescaling, or none; choosing softmax gives the Embedded Gaussian version.
By inspecting the general computing flow of a non-local block, one can clearly see that Eq. (2) and Eq. (4) dominate the computation. The time complexities of the two matrix multiplications are both $O(\hat{C}N^2)$. In semantic segmentation, the output of the network usually has a large resolution to retain detailed semantic features [6, 47]. That means N is large (for example, in our training phase N = 96 × 96 = 9216). Hence, the large matrix multiplications are the main cause of the inefficiency of a non-local block (see our statistics in Fig. 2). A more straightforward pipeline is given as

$$O = f(\phi^\top \theta)\, \gamma^\top$$
This paragraph is easy to follow: it details why the complexity of the non-local block is so high.
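To make the flow above concrete, here is a minimal PyTorch sketch of a standard non-local block (my own illustration following Eqs. (2)-(4), not the authors' released code; channel sizes are made up):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """Standard non-local block [33] with the paper's phi/theta/gamma naming."""
    def __init__(self, in_channels=2048, key_channels=256):
        super().__init__()
        self.w_phi = nn.Conv2d(in_channels, key_channels, 1)    # query embedding
        self.w_theta = nn.Conv2d(in_channels, key_channels, 1)  # key embedding
        self.w_gamma = nn.Conv2d(in_channels, key_channels, 1)  # value embedding
        self.w_out = nn.Conv2d(key_channels, in_channels, 1)

    def forward(self, x):
        B, _, H, W = x.shape
        N = H * W
        phi = self.w_phi(x).view(B, -1, N)         # B x C_hat x N
        theta = self.w_theta(x).view(B, -1, N)     # B x C_hat x N
        gamma = self.w_gamma(x).view(B, -1, N)     # B x C_hat x N
        V = torch.bmm(phi.transpose(1, 2), theta)  # Eq. (2): N x N, costs O(C_hat*N^2)
        V = F.softmax(V, dim=-1)                   # Embedded Gaussian normalizer f
        O = torch.bmm(V, gamma.transpose(1, 2))    # Eq. (4): again O(C_hat*N^2)
        O = O.transpose(1, 2).reshape(B, -1, H, W)
        return self.w_out(O) + x                   # residual fusion as in [33]
```

With N = 9216 during training, each of the two `bmm` calls touches an N × N matrix, which is exactly the bottleneck the paper targets.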
We hold a key yet intuitive observation: by changing N to another number S with $S \ll N$, the output size will remain the same, as

$$O_{(N \times \hat{C})} = f\!\big(\phi^\top_{(N \times \hat{C})}\, \theta_{(\hat{C} \times S)}\big)\, \gamma^\top_{(S \times \hat{C})}$$
Returning to the design of the non-local block, changing N to a small number S is equivalent to sampling several representative points from θ and γ instead of feeding all the spatial points, as illustrated in Fig. 1. Consequently, the computational complexity could be considerably decreased.
This is also easy to follow: shrink N down to S, with S far smaller than N. So how is this operation implemented?
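A quick toy check of this observation, shapes only (all numbers illustrative):

```python
import torch

N, S, C_hat = 96 * 96, 110, 256   # N from the training crop; S: anchor count
phi = torch.randn(C_hat, N)       # query: all N positions
theta_p = torch.randn(C_hat, S)   # key: only S sampled anchors
gamma_p = torch.randn(C_hat, S)   # value: only S sampled anchors

O = (phi.t() @ theta_p).softmax(dim=-1) @ gamma_p.t()  # (N x S) @ (S x C_hat)
print(O.shape)  # torch.Size([9216, 256]) -- still N x C_hat, unchanged
```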
Based on the above observation, we propose to add sampling modules $P_\theta$ and $P_\gamma$ after θ and γ to sample several sparse anchor points, denoted as $\theta_P \in \mathbb{R}^{\hat{C} \times S}$ and $\gamma_P \in \mathbb{R}^{\hat{C} \times S}$, where S is the number of sampled anchor points. Mathematically, this is computed by

$$\theta_P = P_\theta(\theta), \qquad \gamma_P = P_\gamma(\gamma) \qquad (8)$$
The similarity matrix $V_P$ between $\phi$ and the anchor points $\theta_P$ is thus calculated by

$$V_P = \phi^\top \theta_P$$
Note that $V_P$ is an asymmetric matrix of size N × S. $V_P$ then goes through the same normalizing function $f$ as in a standard non-local block, giving the unified similarity matrix $\vec{V}_P$. The attention output is acquired by

$$O_P = \vec{V}_P\, \gamma_P^\top$$
where the output $O_P$ has the same size as that of Eq. (4). Following non-local blocks, the final output is given as

$$Y_P = \operatorname{cat}\big(W_o(O_P^\top),\, X\big)$$

(the attention output is transformed by a 1×1 convolution $W_o$ and concatenated with the input X, matching the concatenation described in the Network Architecture section).
The time complexity of such an asymmetric matrix multiplication is only $O(\hat{C}NS)$, significantly lower than the $O(\hat{C}N^2)$ of a standard non-local block. Ideally, S should be much smaller than N; however, when S is small it is hard to ensure that the performance does not drop too much.
This part details how pyramid pooling reduces the complexity. In a non-local block, the key and value are Ĉ×N tensors obtained by 1×1 convolutions; here they are additionally sampled by pyramid pooling down to Ĉ×S. (Note: the sampling is applied to the key and value branches, while the query still covers all N pixels.) In other words, non-local matches every query pixel against all N positions, while APNB matches it against only S sparse anchor points. The computational complexity thus drops from $O(\hat{C}N^2)$ to $O(\hat{C}NS)$.
The authors also note that S must not be too small.
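In code, the asymmetric version only swaps the key/value tensors for their sampled counterparts; a hedged sketch, with the sampling module `sample` kept abstract (Eq. (8) instantiates it with pyramid pooling):

```python
import torch
import torch.nn.functional as F

def asymmetric_attention(phi, theta, gamma, sample):
    """phi/theta/gamma: B x C_hat x H x W; sample: B x C_hat x H x W -> B x C_hat x S."""
    B, C_hat, H, W = phi.shape
    phi = phi.view(B, C_hat, -1)                   # query keeps all N = H*W points
    theta_p = sample(theta)                        # Eq. (8): B x C_hat x S anchors
    gamma_p = sample(gamma)                        # Eq. (8)
    V_p = torch.bmm(phi.transpose(1, 2), theta_p)  # asymmetric N x S similarity, O(C_hat*N*S)
    O_p = torch.bmm(F.softmax(V_p, dim=-1),        # same normalizer f as before
                    gamma_p.transpose(1, 2))       # B x N x C_hat, O(C_hat*N*S)
    return O_p.transpose(1, 2).reshape(B, C_hat, H, W)
```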
The next part explains in detail how the pyramid pooling is actually performed.
As discovered by previous works [16, 47], global and multi-scale representations are useful for categorizing scene semantics. Such representations can be comprehensively carved by Spatial Pyramid Pooling [16], which contains several pooling layers with different output sizes in parallel. In addition to this virtue, spatial pyramid pooling is also parameter-free and very efficient. Therefore, we embed pyramid pooling in the non-local block to enhance the global representations while reducing the computational overhead.
By doing so, we now arrive at the final formulation of the Asymmetric Pyramid Non-local Block (APNB), as given in Fig. 3. As can be seen, our APNB derives from the design of a standard non-local block [33]. A vital change is to add a spatial pyramid pooling module after θ and γ respectively to sample representative anchors. This sampling process is clearly depicted in Fig. 4, where several pooling layers are applied after θ or γ and the four pooling results are then flattened and concatenated to serve as the input to the next layer. We denote the spatial pyramid pooling modules as $P^n_\theta$ and $P^n_\gamma$, where the superscript n means the width (or height) of the output size of the pooling layer (empirically, the width is equal to the height). In our model, we set $n \in \{1, 3, 6, 8\}$. Then the total number of anchor points is

$$S = \sum_{n \in \{1, 3, 6, 8\}} n^2 = 110$$
Figure 3: Overview of the proposed Asymmetric Non-local Neural Network. In our implementation, the key branch and the value branch in APNB share the same 1×1 convolution and sampling module, which decreases the number of parameters and computation without sacrificing the performance.
Figure 4: Demonstration of the pyramid max or average sampling process.
Concrete details of the SPP operation. Taking H = 128 and W = 256 as an example, the method reduces the complexity by a factor of (256 × 128)/110 ≈ 298 compared with non-local.
Note: in APNB, the authors share the 1×1 convolution and the pyramid pooling between the key and value branches, i.e., both branches use the same set of matrices.
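The sampling module itself is tiny; a minimal sketch under the paper's settings $n \in \{1, 3, 6, 8\}$ (the average/max choice is ablated in the paper, so both are exposed here):

```python
import torch
import torch.nn.functional as F

def pyramid_sample(x, sizes=(1, 3, 6, 8), mode="avg"):
    """x: B x C_hat x H x W -> B x C_hat x S, with S = 1 + 9 + 36 + 64 = 110."""
    pool = F.adaptive_avg_pool2d if mode == "avg" else F.adaptive_max_pool2d
    anchors = [pool(x, n).flatten(2) for n in sizes]  # each: B x C_hat x n^2
    return torch.cat(anchors, dim=2)                  # concatenated anchors
```

Since key and value share the same 1×1 convolution and sampling module, a call can look like `asymmetric_attention(phi, theta, theta, pyramid_sample)` with the sketch above.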
Fusing features of different levels is helpful for semantic segmentation and object tracking, as hinted in [16, 18, 26, 41, 46, 51]. Common fusing operations such as addition or concatenation are conducted in a pixel-wise and local manner. We provide an alternative that leverages long range dependencies through a non-local block to fuse multi-level features, called the Fusion Non-local Block (FNB). A standard non-local block has only one input source, while FNB has two: a high-level feature map $X_h \in \mathbb{R}^{C_h \times N_h}$ and a low-level feature map $X_l \in \mathbb{R}^{C_l \times N_l}$, where $N_h$ and $N_l$ are the numbers of spatial locations of $X_h$ and $X_l$, and $C_h$ and $C_l$ are their channel numbers. Likewise, 1×1 convolutions $W_\phi$, $W_\theta$ and $W_\gamma$ transform the inputs to embeddings: the query $\phi = W_\phi(X_h)$ comes from the high-level map, while the key $\theta = W_\theta(X_l)$ and value $\gamma = W_\gamma(X_l)$ come from the low-level map. The similarity matrix $V = \phi^\top \theta$ is then normalized and used to gather γ, exactly as in a standard non-local block; adding the pyramid sampling of APNB to θ and γ yields the Asymmetric Fusion Non-local Block (AFNB). The output reflects the bonus of $X_l$ to $X_h$, carefully selected from all locations in $X_l$.
The details of AFNB; Fig. 3 already makes them clear.
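For completeness, a hedged AFNB sketch reusing `pyramid_sample` from above (module and channel names are mine, not the authors'):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AFNBSketch(nn.Module):
    """Query from the high-level map X_h; sampled key/value from the low-level X_l."""
    def __init__(self, c_high, c_low, c_hat=256):
        super().__init__()
        self.w_phi = nn.Conv2d(c_high, c_hat, 1)   # query embedding of X_h
        self.w_theta = nn.Conv2d(c_low, c_hat, 1)  # key embedding of X_l
        self.w_gamma = nn.Conv2d(c_low, c_hat, 1)  # value embedding of X_l

    def forward(self, x_high, x_low):
        B, _, H, W = x_high.shape
        phi = self.w_phi(x_high).flatten(2)            # B x C_hat x N_h
        theta_p = pyramid_sample(self.w_theta(x_low))  # B x C_hat x S
        gamma_p = pyramid_sample(self.w_gamma(x_low))  # B x C_hat x S
        V_p = torch.bmm(phi.transpose(1, 2), theta_p)  # B x N_h x S
        O_p = torch.bmm(F.softmax(V_p, dim=-1), gamma_p.transpose(1, 2))
        return O_p.transpose(1, 2).reshape(B, -1, H, W)  # the 'bonus' of X_l to X_h
```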
The overall architecture of our network is depicted in Fig. 3. We choose ResNet-101 [13] as our backbone network, following the choice of most previous works [38, 47, 48]. We remove the last two down-sampling operations and use dilated convolutions instead, so that the feature maps of the last two stages stay at 1/8 of the input image size; consequently, all the feature maps in the last three stages have the same spatial size. According to our experimental trials, we fuse the features of Stage4 and Stage5 using AFNB. The fused features are thereupon concatenated with the feature maps after Stage5, avoiding the situation where AFNB produces inaccurate strengthened features, particularly at the beginning of training, and degrades the overall performance. Such features, full of rich long range cues from different feature levels, serve as the input to APNB, which then helps to discover the correlations among pixels. As done for AFNB, the output of APNB is also concatenated with its input source. Note that in our implementation of APNB, the key and value branches ($W_\theta$/$W_\gamma$ together with their sampling modules) share parameters in order to save parameters and computation, following the design of [42]; this does not decrease the performance of APNB. Finally, a classifier follows to produce channel-wise semantic maps that later receive their supervision from the ground truth maps. Note that we also add another supervision to Stage4 following the settings of [47], as it is beneficial to the performance.
Network architecture details (a sketch of the overall forward pass follows this list):
1. ResNet-101 backbone.
2. Stages 4 and 5 keep the resolution unchanged, i.e., a dilated ResNet [see my blog post Dilated Residual Networks].
3. AFNB fuses Stage4 and Stage5.
4. APNB follows AFNB, connected in a residual (concatenation) style.
5. The key and value branches share parameters [42: OCNet: Object context network for scene parsing].
6. An additional supervision is added at Stage4 [47: Pyramid scene parsing network].
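A pseudocode-level sketch of the data flow this list describes (all function names here are mine, not from the released code):

```python
def ann_forward(image):
    s3 = backbone_stage3(image)        # dilated ResNet-101: stages 4-5 stay at 1/8
    s4 = backbone_stage4(s3)
    s5 = backbone_stage5(s4)
    fused = afnb(x_high=s5, x_low=s4)  # long-range fusion of Stage4 and Stage5
    feat = concat(fused, s5)           # guard against weak AFNB features early in training
    feat = concat(apnb(feat), feat)    # APNB output is concatenated with its input
    main_out = classifier(feat)        # main supervision (with hard pixel mining)
    aux_out = aux_classifier(s4)       # extra supervision at Stage4, as in [47]
    return main_out, aux_out
```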
Datasets: Cityscapes [9], ADE20K [50] and PASCAL Context [21].
Evaluation Metric: Mean IoU (mean of class-wise intersection over union).
Training Objectives
Following [47, Pyramid Scene Parsing Network], our model has two supervisions: one after the final output of the model and another at the output layer of Stage4. For $L_{final}$, we perform online hard pixel mining, which excels at coping with difficult cases.
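The mining rule itself is not spelled out in this excerpt; below is a common OHEM-style cross-entropy sketch for $L_{final}$ (the threshold and min-kept values are my assumptions, not the paper's):

```python
import torch
import torch.nn.functional as F

def ohem_cross_entropy(logits, target, thresh=0.7, min_kept=100_000, ignore=255):
    """logits: B x K x H x W, target: B x H x W. Average loss over hard pixels only."""
    pixel_loss = F.cross_entropy(logits, target, ignore_index=ignore,
                                 reduction="none").flatten()
    valid = target.flatten() != ignore
    safe_target = target.clone()
    safe_target[target == ignore] = 0              # placeholder class for gather
    true_prob = F.softmax(logits, dim=1).gather(
        1, safe_target.unsqueeze(1)).flatten()     # p(true class) per pixel
    hard = (true_prob < thresh) & valid            # low-confidence -> hard pixel
    if hard.sum() >= min_kept:
        return pixel_loss[hard].mean()
    k = min(min_kept, int(valid.sum()))            # fall back to the k hardest pixels
    return pixel_loss[valid].topk(k).values.mean()
```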
1. Efficiency Comparison with Non-local Block
2. Performance Comparisons
3. Ablation Study
Efficacy of the APNB and AFNB
Selection of Sampling Methods
Influence of the Number of Anchor Points
Personal summary:
Two highlights of this paper:
1. Difference from non-local:
Non-local: correlations between every point (query) and every other point; pixel-wise.
APNB: relations between every point and a few pooled regions; patch-wise.
2. Fusing features of different levels: AFNB.