[DIN Paper Close Reading] Deep Interest Network for Click-Through Rate Prediction

Table of Contents

  • Paper
  • Abstract
  • 1. Introduction
  • 2. Related Work
  • 3. System Overview
    • 3.1 Characteristics of User Behavior Data
    • 3.2 Feature Representation
    • 3.3 Metrics
  • 4. Model Architecture
    • 4.1 Base Model
    • 4.2 Deep Interest Network Design
    • 4.3 Data Dependent Activation Function
    • 4.4 Adaptive Regularization Technique
  • 5. Implementation
  • 6. Experiments
    • 6.1 Visualization of DIN
    • 6.2 Regularization
    • 6.3 Comparison of DIN and the Base Model
  • 7. Conclusions

Paper

  • This post is shared for study and discussion only; the copyright of the paper belongs to the original authors:
  • arXiv: https://arxiv.org/abs/1706.06978v2
  • APA: Zhou, G., Gai, K., Zhu, X., Song, C., Fan, Y., & Zhu, H., et al. (2018). Deep Interest Network for Click-Through Rate Prediction (pp. 1059-1068).

Abstract

Better extracting users’ interests by exploiting the rich historical behavior data is crucial for building the click-through rate (CTR) prediction model in the online advertising systems of the e-commerce industry.

There are two key observations on user behavior data:

i) diversity. Users are interested in different kinds of goods when visiting an e-commerce site.
ii) local activation. Whether users click a good or not depends only on part of their related historical behavior.

However, most traditional CTR models fail to capture these structures of behavior data.

In this paper, we introduce a newly proposed model, Deep Interest Network (DIN), which is developed and deployed in the display advertising system in Alibaba.

DIN represents users’ diverse interests with an interest distribution and designs an attention-like network structure to locally activate the related interests according to the candidate ad, which is proven effective and significantly outperforms traditional models.

The overfitting problem is easy to encounter when training such an industrial deep network with large-scale sparse inputs. We study this problem carefully and propose a useful adaptive regularization technique.

1. Introduction

The display advertising business brings billions of dollars of income yearly in Alibaba. In the cost-per-click (CPC) advertising system, advertisements are ranked by eCPM (effective cost per mille), which is the product of the bid price and the CTR (click-through rate). Hence, the performance of the CTR prediction model has a direct impact on the final revenue and plays a key role in the advertising system.
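
As a worked example of this ranking rule (the "per mille" scaling constant is implied by the name but does not change the ordering):

$$\mathrm{eCPM}_i = \mathrm{bid}_i \times \mathrm{pCTR}_i$$

An ad with bid 2.0 and predicted CTR 0.03 (eCPM $\propto$ 0.06) therefore outranks an ad with bid 5.0 and predicted CTR 0.01 (eCPM $\propto$ 0.05), which is why better CTR estimates translate directly into revenue.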

Driven by the success of deep learning in image recognition, computer vision and natural language processing, a number of deep learning based methods have been proposed for the CTR prediction task [1, 2, 3, 4]. These methods usually first employ an embedding layer on the input, mapping the original large-scale sparse id features to distributed representations, then add fully connected layers (in other words, multilayer perceptrons, MLPs) to automatically learn the nonlinear relations among features. Compared with the traditional, commonly used logistic regression model [5, 6], MLPs reduce a lot of feature engineering work, which is time- and manpower-consuming in industrial applications. MLPs have now become a popular model structure for the CTR prediction problem. However, in fields with rich internet-scale user behavior data, such as online advertising and recommendation systems in the e-commerce industry, these MLP models often lack deep understanding and exploitation of the specific structures of behavior data, and thus leave room for further improvement.

To summarize the structures of user behavior data collected in the display advertising system in Alibaba, we report two key observations:

  • Diversity. Users are interested in different kinds of goods when visiting an e-commerce site. For example, a young mother may be interested in T-shirts, leather handbags, shoes, earrings, children’s coats, etc. at the same time.
  • Local activation. Due to the diversity of users’ interests, only a part of users’ historical behaviors contribute to each click. For example, a swimmer will click recommended goggles mostly because of her purchase of a bathing suit, not the books on her last week’s shopping list.

In this paper, we introduce a newly proposed model, named Deep Interest Network (DIN), which is developed and deployed in the display advertising system in Alibaba. Inspired by the attention mechanism used in machine translation models [7], DIN represents users’ diverse interests with an interest distribution and designs an attention-like network structure to locally activate the related interests according to the candidate ad. We demonstrate this phenomenon in the experiments of Section 6.1. Behaviors with higher relevance to the candidate ad get higher attention scores and dominate the prediction. Experiments on Alibaba’s production CTR prediction datasets prove that the proposed DIN model significantly outperforms MLPs under the GAUC (group weighted AUC, see Section 3.3) metric.

The overfitting problem is easy to encounter when training such an industrial deep network with large-scale sparse inputs. Experimentally, we show that with the addition of fine-grained user behavior features (e.g., good id), the deep network models easily fall into the overfitting trap, causing model performance to drop rapidly. In this paper, we study this problem carefully and propose a useful adaptive regularization technique, which is proven effective for improving network convergence in our application. DIN is implemented on a multi-GPU distributed training platform named X-Deep Learning (XDL), which supports model-parallelism and data-parallelism. To utilize the structural property of internet behavior data, we employ the common feature trick proposed in [8] to reduce storage and computation cost. Due to the high performance and flexibility of the XDL platform, we accelerate the training process by about 10 times and optimize hyperparameters automatically with high tuning efficiency.

The contributions of the paper are summarized as follows:

  • We study and summarize two key structures of internet-scale user behavior data in industrial e-commerce applications: diversity and local activation.
  • We propose the Deep Interest Network (DIN), which can better capture the specific structures of behavior data and bring improvements in model performance.
  • We introduce a useful adaptive regularization technique to overcome the overfitting problem in training industrial deep networks with large-scale sparse inputs, which can generalize to similar industry tasks easily.
  • We develop XDL, a multi-GPU distributed training platform for deep networks, which is scalable and flexible enough to support our diverse experiments with high performance.

In this paper we focus on the CTR prediction task in the scenario of display advertising in e-commerce industry. Methods discussed here can be applied in similar scenarios with rich internet-scale user behavior data, such as personalized recommendation in e-commerce sites, feeds ranking in social networks etc.

The rest of the paper is organized as follows. We discuss related work in Section 2. Section 3 gives an overview of our display advertising system, including user behavior data and feature representations. Section 4 describes the design of DIN model as well as the adaptive regularization technique. Section 5 gives a brief introduction of developed XDL platform. Section 6 exhibits experiments and analytics. Finally, we conclude the paper in Section 7.

2. Related Work

The CTR prediction model has evolved from shallow to deep structure, with the scale of feature and sample becoming larger and larger at the same time. Along with the mining of feature representations, the design of model structure involves more insights.

As a pioneer work, NNLM [9] proposes to learn distributed representation for words, aiming to avoid curse of dimensionality in language modeling. This idea, we name it embedding, has inspired many natural language models and CTR prediction models that need to handle large-scale sparse inputs.

The LS-PLM [8] and FM [10] models can be viewed as a class of networks with one hidden layer, which first employ an embedding layer on sparse inputs and then impose specially designed transformation functions for the output, aiming to capture the combination relationships among features.

Deep Crossing [1], Wide&Deep Learning [4] and the YouTube Recommendation CTR model [2] extend LS-PLM and FM by replacing the transformation function with complex MLP networks, which greatly enhances the model capability. They follow a similar model structure, combining an embedding layer (for learning the distributed representation of sparse id features) and MLPs (for automatically learning the combination relationships of features). This kind of CTR prediction model replaces manual feature combinations to a great extent. Our base model follows this kind of model structure. However, it is worth mentioning that for CTR prediction tasks with user behavior data, features often contain multi-hot sparse ids, e.g., search terms and watched videos in the YouTube recommendation system. These models often add a pooling layer after the embedding layer, with operations like sum or average, to get a fixed-size embedding vector. This causes a loss of information and cannot take full advantage of the inner structure of users’ rich behavior data.

The attention mechanism, which originates from the Neural Machine Translation (NMT) field [7], gives us inspiration. NMT takes a weighted sum of all the annotations to get an expected annotation and focuses only on information relevant to the generation of the next target word in the bi-directional RNN [11] machine translation task. This inspired us to design an attention-like structure to better model users’ diverse historical interests. A recent work, DeepIntent [3], also applies the attention technique to better model the rich inner structure of data: it learns to assign attention scores to different words to obtain better sentence representations in sponsored search. However, there is no interaction between query and document; that is, given the model, the query or document representations are fixed. This scenario is different from ours, since in the DIN model the user representation changes adaptively with different candidate ads in the display advertising system. In other words, DeepIntent captures the diversity structure of the data but misses the local activation property, while the proposed DIN model captures both.

3. System Overview

The overall scenario of the display advertising system is illustrated in Figure 1. Note that in e-commerce sites, advertisements are natural goods. Hence, unless otherwise stated, we refer to ads as goods in the rest of this paper.

[Figure 1: Overall scenario of the display advertising system]

When a user visits the e-commerce site, the system:

  1. checks his historical behavior data
  2. generates candidate ads with the matching module
  3. predicts the click probability of each ad and selects appropriate ads which can attract attention (clicks) with the ranking module
  4. logs the user reactions to the displayed ads.

This forms a closed loop of consumption and generation of user behavior data. At Alibaba, hundreds of millions of users visit the e-commerce site every day, leaving us with a large amount of real data.

3.1 Characteristics of User Behavior Data

Table 1 shows examples of user behavior collected from our online product. There are two obvious characteristics of users’ behavior data in our system:

  • Diversity. Users are interested in different kinds of goods.
  • Local activation. Only a part of users’ historical behaviors are relevant to the candidate ad.

[Table 1: Examples of user behavior data collected from the online product]

3.2 Feature Representation

Our feature set is composed of sparse ids, a traditional industry setting as in [1, 4, 5]. We group them into four groups, as described in Table 2. Note that in our setting there are no combination features; we capture the interactions of features with the deep network.
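
As a small illustration of this setting, the sketch below shows what one sample's sparse id features look like (the group names, ids and vocabulary size are invented for illustration; see Table 2 for the real feature groups):

```python
import numpy as np

# One sample: single-valued groups are one-hot ids, while the user
# behavior group is a multi-hot list of visited good ids whose length
# varies from sample to sample.
sample = {
    "user_gender_id": 1,                   # one-hot id
    "ad_good_id": 4027,                    # one-hot id
    "visited_good_ids": [104, 788, 4027],  # multi-hot, variable length
}

def multi_hot(ids, vocab_size):
    """Dense 0/1 vector with ones at the given id positions."""
    v = np.zeros(vocab_size, dtype=np.int8)
    v[ids] = 1
    return v

print(multi_hot(sample["visited_good_ids"], vocab_size=5000).sum())  # -> 3
```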

[Table 2: The four groups of sparse id features]

3.3 Metrics

Area under the receiver operator curve (AUC) [12] is a commonly used metric in the CTR prediction field. In practice, we design a new metric named GAUC, which is a generalization of AUC. GAUC is a weighted average of the AUCs calculated over the subsets of samples grouped by user. The weight can be impressions or clicks. An impression-based GAUC is calculated as follows:

$$\mathrm{GAUC} = \frac{\sum_{i=1}^{n} \#\mathrm{impression}_i \times \mathrm{AUC}_i}{\sum_{i=1}^{n} \#\mathrm{impression}_i}$$

where $\mathrm{AUC}_i$ is the AUC computed over the samples of the $i$-th user and $n$ is the number of users.

GAUC is practically proven to be more indicative in display advertisement settings, where CTR model is applied to rank candidate ads for each user and model performance is mainly measured by how good the ranking list is, that is, a user specific AUC. Hence, this method can remove the impact of user bias and measure more accurately the performance of the model over all users. With years of application in our production system, GAUC metric is verified to be more stable and reliable than AUC.

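A minimal sketch of the impression-weighted GAUC described above (using scikit-learn's `roc_auc_score`; the function and variable names are mine, not from the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def gauc(user_ids, labels, scores):
    """Impression-weighted GAUC: the per-user AUC averaged with each
    user's number of samples (impressions) as the weight. Users whose
    labels are all 0 or all 1 are skipped, as AUC is undefined there."""
    user_ids, labels, scores = map(np.asarray, (user_ids, labels, scores))
    weighted_auc, total_weight = 0.0, 0.0
    for uid in np.unique(user_ids):
        mask = user_ids == uid
        y = labels[mask]
        if y.min() == y.max():      # single-class user: AUC undefined
            continue
        w = mask.sum()              # impressions of this user
        weighted_auc += w * roc_auc_score(y, scores[mask])
        total_weight += w
    return weighted_auc / total_weight

# toy usage: two users, per-user AUCs weighted by 3 and 2 impressions
print(gauc([1, 1, 1, 2, 2], [0, 1, 0, 1, 0], [0.2, 0.8, 0.1, 0.9, 0.3]))
```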

4. Model Architecture

Different from sponsored search, most users come to the display advertising system without a clear target. Hence, our system needs an effective approach to extract users’ interests from the rich historical behavior while building the click-through rate (CTR) prediction model.

4.1 Base Model

Following the popular model structure introduced in [1, 4, 2], our base model is composed of two steps:

  1. transfer each sparse id feature into an embedding vector space
  2. apply MLPs to fit the output.

Notice that the input contains user behavior sequence ids, whose length varies across samples. Thus we add a pooling layer (e.g. a sum operation) to summarize the sequence and get a fixed-size vector. As illustrated in the left part of Figure 2, the base model works well in practice, and it now serves the main traffic of our online display advertising system.

However, looking deeper into the pooling operation, we find that much information is lost; that is, pooling destroys the inner structure of user behavior data. This observation inspires us to design a better model.
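
As a concrete illustration, here is a minimal PyTorch-style sketch of the embedding-plus-sum-pooling step (shapes and names are mine; the paper publishes no code):

```python
import torch
import torch.nn as nn

class SumPoolingEmbedding(nn.Module):
    """Embed a variable-length sequence of behavior ids and sum-pool it
    into one fixed-size vector, as in the base model. Padding id 0 is
    masked out so differently sized sequences can share a batch."""
    def __init__(self, num_ids, dim):
        super().__init__()
        self.emb = nn.Embedding(num_ids, dim, padding_idx=0)

    def forward(self, behavior_ids):             # (batch, seq_len)
        vecs = self.emb(behavior_ids)            # (batch, seq_len, dim)
        mask = (behavior_ids != 0).unsqueeze(-1).float()
        return (vecs * mask).sum(dim=1)          # (batch, dim)

pool = SumPoolingEmbedding(num_ids=1000, dim=8)
batch = torch.tensor([[3, 7, 0, 0], [5, 2, 9, 1]])  # two padded sequences
print(pool(batch).shape)                             # torch.Size([2, 8])
```

Note that the pooled vector is the same regardless of which candidate ad is being scored, which is exactly the information loss discussed above.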

[Figure 2: Model architectures of the base model (left) and DIN (right)]

4.2 Deep Interest Network Design

In our display advertising scenario, we wish our model to truly reveal the relationship between the candidate ad and users’ interest based on their historical behaviors.

As discussed above, behavior data contains two structures: diversity and local activation.

The diversity of behavior data reflects users’ various interests. A user’s click on an ad often originates from just part of the user’s interests. We find this similar to the attention mechanism. In the NMT task, it is assumed that the importance of each word in a sentence differs at each decoding step. The attention network [7] (which can be viewed as a specially designed pooling layer) learns to assign attention scores to each word in the sentence; in other words, it follows the diversity structure of the data.

However, it is unsuitable to directly apply the attention layer in our application, where the embedding vector of user interest should vary according to different candidate ads; that is, it should follow the local activation structure.

Let’s check what happens if the local activation structure is not followed. We have distributed representations of users ($V_u$) and ads ($V_a$). For the same user, $V_u$ is a fixed point in the embedding space, and the same holds for the ad embedding $V_a$. Assume we use the inner product to calculate the relevance between user and ad, $F(U;A) = V_u \cdot V_a$. If both $F(U;A)$ and $F(U;B)$ are high, meaning user $U$ is relevant to both ad $A$ and ad $B$, then under this way of calculation any point on the segment between $V_a$ and $V_b$ also gets a high relevance score. This places a hard constraint on the learning of the distributed representation vectors for both users and ads. One may increase the embedding dimensionality of the vector space to satisfy the constraint, which can perhaps work, but causes a huge increase in model parameters.
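
The constraint can be made explicit: with $V_u$ fixed, inner-product relevance is linear in the ad embedding, so every point on the segment between $V_a$ and $V_b$ inherits a high score:

$$F\bigl(U;\, \alpha V_a + (1-\alpha) V_b\bigr) = \alpha\, V_u \cdot V_a + (1-\alpha)\, V_u \cdot V_b \ \ge\ \min\bigl(F(U;A),\, F(U;B)\bigr), \quad \alpha \in [0, 1]$$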

In this paper we introduce a newly designed network, named DIN, which follows both structures of the data. DIN is illustrated in the right part of Figure 2. Mathematically, the embedding vector $V_u$ of user $U$ becomes a function of the embedding vector $V_a$ of ad $A$, that is:

$$V_u = f(V_a) = \sum_{i=1}^{N} w_i \cdot V_i = \sum_{i=1}^{N} g(V_i, V_a) \cdot V_i$$

where $V_i$ is the embedding of behavior id $i$, such as a good id, shop id, etc. $V_u$ is the weighted sum of all the behavior ids. $w_i$ is the attention score with which behavior id $i$ contributes to the overall user interest embedding vector $V_u$ with respect to the candidate ad $A$. In our implementation, $w_i$ is the output of the activation unit (denoted by the function $g$) with inputs $V_i$ and $V_a$.

In all, DIN designs the activation unit to follow the local activation structure and weighted sum pooling to follow the diversity structure. To the best of our knowledge, DIN is the first model that follows both structures of user behavior data in CTR prediction tasks at the same time.
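
A minimal PyTorch-style sketch of this attention pooling (the exact architecture of the activation unit $g$ is not specified in this version of the paper, so the small MLP over $[V_i, V_a, V_i \odot V_a]$ below is an assumption, not the authors' exact design):

```python
import torch
import torch.nn as nn

class ActivationUnit(nn.Module):
    """g(V_i, V_a): scores one behavior embedding against the candidate ad."""
    def __init__(self, dim, hidden=36):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, behavior_emb, ad_emb):
        # behavior_emb: (batch, seq_len, dim); ad_emb: (batch, dim)
        ad = ad_emb.unsqueeze(1).expand_as(behavior_emb)
        x = torch.cat([behavior_emb, ad, behavior_emb * ad], dim=-1)
        return self.mlp(x).squeeze(-1)           # w_i: (batch, seq_len)

def din_pooling(behavior_emb, ad_emb, mask, unit):
    """V_u = sum_i g(V_i, V_a) * V_i, with padded positions masked out."""
    w = unit(behavior_emb, ad_emb) * mask        # attention scores w_i
    return (w.unsqueeze(-1) * behavior_emb).sum(dim=1)  # (batch, dim)

unit = ActivationUnit(dim=8)
emb, ad = torch.randn(2, 4, 8), torch.randn(2, 8)
mask = torch.tensor([[1., 1., 0., 0.], [1., 1., 1., 1.]])
print(din_pooling(emb, ad, mask, unit).shape)    # torch.Size([2, 8])
```

Because $V_u$ is recomputed for every candidate ad, the same history yields different user vectors for different ads, which is the local activation property.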

4.3 Data Dependent Activation Function

PReLU [13] is a commonly used activation function and was chosen in our setting at the beginning; it is defined as

$$f(y_i) = \begin{cases} y_i, & y_i > 0 \\ a_i\, y_i, & y_i \le 0 \end{cases}$$

PReLU plays the role of Leaky ReLU in avoiding zero gradients [14] when $a_i$ is small. Previous research has shown that PReLU can improve accuracy with a little extra risk of overfitting.

However, in our application with large-scale sparse input ids, training such an industrial-scale network still faces many challenges. To further improve the convergence rate and performance of our model, we design a novel data dependent activation function, which we name Dice:

$$f(y_i) = p_i \cdot y_i + (1 - p_i) \cdot a_i \cdot y_i, \qquad p_i = \frac{1}{1 + e^{-\frac{y_i - E[y_i]}{\sqrt{Var[y_i] + \epsilon}}}}$$

$E[y_i]$ and $Var[y_i]$ in the training step are calculated directly from each mini-batch; meanwhile we adopt the momentum method to estimate the running $E[y_i]'$ and $Var[y_i]'$:

$$E[y_i]'_{t+1} = \alpha\, E[y_i]'_t + (1 - \alpha)\, E[y_i]_t, \qquad Var[y_i]'_{t+1} = \alpha\, Var[y_i]'_t + (1 - \alpha)\, Var[y_i]_t$$

where $t$ is the mini-batch step of the training process, and $\alpha$ is a hyperparameter such as 0.99. In the test step we use the running $E[y_i]'$ and $Var[y_i]'$.

The key idea of Dice is to adaptively adjust the rectify point according to the data, which differs from PReLU’s hard rectifier based on $y_i \overset{?}{>} 0$. In this way, Dice can be viewed as a soft rectifier with two channels, $a_i y_i$ and $y_i$, mixed by $p_i$. $p_i$ is the weight that keeps the original $y_i$; it becomes lower as $y_i$ deviates from the mini-batch $E[y_i]$. Experiments show Dice provides an obvious improvement in convergence rate and GAUC.
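
A minimal PyTorch-style sketch of Dice (my own rendering of the formulas above; note that `BatchNorm1d` with `affine=False` computes exactly $(y_i - E[y_i])/\sqrt{Var[y_i] + \epsilon}$ and keeps running estimates, with `momentum=0.01` corresponding to $\alpha = 0.99$):

```python
import torch
import torch.nn as nn

class Dice(nn.Module):
    """f(y) = p * y + (1 - p) * a * y, with p = sigmoid of the normalized y.
    Mini-batch statistics are used during training and the running
    (momentum) estimates at test time, as in the paper."""
    def __init__(self, dim, eps=1e-8, momentum=0.01):
        super().__init__()
        self.bn = nn.BatchNorm1d(dim, eps=eps, momentum=momentum, affine=False)
        self.a = nn.Parameter(torch.zeros(dim))   # learned slope a_i

    def forward(self, y):                          # y: (batch, dim)
        p = torch.sigmoid(self.bn(y))              # p_i in [0, 1]
        return p * y + (1 - p) * self.a * y

act = Dice(dim=8)
print(act(torch.randn(4, 8)).shape)                # torch.Size([4, 8])
```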

4.4 Adaptive Regularization Technique

Not surprisingly, the overfitting problem is encountered while training our model with large-scale parameters and sparse inputs. We show experimentally that, with the addition of the fine-grained visited good ids feature, model performance falls rapidly after the first epoch.

Many methods have been proposed to reduce overfitting, such as L2 and L1 regularization [15] and dropout [16]. However, with sparse and high-dimensional data, the CTR prediction task faces a greater challenge. It is known that internet-scale user behavior data follows the long-tail law: lots of feature ids occur only a few times in the training samples, while few of them occur many times. This inevitably introduces noise into the training process and intensifies overfitting. An easy way to reduce overfitting is to filter out the low-frequency feature ids, which can be viewed as manual regularization. However, such a frequency-based filter is quite crude in terms of information loss and threshold setting. Here we introduce an adaptive regularization method, in which we impose different regularization intensities on feature ids according to their occurrence frequency.

Denote:

$$I_i = \begin{cases} 1, & \exists\, (x_j, y_j) \in B \ \text{ s.t. } x_{ji} \neq 0 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

$$w_i \leftarrow w_i - \eta \left[ \frac{1}{b} \sum_{(x_j, y_j) \in B} \frac{\partial L\bigl(f(x_j), y_j\bigr)}{\partial w_i} + \lambda \frac{1}{n_i} w_i I_i \right] \tag{9}$$

The update formula is shown as Eq.(9). $B$ stands for a mini-batch of samples of size $b$, $n_i$ is the frequency of feature $i$, and $\lambda$ is the regularization parameter.

The idea behind Eq.(9) is to penalize low-frequency features and relax high-frequency features to control the gradient update variance.

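A minimal sketch of the update in Eq.(9) for one embedding table, written with plain SGD for clarity (names are mine; only ids that occur in the mini-batch, i.e. $I_i = 1$, are regularized, with strength $\lambda / n_i$, so rare ids are penalized more than frequent ones):

```python
import numpy as np

def adaptive_reg_step(W, grad, batch_ids, freq, lr=0.01, lam=0.01):
    """One step of Eq.(9).

    W:         (num_ids, dim) embedding table
    grad:      (num_ids, dim) mini-batch-averaged loss gradient w.r.t. W
    batch_ids: feature ids occurring in this mini-batch (I_i = 1)
    freq:      (num_ids,) occurrence count n_i of each feature id
    """
    ids = np.unique(batch_ids)
    W -= lr * grad                                 # gradient term
    W[ids] -= lr * lam * W[ids] / freq[ids, None]  # adaptive L2, ~ 1/n_i
    return W
```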

A similar practice of adaptive regularization can be found in [17], which sets the regularization coefficient proportional to the feature frequency, as shown in Eq.(10):

$$w_i \leftarrow w_i - \eta \left[ \frac{1}{b} \sum_{(x_j, y_j) \in B} \frac{\partial L\bigl(f(x_j), y_j\bigr)}{\partial w_i} + \lambda\, n_i\, w_i I_i \right] \tag{10}$$

However, on our dataset, training with the regularization of Eq.(10) shows no obvious alleviation of overfitting. On the contrary, it slows down the convergence of the training process. Eq.(10) applies a greater penalty to high-frequency good ids than to long-tail goods, while the former contribute more to both the metric and online revenue in our e-commerce system. Besides, we also evaluate the dropout technique and find it brings only a slight improvement on overfitting.

5. Implementation

DIN is implemented on a multi-GPU distributed training platform named X-Deep Learning (XDL), which supports model-parallelism and data-parallelism. XDL is designed to solve the challenges of training industrial-scale deep learning networks with large-scale sparse inputs and tens of billions of parameters. In our observation, most of the deep networks published now are constructed in two steps: i) employ the embedding technique to cast the original sparse input into low-dimensional dense vectors; ii) bridge with networks like MLPs, RNNs, CNNs, etc. Most of the parameters are concentrated in the first embedding step, which needs to be distributed over multiple machines, while the second network step can be handled within a single machine. Following this thinking, we architect the XDL platform in a bridge manner, as illustrated in Figure 3, composed of three main kinds of components:

[Figure 3: Architecture of the XDL platform]

  • Distributed Embedding Layer. A model-parallelism module: the parameters of the embedding layer are distributed over multiple GPUs. The embedding layer works as a predefined network unit, providing forward and backward modes.
  • Local Backend. A standalone module that handles the local network training. Here we reuse open-sourced deep learning frameworks such as TensorFlow, MXNet and Theano [18, 19, 20]. With a unified data-exchange interface and abstraction, it is easy for us to integrate and switch between different frameworks. Another benefit of the backend architecture is the convenience of following the open-source community and utilizing the latest published network structures or updated algorithms developed with these frameworks.
  • Communication Component. The base module, which helps parallelize both the embedding layer and the backend. In our first version, it is implemented with MPI.

Besides, we also employ the common feature trick [8], which exploits the structural property of the data. Readers can find a detailed introduction in [8].

Due to the high performance and flexibility of the XDL platform, we accelerate the training process by about 10 times and optimize hyperparameters automatically with high tuning efficiency.

6. Experiments

6.1 Visualization of DIN

In the DIN model, sparse id features are encoded as embedding vectors. Here we randomly select 9 categories (dress, sport shoes, bags, etc.) and 100 goods per category. Fig. 4 shows the visualization of the embedding vectors of goods based on t-SNE [21], in which points with the same shape correspond to the same category. It clearly shows the clustering property of the DIN embeddings.
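
A minimal sketch of producing such a plot with scikit-learn's t-SNE (random vectors stand in for the learned good embeddings, which are of course not public):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
emb = rng.normal(size=(900, 64))       # stand-in for 9 x 100 good embeddings
cats = np.repeat(np.arange(9), 100)    # category label of each good

xy = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(emb)
plt.scatter(xy[:, 0], xy[:, 1], c=cats, s=5, cmap="tab10")
plt.title("t-SNE of good embeddings (color = category)")
plt.show()
```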

[Figure 4: t-SNE visualization of good embedding vectors, colored by predicted CTR for one user]

Besides, we color the points in Fig. 4 in a prediction manner: assuming all the goods are candidates for the young mother (the example in Table 1), they are colored by the prediction value (red ones get a higher CTR than blue ones). The DIN model correctly identifies goods that meet the user’s diverse interests.

Further, we go deep into the DIN model to check its working mechanism. As described in Section 4.2, DIN designs the activation unit to locally activate the related behaviors with respect to candidate ads. Fig. 5 illustrates the activation intensity (attention score $w$). As expected, behaviors with high relevance to the candidate ad get high attention intensity.

[Figure 5: Activation intensity (attention scores) of user behaviors with respect to a candidate ad]

6.2 Regularization

Both the base model and our proposed DIN model encounter the overfitting problem during training once fine-grained features, such as the good id feature, are added. Fig. 6 illustrates the training process with/without the fine-grained good id feature, which demonstrates the overfitting problem clearly.

[Figure 6: Training and validation curves with/without the fine-grained good id feature under different regularizations]

We now compare the different kinds of regularization experimentally.

  • Dropout. Randomly discard 50% of the good ids in each sample.
  • Filter. Filter good ids by occurrence frequency in the samples and keep only the most frequent ones. In our setup, the top 20 million good ids are kept.
  • L2 regularization. The parameter $\lambda$ is searched and set to 0.01.
  • Regularization in DiFacto. DiFacto proposed the method of Eq.(10). The parameter $\lambda$ is searched and set to 0.01.
  • Adaptive regularization. Our proposed method of Eq.(9). We use Adam as the optimization method. The parameter $\lambda$ is searched and set to 0.01.

Comparison results are shown in Fig. 6. The validation results demonstrate the effectiveness of our proposed adaptive regularization method. Trained with the adaptive regularization technique, the model with the fine-grained good id feature achieves a 0.7% GAUC gain over the model without it, which is a significant improvement for the CTR prediction task.

The dropout method causes slower convergence in the first epoch, while overfitting is somewhat alleviated after the first epoch completes. The frequency filter keeps the same convergence speed as the no-regularization setup in the first epoch; after the first epoch, overfitting is also alleviated, though still worse than the dropout setup. In the adaptive regularization setup, we hardly see overfitting after the first epoch, and the loss and GAUC on the validation set almost converge by the end of the second epoch.

:结合图6曲线,这里猜测,每一轮训练约有45个iteration。图6中,横坐标iteration=45的位置,第1个epoch完成,第2个epoch开始,对于Without Reg曲线,训练loss骤降,验证loss骤升,过拟合明显;对于其他曲线,使用了不同的正则化方式,过拟合缓解,但缓解程度不尽相同,模型性能也存在差异。这是作者在这一段想要表达的内容。)

The DiFacto regularization [17] of Eq.(10) sets a greater penalty on high-frequency good ids. However, in our task, high-frequency good ids characterize users’ interests with higher confidence, while low-frequency good ids bring a lot of noise; the frequency-filter experiment illustrates this point. Our method instead softens the low-frequency good ids by applying a regularization strength inversely proportional to the good’s frequency (note: see Eq.(9), where the frequency $n_i$ enters as $1/n_i$).

6.3 Comparison of DIN and the Base Model

We test the model performance on the production display advertising system in Alibaba. Both the training and testing datasets are generated from system logs, including impression and click logs. We collect two weeks’ samples for training and the following day’s samples for testing, which is a production setting in our system. Both the base model and our proposed DIN model are constructed on the same feature representation as described in Table 3.2 (note: likely a typo in the paper; it should be Table 2). Parameters are tuned separately and we report the best results. GAUC is used to evaluate the model performance.

Results are shown in Table 3 and Fig. 7. Obviously, the DIN model trained with adaptive regularization outperforms the base model significantly. DIN with adaptive_reg used only half the iterations of the base model to reach the base model’s highest GAUC, and it achieved a total 1.08% GAUC gain over the base model in the end, which is a big improvement for our production system. Dice (note: the activation function of Section 4.3) achieved a further 0.23% GAUC gain over DIN with adaptive reg. With better understanding and exploitation of the structures of user behavior data, the DIN model shows better ability to capture the nonlinear relationship between users and candidate ads.

[Table 3 and Figure 7: GAUC comparison of the base model, DIN with adaptive regularization, and DIN with Dice]

7. Conclusions

In this paper, we focus on the CTR prediction task in the scenario of display advertising in the e-commerce industry, which involves internet-scale user behavior data. We reveal and summarize the two key structures of the data, diversity and local activation, and design a novel model named DIN that better exploits these structures. Experiments show DIN brings more interpretability and achieves better GAUC performance compared with the popular MLP models. Besides, we study the overfitting problem in training such industrial deep networks and propose an adaptive regularization technique which helps greatly reduce overfitting in our scenario. We suppose these two approaches could be instructive for other industrial deep learning tasks.

Different from the fields of image recognition and natural language processing, which have mature and state-of-the-art deep network structures, applications with rich internet-scale user behavior data still face many challenges and are worth more effort to study, towards designing more general and useful network structures. We will continue to focus on this direction.


End
