
Multiple Objective Optimization in Recommender Systems 推荐系统多目标优化

ABSTRACT 摘要

We address the problem of optimizing recommender systems for multiple relevance objectives that are not necessarily aligned. Specifically, given a recommender system that optimizes for one aspect of relevance, semantic matching (as defined by any notion of similarity between source and target of recommendation; usually trained on CTR), we want to enhance the system with additional relevance signals that will increase the utility of the recommender system, but that may simultaneously sacrifice the quality of the semantic match. The issue is that semantic matching is only one relevance aspect of the utility function that drives the recommender system, albeit a significant aspect.

本论文讨论推荐系统中多个相关目标的优化问题，这些目标不一定一致。具体来说，给定一个推荐系统，该推荐系统针对相关性、语义匹配（定义为推荐源和目标之间的任何相似性概念；通常在CTR上进行训练）的一个方面进行优化，我们希望增加额外的相关信号来增强该系统，从而提高推荐系统的实用性，但这可能会牺牲语义匹配的质量。问题是语义匹配只是驱动推荐系统的效用功能的一个相关方面，尽管它是一个重要方面。

In talent recommendation systems, job posters want candidates who are a good match to the job posted, but also prefer those candidates to be open to new opportunities. Recommender systems that recommend discussion groups must ensure that the groups are relevant to the users’ interests, but also need to favor active groups over inactive ones. We refer to these additional relevance signals (job-seeking intent and group activity) as extraneous features, and they account for aspects of the utility function that are not captured by the semantic match (i.e. post-CTR down-stream utilities that reflect engagement: time spent reading, sharing, commenting, etc). We want to include these extraneous features into the recommendations, but we want to do so while satisfying the following requirements: 1) we do not want to drastically sacrifice the quality of the semantic match, and 2) we want to quantify exactly how the semantic match would be affected as we control the different aspects of the utility function. In this paper, we present an approach that satisfies these requirements.

在人才推荐系统中，招聘者希望应聘者与所发布的招聘启事相匹配，但也希望应聘者对新的机会敞开心扉。推荐讨论组的推荐系统必须确保讨论组与用户的兴趣相关，但也需要优先推荐活动组而不是非活动组。我们将这些额外的关联信号（求职意向和群体活动）称为无直接关系的特征，它们包含语义匹配未捕获的效用函数的各个方面（即反映参与度的点击后下游效用：花在阅读、分享、评论等上的时间）。我们希望在推荐中包含这些无直接关系的特征，但我们希望在满足以下要求的前提下，添加这些无关特征：1）我们不希望大幅牺牲语义匹配的质量，（2）我们希望量化当我们控制效用函数的不同方面时，语义匹配将如何受到影响。本文提出了一种满足这些要求的方法。

We frame our approach as a general constrained optimization problem and suggest ways in which it can be solved efficiently by drawing from recent research on optimizing non-smooth rank metrics for information retrieval. Our approach features the following characteristics: 1) it is model and feature agnostic, 2) it does not require additional labeled training data to be collected, and 3) it can be easily incorporated into an existing model as an additional stage in the computation pipeline. We validate our approach in a revenue-generating recommender system that ranks billions of candidate recommendations on a daily basis and show that a significant improvement in the utility of the recommender system can be achieved with an acceptable and predictable degradation in the semantic match quality of the recommendations.

我们将我们的方法框架为一个一般的约束优化问题，并从最近关于优化信息检索的非平滑排序度量的研究中提出了有效的解决方法。我们的方法具有以下特点：1）它是模型并且有不可知的特征，2）它不需要收集额外的标记训练数据，3）它可以很容易地作为计算中的附加阶段合并到现有模型中。我们在线上的推荐系统中验证了我们的方法，该系统每天对数十亿个候选推荐进行排序，并表明，在推荐的语义匹配质量下降可接受且可预测的情况下，推荐系统的实用性可以得到显著提高。

1. INTRODUCTION 简介

In designing recommender systems, we often have to balance multiple competing objectives. An example scenario can be drawn from a revenue-generating product at LinkedIn called TalentMatch, in which the recommender system, triggered by a job posted on the site, scours the entire member database to find the best candidates for the job. Those receiving the recommendations, job posters, want the candidates recommended to be a good fit for the job, but also prefer that the candidates be open to pursuing new opportunities. More specifically, a job poster would rather be recommended a candidate who is a great match for the job and also happens to be looking to change jobs than the best match who happens not to be interested in exploring new opportunities. On the other hand, recommending a candidate who would certainly take the job if an offer were made, but who is not a good match for the job, will negatively affect the experience of the job poster. Therefore, given a ranking of candidates according to how well they match a given job and a ranking of candidates with regard to their job-seeking intent, the challenge is to combine both rankings into a final ranking that is optimal with regard to a given utility function.

在设计推荐系统时，我们常常需要平衡多个相互竞争的目标。以LinkedIn的TalentMatch产品为例，在该产品中，由发布在网站上的工作机会触发的推荐系统会搜索整个成员数据库，以找到该职位的最佳候选人。那些收到推荐的招聘者，希望被推荐的应聘者很适合这个职位，同时也更希望应聘者有寻找新机会的意愿。更具体地说，招聘者更愿意推荐系统给他推荐一个与职位非常匹配，而且碰巧也在找工作的候选人，而不是一个对寻找新机会不感兴趣的最佳人选。另一方面，推荐一位肯定会接受offer、但并不适合这个职位的候选人，这将对招聘者的体验产生负面影响。因此，根据候选人与给定职位的匹配程度对其进行排序，和根据其求职意向对候选人进行排序，挑战在于将这两个排序合并为最终排序，该排序对于给定的效用函数是最优的。

In most recommender systems, there is a utility function to be maximized: relevant engagement. In TalentMatch, relevant engagement is a multi-faceted objective: a) the job poster decides to purchase the set of candidate recommendations based on a snippet of information for each of the candidates (see Figure 1), b) the job poster decides to initiate communication with each of the recommended candidates in the purchased set, and c) each of the contacted candidates responds favorably to the job poster.

在大多数推荐系统中，有一个效用函数需要最大化：相关参与。在TalentMatch中，相关的参与有多个目标：a）招聘者根据每个候选人的一小段信息购买候选人推荐集（见图1），b）招聘者开始与购买集中的每个推荐候选人进行沟通，以及c）每一位被联系的应聘者都对招聘者做出了回应。

In the TalentMatch system, the semantic model computes the probability that the feature vector representing the member and the feature vector representing the job are a good match. The model does this by computing similarities between subsets of the member’s feature vector and semantically related subsets of the job’s feature vector. The various similarities in this vector are then weighted by training against a given CTR metric using a supervised learning algorithm. We refer to the features used to generate the similarity vector as semantic features, a concrete example being the job description in the job posting and the job description of the member’s current position. In this case, the semantic features being compared are explicit; however, semantic features may also be latent (as in matrix factorization approaches to recommender systems). Extraneous features, on the other hand, exist only in the entity being recommended, not in the entity being recommended to. An example of an extraneous feature is the job-seeking intent.

在TalentMatch系统中，语义模型计算代表成员的特征向量和代表职位的特征向量匹配的概率。该模型通过计算成员特征向量的子集和职位特征向量的语义相关子集之间的相似度来实现这一点。然后，利用监督学习算法，通过训练给定的CTR度量来加权该向量中的各种相似性。我们把用于生成相似向量的特征称为语义特征，一个具体的例子是职位发布中的职位描述和成员当前职位的职位描述。在这种情况下，被比较的语义特征是显式的，然而，语义特征也可能是潜在的（在推荐系统的矩阵分解方法中）。另一方面，无关特征只存在于被推荐的实体中，而不存在于推荐者（being recommended to）实体中。例如，求职意向是一个无关特征。

The job-seeking intent of each candidate is generated using another model, which for purposes of this paper, can be treated as a black-box that takes as input a candidate member and based on that member’s data (e.g. activity on the site and profile information), outputs a job-seeking propensity score and a probabilistic assignment to each of the job-seeking intent categories: active, passive, and non-job-seeker. Many members who do not self-identify as job-seekers on the site actually display job-seeking behavior and characteristics. Therefore, we can estimate job-seeking intent for every member of the site. Though only a proxy, job-seeking intent is a very good indicator of the likelihood with which a member contacted regarding a job opportunity will respond favorably (which is one of the aspects of the utility function in TalentMatch). The other two aspects of the utility function (purchase rate of candidate recommendations and likelihood that a job poster will communicate with the purchased recommendations) are accounted for by the semantic model. Intuitively, increasing the number of individuals with high job-seeking intent (those classified as active or passive) in the top-K recommendations, without drastically sacrificing the semantic match of the recommendations, should increase the utility of the TalentMatch recommender system by connecting job posters with candidates who will engage with them. Figure 2 gives a high-level overview of the relevant system components.
每个求职者的求职意向是使用另一个模型生成的，在本文中，该模型可以被视为一个黑盒，将一个求职者作为输入，并基于该求职者的数据（例如，网站上的活动和个人资料信息），输出求职意向分数和主动求职、被动求职、无意愿三种求职意愿的概率。很多在网站上没有自我认同为求职者的会员，实际上都表现出了求职行为和特点。因此，我们可以估计每个成员的求职意愿。尽管只是一个中间桥梁，求职意向是一个非常好的指标，表明与某个工作机会有关的成员可能作出反应的可能性（这是TalentMatch中效用函数的一个方面）。效用函数的另外两个方面（候选推荐的购买率，和招聘者与购买的候选人进行联系的可能性）由语义模型进行说明。直观地说，在top-K推荐中增加具有高度求职意向的个人（被归类为主动或被动的）的数量，而不大幅牺牲推荐的语义匹配度，通过连通招聘者和会响应招聘的应聘者提升了TalentMatch推荐系统的效用。图2给出了相关系统组件的简要概述。

Compounding the issue of multi-faceted objectives is the fact that in live production systems, models often need to evolve in a progressive fashion: the model may have initially optimized for only one aspect of relevant engagement (e.g. purchase rate of candidate recommendations) and it would be preferable to improve the model incrementally (as soon as a new feature like job seeking intent becomes available), rather than waiting for a complete redesign and development lifecycle of a new model. The incrementally improved model then bridges the old and the new models, and allows for additional analysis on the performance of the new feature, which may in turn influence how it is incorporated in the new model.

在实际生产系统中，模型通常需要以渐进的方式发展：模型最初可能只针对相关参与的一个方面（例如候选推荐的购买率）进行了优化，并且会逐步改进模型（一旦出现求职意向等新功能），而不是重新设计和开发一个全新的模型。逐步改进的模型合并了新旧模型，并对新特性的性能进行额外的分析，而新特性又可能影响如何将其合并到新模型中。

In this paper we describe a general approach for incorporating extraneous features into a semantic model, which results in the need to optimize for objectives that are not necessarily aligned. More specifically, given a model which outputs recommendations ranked according to some notion of semantic relevance, we want to add certain features which contribute to the overall utility of the users of the recommender system, but that may negatively affect the semantic relevance of the recommendations. This approach is model and feature agnostic, does not require additional labeled training data to be collected, and can be easily incorporated into an existing model as an additional stage in the computation pipeline. We validate our approach by A/B testing it on the TalentMatch system, which currently ranks billions of job-member pairs on a daily basis, and show that a significant improvement in the utility of the recommender system (42% increase in email reply rate) can be achieved with an acceptable and predictable degradation in the original relevance of the recommendations.

在本文中，我们描述了一种将无关特征合并到语义模型中的通用方法，这种方法可以对不一定一致的目标进行优化。更具体地说，给定一个根据语义相关性进行推荐的模型，我们希望添加一些有助于推荐系统用户整体效用的特性，但这可能会对推荐的语义相关性产生负面影响。这种方法不依赖于模型和特征，不需要收集额外的标记训练数据，并且可以很容易地作为计算中的附加阶段合并到现有模型中。我们在每天对数十亿个职位-候选人对进行排序的TalentMatch系统上进行了A/B测试，验证了我们的方法，该方法显著提升了推荐系统的实用性（电子邮件回复率提高了42%），同时推荐的原始相关性的下降可接受和可预测。

2. PROBLEM FORMULATION 问题表述

In this section we discuss a general template for framing the kinds of problems we are targeting in this paper and in Section 4.1 we discuss the instantiation of this general template for the specific TalentMatch scenario.
We start with a model which is optimized for semantic relevance. We then want to enhance this model with additional features which will increase the utility of the recommender system, but at a potential loss in the semantic relevance of the recommendations. Adding these additional features to the model will result in an enhanced model with additional objectives to be optimized. These additional objectives will be optimized conditionally on the semantic relevance objective having already been optimized.

在本节中，我们将讨论一个通用模板，用于构建本文所针对的各种问题；在第4.1节中，我们将讨论该通用模板在TalentMatch场景中的应用。

我们从一个以语义相关性为优化目标的模型开始。然后，我们希望添加额外的特性来提升这个模型，以增加推荐系统的实用性，但是在推荐的语义相关性方面可能会有损失。添加这些附加的特性，会提升模型，同时会增加额外的优化目标。这些附加的目标将在语义相关目标已经优化的基础上有条件地优化。

2.1 Adding a single competing objective 增加单一竞争目标

In the simplest case, we would have only one feature to add to the semantic model, which equates to one additional objective to be increased (adding more features that map to only one additional objective can be handled similarly). We also want to penalize enhanced models in a manner that is correlated with the distance between the semantic relevance score distribution of the items in the top-K ranking as output by the semantic model, and the semantic relevance score distribution of the items in the top-K ranking as output by the enhanced model (note that the enhanced model outputs a ranking based on the enhanced scores, but we need to map those scores back to their semantic counterparts to compare the two distributions). These requirements are expressed by the following loss function:

$L(w)=-g(f(Y,X,w)) + \lambda \Delta (\pi(Y), \pi(f(Y,X,w)))$ (1)

where

X is the single feature to be added, a matrix of dimensions m×n;
m is the number of targets of the recommender system;
n is the number of recommendations per target;
Y represents semantic relevance scores, also a matrix of dimensions m × n;
w is the parameter associated with the feature to be added;
f is the enhanced model which perturbs the semantic relevance score Y according to the new feature X and parameter w;
g is an objective to be maximized which contributes to the overall utility of the recommender system but is not accounted for in the semantic model;
∆ is a non-negative measure which is indicative of the distance between the semantic match score distribution and the enhanced score distribution;
π is a function which returns a top-K ranking, and n >> K;
λ is a positive trade-off parameter.

Alternatively, it may be easier to visualize the objective as a constrained optimization problem where we have a limit on how much we are allowed to deviate from the top-K score distribution based on the semantic model:

$maximize \quad g(f(Y,X,w)) \\ s.t. \quad \Delta_i (\pi(Y), \pi(f(Y, X, w))) \leq c_i, i=1,...,l$ (2)
Where:

$c_i$ is the ith constraint on the top-K distribution distance between the semantic match score distribution and the enhanced score distribution;
l is the number of constraints on the semantic score distribution deviation.

Given the constrained optimization perspective, in the simple case where l = 1 we could analyze how g trends as a function of various values for c, from which we could extract the Pareto frontier [6] and which we could use to make a data-driven decision on what value of c is appropriate. Note that if, in the Pareto frontier, g turns out to be a linear function of c, then the slope of the line would be the value of the λ parameter in Equation 1.

在最简单的情况下，我们将只有一个特性要添加到语义模型中，这相当于要增加一个附加目标（添加更多映射到一个附加目标的特性可以类似地处理）。我们还希望对增强模型进行惩罚，惩罚方式与语义模型输出的top-K排序中item的语义相关性得分分布和增强模型输出的top-K排序中item的语义相关性得分分布之间的距离相关（注意，增强模型输出基于增强分数的排序，但我们需要将这些分数映射回它们的语义对应项以比较这两个分布）。这些需求由以下损失函数表示：

$L(w)=-g(f(Y,X,w)) + \lambda \Delta (\pi(Y), \pi(f(Y,X,w)))$ (1)

其中：

X是要添加的单个特征，一个尺寸为m×n的矩阵；
m是推荐系统的目标数；
n是每个目标的推荐个数；
Y表示语义相关性得分，也是m×n的矩阵；
w是与要添加的特征相关联的参数；
f是根据新特征X和参数w扰动语义关联度Y的增强模型；
g是最大化的目标，它有助于推荐系统的总体效用，但在语义模型中没有考虑到；
∆是一个非负的度量，表示语义匹配分数分布和增强分数分布之间的距离；
π是一个返回top-K排名的函数，n>>K；
λ是一个正的权衡参数。

也可以将目标视为一个约束优化问题，对允许偏离基于语义模型的top-K分数分布的程度进行限制：

$maximize \quad g(f(Y,X,w)) \\ s.t. \quad \Delta_i (\pi(Y), \pi(f(Y, X, w))) \leq c_i, i=1,...,l$ （2）

其中：

$c_i$ 是语义匹配分数分布和增强分数分布之间top-K分布距离的第i个约束条件；
l是语义得分分布偏差的约束个数。

考虑到约束优化的观点，在l=1的简单情况下，我们可以分析g的趋势如何作为c的各种值的函数，从中我们可以提取帕累托前沿[6]，我们可以基于数据，确定c的哪个值是合适的。注意，如果在Pareto前沿，g是c的线性函数，那么直线的斜率就是方程1中λ参数的值。
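The Pareto-frontier analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each candidate setting of w is summarized as a hypothetical `(divergence, gain)` pair, where `divergence` plays the role of ∆ (lower is better) and `gain` plays the role of g (higher is better). The function name `pareto_frontier` and the sample numbers are invented for the example and are not taken from the paper.

```python
def pareto_frontier(plans):
    """Return the (divergence, gain) pairs not dominated by any other pair.

    A pair is dominated if another pair has divergence that is no larger
    and gain that is no smaller (and is strictly better in at least one).
    """
    frontier = []
    best_gain = float("-inf")
    # Visit plans from smallest to largest divergence; among equal
    # divergences, visit the highest gain first.
    for divergence, gain in sorted(plans, key=lambda p: (p[0], -p[1])):
        # Keep a plan only if it improves on every cheaper plan seen so far.
        if gain > best_gain:
            frontier.append((divergence, gain))
            best_gain = gain
    return frontier


# Hypothetical (divergence, gain) measurements, one per parameter setting.
plans = [(0.0, 0.30), (5.0, 0.35), (6.0, 0.34),
         (20.0, 0.50), (40.0, 0.60), (55.0, 0.60)]
frontier = pareto_frontier(plans)
```

Picking a bound c then amounts to choosing the frontier point with the largest divergence not exceeding c.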

2.2 Adding multiple competing objectives 添加多个竞争目标

In the case where the additional features with which we will enhance the semantic model lead to multiple additional objectives, we have the following general version of the problem:

$maximize \quad G(F(Y, X, w)) \\ s.t. \quad \Delta_i(\pi(Y), \pi(F(Y, X, w))) \leq c_i, i = 1,...,l$ (3)
Where:

G is the set of additional objectives to be maximized, $\{g_1, \ldots, g_t\}$;
t is the number of additional objectives to be maximized;
X are the features to be added, a matrix of dimensions m×n×p;
p is the number of features being added;
w is the parameter vector associated with the features to be added with dimension 1 × p;
F is the enhanced model which perturbs the semantic match score Y according to the new features X and parameter w.

如果我们添加的特性导致模型添加多个目标，则有如下优化问题：

$maximize \quad G(F(Y, X, w)) \\ s.t. \quad \Delta_i(\pi(Y), \pi(F(Y, X, w))) \leq c_i, i = 1,...,l$ (3)
其中：

G是要最大化的附加目标集合 $\{g_1, \ldots, g_t\}$；
t是要最大化的附加目标数；
X是要添加的特征，是m×n×p的矩阵；
p是要添加的特征数；
w是与添加特征相关的参数向量，是1 × p的矩阵；
F是根据新特征X和参数w扰动语义匹配得分Y的增强模型。

3. COMPUTATIONAL STRATEGY 计算策略

The functions g and ∆ described in Section 2 are non-smooth since they depend on a ranking which in turn depends on a sort operation. Therefore, traditional optimization approaches which leverage the gradient of a function are not directly applicable.
In very small parameter spaces (one or two parameters), grid (exhaustive) search is an acceptable computational strategy that is very simple to implement. For larger parameter spaces, we can devise smoothed approximations to g and ∆ that are amenable to traditional gradient-based methods and therefore able to handle parameter spaces where grid search would be infeasible. In this section we discuss using such approximations in our problem formulation and in Section 4.2 we discuss the computational strategy followed in the TalentMatch case study.

第2节中描述的函数g和∆是非平滑的，因为它们依赖于排序，而排序依赖sort操作。因此，传统的利用函数梯度的优化方法并不直接适用。

在非常小的参数空间（一个或两个参数）中，网格（穷举）搜索是一种可接受的且非常易于实现的计算策略。对于更大的参数空间，我们可以设计对g和∆的平滑逼近，这是对传统的基于梯度的方法的改进，因此能够处理网格搜索不可行的参数空间。在这一节中，我们讨论如何在我们的问题公式中使用这种近似，在第4.2节中，我们讨论在TalentMatch中遵循的计算策略。

Recent research on “learning to rank” for information retrieval addresses the need to optimize non-smooth rank-based metrics. There are two approaches that are particularly interesting in this direction: SoftRank [9] and SmoothRank [3]. These approaches develop smooth approximations to IR metrics such as the Normalized Discounted Cumulative Gain (NDCG) and the Average Precision (AP). We can formulate our g and ∆ functions so that they have a similar form to those IR metrics and then we can employ the techniques described in [9, 3] for optimizing them.
For example, we can consider the original semantic relevance score from the TalentMatch model to be the ground-truth measure of relevance of each candidate member given a job. In an IR setting, we would have queries and documents, where documents have a measure of relevance to a particular query. In the TalentMatch model, a job is equivalent to a query and a candidate member is equivalent to a document.

最近关于信息检索“学习排序”的研究指出优化非平滑排序度量的必要。在这个方向上有两种方法特别有趣：SoftRank[9]和SmoothRank[3]。这些方法发展了对IR指标的平滑近似，如归一化折损累计增益（NDCG）和平均精度（AP）。我们可以定义g和∆函数，使它们具有与IR度量相似的形式，然后我们可以使用[9，3]中描述的技术来优化它们。

例如，我们可以将TalentMatch模型中的原始语义相关性分数作为每个候选成员和职位相关性的基本真实度量。在IR设置中，我们有查询和文档，其中文档与特定查询有一定的相关性。在TalentMatch模型中，职位等同于查询，候选成员等同于文档。

One possible instantiation of g and ∆ would be as follows: assume we do not wish to distinguish between active and passive candidates; we would then have a binary notion of relevance for a given candidate, {job-seeker = 1, non-job-seeker = 0}, that we want to maximize in the top-K results. This is a good match for the AP measure. We then need a constraint function which penalizes how much deviation there is from the original relevance-based ranking. It turns out that an adapted form of the NDCG measure would be appropriate here.
g和∆的一个可能的实例如下：假设我们不想区分主动和被动的候选人；对一个候选人，我们可以定义一个二元表示，{job seeker=1，non job-seeker=0}，我们希望在top-K结果中最大化这个二元表示。我们将这个二元表示作为AP度量。然后我们需要一个约束函数来惩罚与原始的基于相关性的排序有多大的偏差。结果表明，采用基于NDCG度量的某种变换形式比较合适。

This leaves us with the following smooth approximation to our objective function, using the approximation in equation 9 from [3]:

where:

$(A)AP_j$ is the Approximate Average Precision for a given job j;
f is the enhanced model which perturbs the semantic match score, originally defined in Equation 1;
σ is the smoothing parameter as described in [3];
n is the number of recommendations per job j;
$l_i$ is the label of the ith recommended candidate, either job-seeker = 1 or non-job-seeker = 0;
$l_{d(k)}$ is the label of the recommended candidate at rank k, either job-seeker = 1 or non-job-seeker = 0;
$\widehat{r_i}$ is the smooth rank, as defined in equation 5 in [3].

这样，我们就可以使用[3]中方程9的近似，对目标函数进行以下平滑近似：

其中：

$(A)AP_j$ 是给定职位j的近似平均精度；
f 是等式1中定义的扰动语义匹配分数的增强模型；
σ 是[3]中所述的平滑参数；
n 是每个职位j的推荐个数；
$l_i$ 是第i个推荐候选人的标签，求职者 = 1，非求职者 = 0；
$l_{d(k)}$ 是排序为k的推荐候选人的标签，求职者 = 1，非求职者 = 0；
$\widehat{r_i}$ 是平滑排序，如[3]中等式5所定义。

And the following smooth approximation to our constraint function, using the approximation from equation 8 in [3]:

$\Delta _j(f) \approx (A)NDCG_j(f,\sigma ) = \sum_{i,k=1}^{n} S_i D(k) h_{ik}$ (5)
Where:

$(A)NDCG_j$ is the Approximate Normalized Discounted Cumulative Gain for a given job j;
$S_i$ is the semantic score for the ith candidate, obtained from using the TalentMatch model;
n is the number of recommendations per job j;
D(k) is the discounting associated with ranking k, which could be defined as $1/\log_2(1+k)$;
$h_{ik}$ is defined in equation 7 in [3] to be a soft version of an indicator variable that indicates the probability that the $i$th recommendation is ranked at the $k$th position.

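f, the enhanced model that perturbs the semantic match score, originally defined in Equation 1, enters Equation 5 through $h_{ik}$, which in [3] takes the form:

$h_{ik} = \frac{\exp\left(-(f_i - f_{d(k)})^2 / \sigma\right)}{\sum_{j=1}^{n} \exp\left(-(f_j - f_{d(k)})^2 / \sigma\right)}$ (6)

Where $f_i$ is the enhanced score that f assigns to the ith recommendation and d(k) is the index of the recommendation which was ranked at position k by f.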

下面是约束函数的平滑近似，使用了[3]中等式8的近似：

$\Delta _j(f) \approx (A)NDCG_j(f,\sigma ) = \sum_{i,k=1}^{n} S_i D(k) h_{ik}$ （5）

其中：

$(A)NDCG_j$ 是给定职位j的近似归一化折损累计增益；
$S_i$ 是第i个候选人的语义得分，得分由TalentMatch模型给出；
n是每个职位j的推荐个数；
D(k)是排序k的折扣，可定义为 $1/\log_2(1+k)$；
$h_{ik}$ 在[3]等式7中定义，是指示变量的软化形式，表示第i个推荐排在第k个位置的概率。

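f是扰动语义匹配分数的增强模型，最初定义在等式1中，通过 $h_{ik}$ 引入等式5；在[3]中 $h_{ik}$ 的形式为：

$h_{ik} = \frac{\exp\left(-(f_i - f_{d(k)})^2 / \sigma\right)}{\sum_{j=1}^{n} \exp\left(-(f_j - f_{d(k)})^2 / \sigma\right)}$ (6)

式中，$f_i$ 是f赋予第i个推荐的增强得分，d(k)是在位置k处按f排名的推荐index。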
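To make Equation 5 concrete, here is a minimal Python sketch of the smoothed constraint for a single job. It assumes the soft indicator has the Gaussian-style form used by SmoothRank [3] (a normalized weight `exp(-(difference)^2 / sigma)` around the score of the item at rank k) and the discount D(k) = 1/log2(1 + k). The function names `soft_indicators` and `smoothed_dcg` are hypothetical, and, like Equation 5 as written, the sum is not normalized by an ideal DCG.

```python
import math


def soft_indicators(scores, sigma):
    """h[i][k]: soft probability that item i is ranked at position k (0-based)."""
    n = len(scores)
    # d[k] is the index of the item ranked at position k by the given scores.
    d = sorted(range(n), key=lambda i: -scores[i])
    h = [[0.0] * n for _ in range(n)]
    for k in range(n):
        ref = scores[d[k]]
        weights = [math.exp(-((s - ref) ** 2) / sigma) for s in scores]
        total = sum(weights)  # at least 1.0, since the item at rank k has weight 1
        for i in range(n):
            h[i][k] = weights[i] / total
    return h


def smoothed_dcg(semantic, enhanced, sigma):
    """Sum over i, k of S_i * D(k) * h_ik, with ranks induced by enhanced scores."""
    h = soft_indicators(enhanced, sigma)
    n = len(semantic)
    # Position k is 0-based here, so the 1-based rank is k + 1 and
    # D(rank) = 1 / log2(1 + rank) = 1 / log2(k + 2).
    return sum(semantic[i] * h[i][k] / math.log2(k + 2)
               for i in range(n) for k in range(n))


semantic = [0.90, 0.88, 0.70]   # hypothetical semantic scores S_i
enhanced = [0.90, 0.95, 0.70]   # hypothetical enhanced scores from f
value = smoothed_dcg(semantic, enhanced, sigma=1e-6)
```

As sigma shrinks, each column of `h` approaches a hard one-hot indicator and the value approaches the ordinary discounted gain of the enhanced ranking; larger sigma gives a smoother function of the enhanced scores at the cost of a looser approximation.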

There are many other ways to formulate our approach using these smoothed approximations. For example, if instead of the job-seeking categories (active, passive, and non-job-seeker) we wished to use the job-seeking intent score, we could formulate g using (A)NDCG instead of (A)AP. Additionally, if the functional form of f in equation 1 is such that a parameter vector w of 0 in the enhanced model yields the equivalent of the semantic model, a Euclidean norm constraint on w could be used instead of (A)NDCG for the ∆ function.

有很多其他的方法来使用这些平滑的近似来表示我们提出的方法。例如，如果我们希望使用求职意向得分而不是求职类别（主动、被动和非求职者），我们可以使用（A）NDCG而不是（A）AP来表示g。此外，如果方程1中f的函数形式使得增强模型中的参数向量w为0产生与语义模型等效的结果，则可以使用对w的欧几里德范数约束，而不是（a）NDCG作为∆函数。

4. TALENT MATCH CASE STUDY 人才匹配案例研究

We illustrate our approach with the TalentMatch system, where given a job posted on the site, we generate a ranked list of candidates with regards to how well the candidates match the job. This semantic model outputs the probability that the candidate is a good match to the job. We want to enhance this model with the job-seeking intent of the candidate so that the candidates being recommended are both good matches for the job, as well as open to new job opportunities. Our hypothesis is that this will contribute to increased engagement between the job poster and the recommended candidates.

我们用TalentMatch系统来说明我们的方法，在该系统中，给定一个发布在网站上的职位，我们就候选人与职位匹配的程度生成一个候选人的排序列表。该语义模型输出候选对象与职位匹配的概率。我们希望根据求职者的求职意向来改进这一模式，使被推荐的求职者既能匹配职位，又有换工作的意愿。我们的假设是，这将有助于增加招聘者和推荐候选人之间的接触。

There are many ways to incorporate the job-seeking intent signal into the TalentMatch model. As discussed in section 1, the job-seeking intent model outputs, for each member, a job-seeking propensity score and a probabilistic assignment to each of the job-seeking intent categories: active, passive, and non-job-seeker. Our objective is to increase the average number of active and passive candidates in the top-K recommendations. We want to achieve this objective by slightly perturbing the semantic ranking: if there is a candidate Cx in rank 1 with a semantic score of 0.9 who has a low job-seeking intent (classified as a non-job-seeker), and another candidate, Cy, in rank 2 with a semantic score of 0.88 who happens to have a high job-seeking intent (classified as active or passive), then we would like to bump Cy up to rank 1 and bump Cx down to rank 2. We do not necessarily want to eliminate Cx from the final ranking, nor do we want to bump Cx excessively far down the ranking. More importantly, we want a systematic way to perform this re-ranking perturbation.

有很多方法可以将求职意向信号纳入TalentMatch模型。如第1节所述，求职意向模型为每个成员输出一个求职倾向得分和主动求职、被动求职、无求职意愿的概率。我们的目标是在top-K推荐中增加主动和被动求职候选人的平均数量。我们希望通过稍微扰动语义排序来实现这一目标，如果排序为1的候选人为Cx，其语义得分为0.9，但是求职意向较低（分类为非求职者），而排序为2的候选人为Cy，其语义得分为0.88，但有较高的求职意向（分类为主动或被动），我们希望将Cy提升到第一位，将Cx下降到第二位。我们不一定想在最终排名中淘汰Cx，也不想过分地把Cx从排序中挤下来。更重要的是，我们需要一种系统的方法来执行这种重新排序的扰动。

A simple strategy would be to remove from the ranked list based on semantic matching scores all those recommended candidates with a job-seeking intent score below a certain threshold t, backfilling if needed to make sure we have K recommendations (we discuss below how this specific heuristic is a special case of our suggested approach). This approach still requires us to estimate the threshold t, but more crucially, it also incurs the risk of completely eliminating high-quality matches from the final ranking, an outcome we do not want.

一个简单的策略是根据语义匹配分数从排名列表中删除所有那些求职意向分数低于某个阈值t的推荐候选人，如果需要，进行填充以确保我们有K个推荐（下面我们将讨论这种特定的启发式方法是我们推荐方法的一个特例）。这种方法仍然需要我们估计阈值t，但更关键的是，它也会带来从最终排名中完全淘汰高质量匹配度的风险，这是我们不希望看到的结果。

4.1 TalentMatch Problem Formulation TalentMatch问题公式

In order to come up with a strategy for re-ranking that satisfies our requirements, we frame our problem using the template described in section 2. The average number of active and passive candidates in the top-K recommendations is actually an instance of a familiar metric: mean precision at K, where our binary relevance measure is an indicator function that returns 1 if the member is active or passive and 0 otherwise, $l_i$ ∈ {0, 1}. For a given job posted to the site, precision at K is:

$Prec@K = \frac{1}{K}\sum_{i=1}^{n} l_i 1\left \{ r(i) \leq K \right \}$ (7)

Where 1{A} is the indicator function applied to A, with 1{A} = 1 if A is true and 0 otherwise, r(i) is the rank of the $i$th candidate, and n is the number of candidates in the result set. Our objective to be maximized, the mean precision at K, which maps to the g function in equations 1 and 2, is:

$g = MeanPrec@K = \frac{1}{m}\sum_{i=1}^{m} Prec@K(i)$ (8)

为了提出一个重新排序的策略来满足我们的需求，我们使用第2节中描述的模板来构建我们的问题。top-K推荐中的主动和被动候选的平均数实际上是一个常见度量的实例：K的平均精度，其中我们的二元相关性度量是一个指标函数，如果成员是主动或被动的，则返回1，否则返回0，li∈{0，1}。对于发布到站点的给定职位，K的精度为：

$Prec@K = \frac{1}{K}\sum_{i=1}^{n} l_i 1\left \{ r(i) \leq K \right \}$ (7)

其中1{A}是A的指示符函数，如果A为true，则1{A}=1，否则为0。r(i)是第i个候选的排名，n是结果集中的候选数。我们的目标是最大化，K处的平均精度，它映射到等式1和2中的g函数是：

$g = MeanPrec@K = \frac{1}{m}\sum_{i=1}^{m} Prec@K(i)$ (8)
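Equations 7 and 8 translate directly into code. The sketch below assumes each job is represented as a list of binary labels already sorted by rank (1 for an active or passive candidate, 0 otherwise); the function names and the sample data are illustrative only.

```python
def precision_at_k(labels_in_rank_order, k):
    """Equation 7: fraction of the top-k ranked candidates labeled 1."""
    # Slicing keeps exactly the candidates with rank <= k. The sum is
    # divided by k even if fewer than k candidates exist, as in Equation 7.
    return sum(labels_in_rank_order[:k]) / k


def mean_precision_at_k(jobs, k):
    """Equation 8: precision at k averaged over all m jobs."""
    return sum(precision_at_k(labels, k) for labels in jobs) / len(jobs)


# Two hypothetical jobs, each with four ranked candidates.
jobs = [[1, 0, 1, 0],
        [0, 0, 1, 1]]
g = mean_precision_at_k(jobs, k=2)
```

Multiplying this value by K gives the average number of active and passive candidates in the top-K, the quantity the text sets out to increase.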

The functional form of f, the enhanced model in equations 1 and 2, can also be specified in a variety of ways. One possible option is to use a linear combination of the TalentMatch semantic and job-seeking intent scores. This would not be ideal: we want candidates who are both good matches and likely job seekers; therefore, a multiplicative feature interaction is what we seek. We settled on the following formulation:

$f(y, x, w = [\alpha , \beta ]) = y \times \alpha^{1\{x = \text{active}\}} \times \beta^{1\{x = \text{passive}\}}$ (9)

This is equivalent to applying a small boost to the semantic match score (y), and allowing for the boost to be different for active (α) and passive (β) candidates. Solving the optimization problem defined in equations 1 and 2 with the specific functional forms defined here will yield appropriate values for α and β.

f的函数形式，即方程1和方程2中的增强模型，也可以用多种方式指定。一种可能的选择是使用TalentMatch语义和求职意向分数的线性组合。这并不理想：我们想要两者都是，好的匹配，而且很可能是求职者；因此，我们寻求的是一个乘法特征交互。我们决定采用以下公式：

$f(y, x, w = [\alpha , \beta ]) = y \times \alpha^{1\{x = \text{active}\}} \times \beta^{1\{x = \text{passive}\}}$ (9)

这相当于对语义匹配分数（y）应用一个小的提升（boost），并允许主动（α）和被动（β）候选的提升是不同的。用这里定义的特定函数形式求解方程1和2中定义的优化问题，将得到α和β的适当值。
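The multiplicative boost in Equation 9 and its effect on the Cx/Cy example from the start of Section 4 can be illustrated with a short Python sketch. The tuple layout `(name, semantic_score, intent)`, the function names, and the value 1.07 used for the boost are assumptions made for the illustration.

```python
def enhanced_score(y, intent, alpha, beta):
    """Equation 9: boost y by alpha for actives and by beta for passives."""
    if intent == "active":
        return y * alpha
    if intent == "passive":
        return y * beta
    return y  # non-job-seekers keep their semantic score unchanged


def rerank(candidates, alpha, beta):
    """Sort (name, semantic_score, intent) tuples by enhanced score, best first."""
    return sorted(candidates,
                  key=lambda c: enhanced_score(c[1], c[2], alpha, beta),
                  reverse=True)


candidates = [("Cx", 0.90, "non-job-seeker"),
              ("Cy", 0.88, "active")]

original = [c[0] for c in rerank(candidates, alpha=1.0, beta=1.0)]
boosted = [c[0] for c in rerank(candidates, alpha=1.07, beta=1.07)]
```

With α = β = 1.0 the semantic ranking is reproduced (Cx first). With a boost of 1.07, Cy's enhanced score becomes 0.88 × 1.07 ≈ 0.94, which exceeds 0.90, so Cy moves to rank 1 while Cx stays in the list at rank 2.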

Given our chosen functional form for f, it can be seen that the simple heuristic suggested earlier is actually a special case of our approach, where α and/or β are set to large enough values so as to effectively rank all members with a job-seeking intent score above the threshold t over those members with a score below t. Section 5 discusses how this strategy is suboptimal (it causes an unacceptable loss in semantic relevance).

Finally, we need to specify how we will measure the deviation of the enhanced model distribution from the semantic model distribution, that is, the functional form for ∆ in equations 1 and 2. There are various histogram distance functions to choose from [2], examples of which include Euclidean distance and Kullback-Leibler divergence. We settled on using the Euclidean distance between the two histograms, or more specifically, the sum of squared errors of the histogram buckets, each histogram having b buckets:

$\Delta =\Delta _{SSE}(H_s, H_e) = \sum_{i=1}^{b}(H_s[i] - H_e[i])^2$ (10)

Where $H_s$ is the histogram of semantic match scores of the top-K candidates ranked by the semantic match score and $H_e$ is the histogram of semantic match scores of the top-K candidates ranked by the enhanced score.

考虑到我们为f选择的函数，可以看出前面建议的简单启发式方法实际上是我们方法中的一个特例，其中α和/或β被设置为足够大的值，以便有效地将所有求职意向得分高于阈值t的成员排在得分低于阈值t的成员之上。第5节讨论了这一点策略是次优的（它会导致不可接受的语义相关性损失）。

最后，我们需要具体说明如何测量增强模型分布与语义模型分布之间的偏差，即等式1和2中∆的函数形式。[2]中有各种各样的直方图距离函数可供选择，其中包括欧几里德距离和Kullback-Leibler散度。我们决定使用两个直方图之间的欧几里德距离，或者更具体地说，直方图桶的平方误差之和，每个直方图都有b个桶：

$\Delta =\Delta _{SSE}(H_s, H_e) = \sum_{i=1}^{b}(H_s[i] - H_e[i])^2$ (10)

其中 $H_s$ 是按语义匹配得分排序的前K个候选的语义匹配得分直方图，$H_e$ 是按增强得分排序的前K个候选的语义匹配得分直方图。
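The histogram distance in Equation 10 can be sketched as follows. The bucketing scheme is an assumption: the text does not say how the b buckets are laid out, so this example uses equal-width buckets over [0, 1], raw counts, and scores assumed to lie within that range. Function names and sample scores are hypothetical.

```python
def histogram(scores, buckets, low=0.0, high=1.0):
    """Count scores in equal-width buckets over [low, high] (assumed layout)."""
    counts = [0] * buckets
    width = (high - low) / buckets
    for s in scores:
        # A score equal to `high` falls into the last bucket.
        idx = min(int((s - low) / width), buckets - 1)
        counts[idx] += 1
    return counts


def histogram_sse(h_s, h_e):
    """Equation 10: sum of squared differences between matching buckets."""
    return sum((a - b) ** 2 for a, b in zip(h_s, h_e))


# Semantic scores of the top-3 under the semantic ranking, and semantic
# scores of the top-3 under a hypothetical enhanced ranking.
top_k_semantic = [0.9, 0.8, 0.6]
top_k_enhanced = [0.9, 0.6, 0.3]

h_s = histogram(top_k_semantic, buckets=4)
h_e = histogram(top_k_enhanced, buckets=4)
delta = histogram_sse(h_s, h_e)
```

Note that both histograms are built from semantic scores; only the set of candidates that makes it into the top-K differs between the two rankings.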

4.2 TalentMatch Computational Strategy TalentMatch计算策略

Since we only have two parameters: α and β, and given our intuition that the optimal parameters will probably lie in the interval [1.0, 2.0], a grid search turns out to be an acceptable computational strategy in this scenario. We break up the grid search into 2 runs: a coarse run (to see what region of the search space we should focus on) and a fine run (to zero in on the desired values). In each run we generate all the plans to be tested (a plan being an assignment of values to α and β) and evaluate our g and ∆ functions for each plan generated.

由于我们只有两个参数：α和β，并且根据我们的直觉，最优参数可能位于区间[1.0，2.0]，在这种情况下，网格搜索是一种可接受的计算策略。我们将网格搜索分为两个运行：粗略运行（以确定我们应该关注搜索空间的哪个区域）和精细运行（以关注期望得到的值）。在每次运行中，我们生成所有要测试的计划（计划是对α和β值的赋值），并对生成的每个计划的g和∆函数进行评估。
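A coarse-then-fine grid search of this kind might look like the Python sketch below. It is a simplified illustration, not the procedure used in the paper: the step sizes (0.1 and 0.01), the rule of refining around the best coarse plan, and the toy `evaluate` function standing in for the real g and ∆ computations are all assumptions.

```python
def grid(lo, hi, step):
    """Evenly spaced values from lo to hi inclusive, rounded to limit float drift."""
    n = round((hi - lo) / step)
    return [round(lo + i * step, 10) for i in range(n + 1)]


def grid_search(evaluate, alphas, betas, max_divergence):
    """Best (gain, divergence, alpha, beta) among plans within the divergence bound."""
    best = None
    for a in alphas:
        for b in betas:
            gain, divergence = evaluate(a, b)  # g and Delta for this plan
            if divergence <= max_divergence and (best is None or gain > best[0]):
                best = (gain, divergence, a, b)
    return best


def coarse_to_fine(evaluate, max_divergence, lo=1.0, hi=2.0, coarse=0.1, fine=0.01):
    """Coarse run over [lo, hi], then a fine run around the best coarse plan."""
    values = grid(lo, hi, coarse)
    best = grid_search(evaluate, values, values, max_divergence)
    if best is None:
        return None
    _, _, a, b = best
    fine_a = grid(max(lo, a - coarse), min(hi, a + coarse), fine)
    fine_b = grid(max(lo, b - coarse), min(hi, b + coarse), fine)
    return grid_search(evaluate, fine_a, fine_b, max_divergence)


def toy_evaluate(alpha, beta):
    """Stand-in for evaluating g and Delta on real data (purely illustrative)."""
    gain = -((alpha - 1.13) ** 2 + (beta - 1.17) ** 2)
    divergence = 100 * (alpha + beta - 2)
    return gain, divergence


result = coarse_to_fine(toy_evaluate, max_divergence=40)
```

Recording the `(divergence, gain)` pair for every plan, rather than only the best one, would also provide the points needed to plot the trade-off curve.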

5. EVALUATION AND RESULTS 评估和结果
For estimating the α and β parameters to be used in Equation 9, we created a sample dataset of jobs recently posted to the site and computed a maximum of 9000 recommendations for those jobs using the TalentMatch model. We filtered all recommendations with a threshold of 0.6 on the TalentMatch semantic score, and then removed all jobs which did not have at least 6 recommendations (we do not show results on the site unless there are at least 6 relevant matches and we include only the top-24 candidates in the recommendation set). This left us with a total of 760 jobs, each with anywhere from 6 recommendations to 9000 recommendations. We then generated the plans as per Section 4.2 and evaluated our g and ∆ functions.
Our g function is the mean precision at K, as defined in Equation 8, where K = 12 since that is how many snippets of candidate recommendations we show in a single page.
Also, for the measure of divergence, our ∆ function, we compared the distribution of the minimum score of the top-12 ranking, given that we want to ensure relevant recommendations in the worst case on the first results page.

为了估计方程式9中使用的α和β参数，我们创建了一个最近发布到站点的职位样本数据集，并使用TalentMatch模型计算了这些职位的最多9000条推荐。我们筛选了TalentMatch语义评分阈值为0.6的所有推荐，然后删除了所有没有至少6个推荐的职位（除非至少有6个相关匹配，否则我们不会在网站上显示结果，并且我们只在推荐集中包含前24名候选人）。这给我们留下了760个工作岗位，每个岗位都有6到9000条推荐。然后，我们根据第4.2节生成了计划，并评估了我们的g和∆函数。

我们的g函数是K的平均精度，如等式8所定义，其中K=12，因为这是我们在一个页面中显示的候选推荐片段的数量。

此外，对于差异的度量，我们的∆函数，我们比较了前12名的最低得分分布，因为我们希望在最坏的情况下，在第一个结果页上确保相关的建议。
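
A rough sketch of how the two measures might be computed per plan is shown below. Several things here are assumptions rather than the paper's definitions: Equation 8 is not reproduced in this section, so precision is taken to be the fraction of the top-K flagged as active/passive; the divergence is taken to be the sum of absolute per-bin differences between two histograms of the minimum top-12 score; and the bin range and count are arbitrary. All function names are hypothetical.

```python
def precision_at_k(ranked_is_seeker, k=12):
    """Fraction of the top-k flagged as active/passive (assumed form of Eq. 8).

    `ranked_is_seeker` is a list of 0/1 flags in ranked order. Jobs with fewer
    than k recommendations are still divided by k in this sketch.
    """
    return sum(ranked_is_seeker[:k]) / k


def mean_precision_at_k(jobs, k=12):
    """Average precision@k over all jobs (the g function in this sketch)."""
    return sum(precision_at_k(job, k) for job in jobs) / len(jobs)


def min_top_k_score(ranked_scores, k=12):
    """Worst semantic score shown on the first results page."""
    return min(ranked_scores[:k])


def histogram(values, lo=0.6, hi=1.0, bins=8):
    """Count values per equal-width bin; out-of-range values are clamped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    return counts


def histogram_divergence(baseline_mins, perturbed_mins):
    """Assumed delta: sum of absolute per-bin count differences."""
    h0 = histogram(baseline_mins)
    h1 = histogram(perturbed_mins)
    return sum(abs(a - b) for a, b in zip(h0, h1))
```

Here `baseline_mins` would hold one `min_top_k_score` value per job under the original ranking, and `perturbed_mins` the same values after re-ranking with a given plan.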

Figure 3 shows the result of the fine grid search run, which illustrates the risk-reward trade-off in our experiment: up to a histogram divergence of a little over 60, the penalty we pay is linear with respect to the increase in the average number of active and passive members in the top-12 result set. Table 1 shows a few of the points used in the plot, including the original plan (equivalent to setting α and β to 1.0), for which the average number of active and passive candidates in the top-12 result set is nearly 4. Figure 3 tells us that we can double that number if we are willing to pay a penalty of about 64 in the histogram divergence, and also tells us what to set α and β to: 1.15 (see Table 1). Figures 4(a)-4(d) give an idea of how good or bad a histogram divergence of 64 is. For reference, as per Table 1, setting α to 1.3 and β to 1.0 causes an unacceptable loss in relevance (the histogram divergence is too high and the gain in the objective does not justify it).

图3显示了精细网格搜索运行的结果，它说明了我们实验中的风险-回报权衡：直到直方图散度略大于60，我们支付的惩罚与前12个结果集中的主动和被动成员平均数量的增加成线性关系。表1显示了该图中使用的一些点，包括原始计划（相当于将α和β设置为1.0），这也表明原始计划前12个结果集中的主动和被动候选人的平均数接近4。图3告诉我们，如果我们愿意支付直方图散度约64的惩罚，我们可以将这个数字翻一番，还告诉我们应将α和β设置为1.15（见表1）。图4(a)-4(d)说明了64的直方图散度是好是坏。作为参考，如表1所示，将α设置为1.3、β设置为1.0会导致不可接受的相关性损失（直方图散度太高，目标中的增益不足以弥补这一损失）。

All of the plans on the Pareto front in Figure 3 have similar coefficients for α and β, which is tied to the fact that the goal we are trying to maximize is the combined number of active/passive candidates in the top-12; presumably the values of the weights would diverge more had we favored one category or the other. These results point to reasonable strategies that should be evaluated using A/B testing: a plan where α = β = 1.07 and a plan where α = β = 1.15. A/B testing turns out to be a crucial component of the methodology described in this paper. Our approach provides the tools for generating reasonable values for α and β: no matter what the desired risk-reward trade-off of a specific application, only plans on the Pareto frontier should be chosen. However, our choice of an acceptable histogram divergence will only be meaningful if, once in production, the rate at which job posters purchase candidate set recommendations and the rate at which job posters contact the purchased recommended candidates do not decrease substantially.

图3中帕累托边界的所有计划对于α和β都有相似的系数，这与我们试图最大化的目标是前12名中的主动/被动候选组合的数量有关，如果我们倾向于一个类别或另一个类别，权重的值可能会出现更大的差异。这些结果指出了应使用A/B测试评估的合理策略：α=β=1.07的计划和α=β=1.15的计划。A/B测试是本文所述方法的关键组成部分。我们的方法提供了为α和β生成合理值的工具：无论特定应用的期望风险回报权衡是什么样的，只应选择帕累托边界的计划。然而，我们对于什么是可接受的直方图差异的选择只有在以下情况下才有意义：一旦应用到线上，招聘者购买候选人集推荐的比率和招聘者与购买的推荐候选人联系的比率没有大幅度下降。
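
Selecting the plans on the Pareto frontier can be sketched as below. This is only an illustration: `pareto_front` is a hypothetical helper, each plan is assumed to be a tuple (α, β, gain, divergence), and the sample values are made up for the example; they are not the entries of Table 1.

```python
def pareto_front(plans):
    """Return the plans that no other plan dominates.

    Each plan is (alpha, beta, gain, divergence). Plan q dominates plan p when
    q has gain >= p's and divergence <= p's, with at least one strict
    inequality. The result is sorted by increasing divergence.
    """
    front = []
    for p in plans:
        dominated = any(
            q[2] >= p[2] and q[3] <= p[3] and (q[2] > p[2] or q[3] < p[3])
            for q in plans
        )
        if not dominated:
            front.append(p)
    return sorted(front, key=lambda p: p[3])


# Made-up plans for illustration only (alpha, beta, gain, divergence).
toy_plans = [
    (1.0, 1.0, 4.0, 0.0),
    (1.1, 1.1, 6.0, 30.0),
    (1.2, 1.0, 5.0, 50.0),   # dominated: less gain at a higher divergence
    (1.2, 1.2, 8.0, 60.0),
]
front = pareto_front(toy_plans)
```

The quadratic scan is adequate for a grid of a few hundred plans; which point on the front to deploy remains a product decision that the A/B test has to validate.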

As mentioned in Section 1, we would like to increase the likelihood of relevant engagement for TalentMatch. If the job-seeking intent is to be of any use to us, we would expect the rate of replies to InMails (LinkedIn e-mails) from job posters about job opportunities to be higher for members with high job-seeking intent. We determined that members classified as having a high job-seeking intent (actives/passives) are 16× more likely to reply to an InMail regarding a career opportunity, with a 95% confidence interval of 15-17× (intervals computed by the method of E. C. Fieller [4]). These numbers are based on InMail activity that took place over a period of 10 days, during which the number of non-job-seekers contacted was nearly the same as the number of actives/passives contacted.

如第1节所述，我们希望增加招聘者/求职者与TalentMatch的互动。如果求职意向对我们有用，我们预计求职意向高的会员会对招聘者发布职位的InMail有更高的回复率。被归类为具有高度求职意向（主动/被动）的成员，在职业机会方面，向InMail回复的可能性要高出16倍，95%的置信区间为15-17倍（区间由E.C.Fieller[4]方法计算）。这些数字是根据在10天内的InMail行为得出的，在此期间，与非求职者接触的人数几乎与主动/被动接触的人数相同。
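
For readers unfamiliar with Fieller's method, the sketch below computes a confidence interval for the ratio of two independent proportions. It is an illustration under stated assumptions: the function name is hypothetical, the two samples are treated as independent with binomial variances, a normal quantile of 1.96 is used for the 95% level, and the sample sizes in the usage lines are invented because the section does not report them. The result therefore need not reproduce the 15-17× interval quoted above.

```python
import math


def fieller_ratio_ci(p_num, n_num, p_den, n_den, z=1.96):
    """Fieller confidence interval for p_num / p_den (independent samples)."""
    v_num = p_num * (1.0 - p_num) / n_num   # variance of the numerator estimate
    v_den = p_den * (1.0 - p_den) / n_den   # variance of the denominator estimate
    ratio = p_num / p_den
    g = (z ** 2) * v_den / (p_den ** 2)
    if g >= 1.0:
        # The denominator is not clearly separated from zero, so the
        # interval is unbounded and this closed form does not apply.
        raise ValueError("unbounded Fieller interval")
    half = (z / p_den) * math.sqrt(v_num + ratio ** 2 * v_den - g * v_num)
    return (ratio - half) / (1.0 - g), (ratio + half) / (1.0 - g)


# Reply probabilities come from the text; the sample sizes are hypothetical.
low, high = fieller_ratio_ci(0.45, 10_000, 0.028, 10_000)
```

Unlike a simple delta-method interval, the Fieller interval is generally not symmetric around the point estimate 0.45 / 0.028 ≈ 16, which matters when the denominator rate is small.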

Assuming that all members in the top-12 ranking are contacted about the job opportunity, our results suggest that we can double the desired relevant engagement. Given that the probability of a positive reply from non-job-seekers is $p_n(reply) = 0.028$ and the probability of a reply from actives/passives is $p_{a/p}(reply) = 0.45$ (as computed using the 10-day sample), and given that our analysis shows that we can double the average number of actives/passives in the top-12 from 4 to 8 at an acceptable relevance loss, we expect the relevant engagement to double: 0.028 × 8 + 0.45 × 4 ≈ 2 versus 0.028 × 4 + 0.45 × 8 ≈ 4.

假设就工作机会联系排名前12位的所有候选人，则期望的相关参与度会翻一番。假设非求职者的正面回复概率为 $p_n(reply) = 0.028$，主动/被动求职者的回复概率为 $p_{a/p}(reply) = 0.45$ （使用10天样本计算），由于在可接受的相关性损失下，我们可以将前12名中主动/被动求职者的平均数量从4个增加到8个，我们预计相关参与度会增加一倍：0.028×8+0.45×4≈2与0.028×4+0.45×8≈4。
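
The expected-engagement arithmetic can be checked with a few lines of code. The two reply probabilities and the top-12 page size come from the text; the function and constant names are hypothetical, and the calculation keeps the text's assumption that every candidate in the top-12 is contacted.

```python
P_REPLY_NON_SEEKER = 0.028   # p_n(reply), from the 10-day sample
P_REPLY_SEEKER = 0.45        # p_{a/p}(reply), from the 10-day sample


def expected_replies(n_seekers, k=12):
    """Expected number of replies when all k candidates are contacted."""
    return P_REPLY_SEEKER * n_seekers + P_REPLY_NON_SEEKER * (k - n_seekers)


baseline = expected_replies(4)   # 0.45 * 4 + 0.028 * 8 = 2.024
treated = expected_replies(8)    # 0.45 * 8 + 0.028 * 4 = 3.712
lift = treated / baseline        # roughly 1.83
```

Without rounding, the expected lift is about 1.83×, so "double" is an approximation that comes from rounding 2.024 and 3.712 to 2 and 4.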

We deployed the A/B test experiment and let it run for a couple of weeks before we started collecting data for analysis (until the “novelty effect” often caused by a new feature being deployed live had subsided). To measure the change in InMail reply rate, we looked at all emails created in a period of three weeks and observed how many were replied to. Table 2 shows the increase in response rate for each A/B test treatment bucket along with their confidence intervals. The actual increase follows the linear relationship expected from the Pareto frontier, and though there is high variance, the 95% confidence intervals contain the expected values.

我们进行了A/B实验，在开始收集数据进行分析之前让它运行几个星期（直到经常由实时部署的新功能引起的“新奇效应”消退）。为了衡量InMail回复率的变化，我们查看了三周内创建的所有电子邮件，并观察了有多少邮件被回复。表2显示了每个A/B试验处理桶的回复率及其置信区间。实际增长遵循帕累托前沿的预期线性关系，尽管存在高方差，但95%置信区间包含预期值。

We now turn our attention to the effect of the re-ranking perturbation on the booking rate. More specifically, we want to quantify the effect of the histogram divergence. For this, during the same time period, we looked at the booking rate in each of the A/B test buckets. Table 3 summarizes our findings. It shows a slight degradation in booking rate for the most extreme treatment bucket. However, the rate at which job posters email candidates in the purchased set (InMails per booking) tells a different story.

我们现在将注意力转移到重新排序扰动对预订率的影响上。更具体地说，我们想量化直方图散度的影响。为此，在同一时间段内，我们查看了每个A/B测试桶中的预订率。我们的发现如表3。从表3可看出，最极端处理桶的预订率略有下降。然而，招聘者给应聘者发邮件的比率（每个预定的InMails）却不是这种情况。

Table 4 shows that, on average, job posters choose to email candidates more often in the treatment group than in the control group. This is evidence that job posters do not email all candidates in the purchased set, but rather pick and choose whom they will contact. This fact explains why the actual average increase in InMail response rate was not as high as expected: we had assumed that job posters contact all of the candidates in the result set, but this turns out not to be the case. The control group shows that job posters contact an average of only about 2 members from the purchased candidate result set. More importantly, since job posters do not have access to the job-seeking intent of each candidate (nor to most of the information we use to determine job-seeking intent, such as job searches), there must be something else in the candidate’s profile that compels job posters to email candidates we have identified as job-seekers more often than those we have not. Perhaps job-seeking candidates have more complete or more curated profiles. Nevertheless, this means that the snippet we show job posters, on which they base the decision of whether or not to purchase, is not very representative of the value of the candidate result set. We plan to explore this finding further, as it suggests that, given the right snippet (one that better conveys the value of the candidate result set to the job poster), the booking rate for the treatment groups should be higher than for the control group.

表4显示，平均来说，与对照组相比，实验组的招聘者选择给应聘者发送电子邮件的频率更高。这就证明了招聘者并不是给购买的所有候选人发送电子邮件，而是只给他们想联系的候选人发送邮件。这一事实解释了为什么实际平均增加的InMails回复率没有预期的那么高：我们假设招聘者会联系结果集中的所有候选人，但事实并非如此。对照组显示，招聘者平均只与购买的候选结果集中的大约2名成员联系。更重要的是，由于招聘者无法获取每位应聘者的求职意向（也无法获取我们用来确定求职意向的大部分信息，如求职信息），在求职者的个人资料中，肯定还有其他一些东西使招聘者向我们确定为求职者的候选人发送电子邮件，而不是那些非求职者。也许求职者有更完整或更详细的个人资料。尽管如此，这意味着我们给招聘者展示的片段并不能很好地代表候选人结果集的价值，而这些片段是他们决定是否购买的依据。这一发现是我们计划进一步探索的，因为它表明，在给定正确的片段（一个更好地将候选结果集的价值传达给招聘者的片段）的情况下，实验组的预约率应该高于对照组。
