Pareto-Efficient Hybridization for Multi-Objective Recommender Systems

ABSTRACT

Performing accurate suggestions is an objective of paramount importance for effective recommender systems. Other important and increasingly evident objectives are novelty and diversity, which are achieved by recommender systems that are able to suggest diversified items not easily discovered by the users. Different recommendation algorithms have particular strengths and weaknesses when it comes to each of these objectives, motivating the construction of hybrid approaches. However, most of these approaches only focus on optimizing accuracy, with no regard for novelty and diversity. The problem of combining recommendation algorithms grows significantly harder when multiple objectives are considered simultaneously. For instance, devising multi-objective recommender systems that suggest items that are simultaneously accurate, novel and diversified may lead to a conflicting-objective problem, where the attempt to improve an objective further may result in worsening other competing objectives. In this paper we propose a hybrid recommendation approach that combines existing algorithms which differ in their level of accuracy, novelty and diversity. We employ an evolutionary search for hybrids following the Strength Pareto approach, which isolates hybrids that are not dominated by others (i.e., the so called Pareto frontier). Experimental results on two recommendation scenarios show that: (i) we can combine recommendation algorithms in order to improve an objective without significantly hurting other objectives, and (ii) we allow for adjusting the compromise between accuracy, diversity and novelty, so that the recommendation emphasis can be adjusted dynamically according to the needs of different users.

1. INTRODUCTION

Recommender systems are increasingly emerging as enabling mechanisms devoted to overcoming problems that are inherent to information overload, providing intelligent information access and delivery, and thus potentially improving browsing and consumption experience. Historically, the typical goal of a recommender system has been to maximize accuracy in predicting and matching user information needs, often by considering individual delivered items in isolation [12]. More recently, however, a consensus has formed that the success of a recommender system also depends on other dimensions of information utility, notably the diversity and novelty of the suggestions performed by the system [9, 19, 25, 33]. More specifically, obvious and monotonous recommendations are generally of little use even when they are accurate, since they do not expose users to unprecedented experiences.

Increasing novelty and diversity by completely giving up on accuracy is straightforward but meaningless, since the system will no longer meet the users' needs. In fact, there is an apparent trade-off between these dimensions, which becomes evident when inspecting the performance of existing top-N recommendation algorithms. An easy conclusion is that different algorithms may perform distinctly depending on the dimension of interest (i.e., the best performer in terms of accuracy is not the best one in terms of novelty and diversity), and thus it is hard to point to a best performer if all the dimensions are considered simultaneously. A conclusion that is harder to reach is whether these algorithms are indeed complementary, so that the strengths of one algorithm may compensate for the weaknesses of others. The potential synergy between different recommendation algorithms is of great importance to multi-objective recommender systems, since they must achieve a proper level of each dimension (i.e., objective).

In this paper we hypothesize that it is possible to properly aggregate different recommendation algorithms, so that the resulting hybrid balances the level of accuracy, diversity and novelty in its suggestions. In this case, each potential hybrid is given as a weighted combination of well-established recommendation algorithms (e.g., simple algorithms as well as representatives of the state of the art). Our proposed hybridization approach consists in finding appropriate weights for the constituent algorithms. By considering each dimension (i.e., accuracy, novelty and diversity) as a separate objective, we reduce the hybridization task to a multi-objective optimization problem, in which we search for the optimal combination of weights that maximizes accuracy, diversity and novelty.

Since the considered objectives are potentially conflicting, we employ an evolutionary search for optimal hybrids. Evolutionary algorithms denote a class of optimization methods that are characterized by a set of candidate solutions (aka individuals) called a population, which is maintained during the entire optimization process. The population of individuals evolves towards better (and potentially optimal) solutions by employing genetic operators, such as reproduction, mutation and crossover. In our context, each individual represents a possible combination of weights (i.e., a possible hybrid). Optimal hybrids lie in the so-called Pareto frontier [37], and are optimal in the sense that no hybrid in the frontier can be improved upon without hurting at least one of its objectives. Therefore, the evolutionary algorithm evolves the population towards producing hybrids that are located closer to the Pareto frontier, and then a linear search returns the most dominant hybrid [37], which is likely to balance accuracy, novelty and diversity. Alternatively, hybrids in the Pareto frontier can be selected according to a certain need, allowing the recommender system to adjust the compromise between accuracy, novelty and diversity, so that the recommendation emphasis can be adapted dynamically according to the needs of each user (i.e., new users may benefit more from more accurate suggestions, whereas older users may require more novel and diversified suggestions).

We conducted a systematic evaluation involving different recommendation scenarios, with explicit user feedback (i.e., movies from the MovieLens dataset), as well as implicit user feedback (i.e., artists from the LastFM dataset). The experiments showed that it is possible to (i) combine different algorithms in order to produce better recommendations and (ii) control the desired balance between accuracy, novelty and diversity. In order to evaluate the baseline algorithms and our hybrids, we used the methodology for top-N evaluation proposed in [12] and measured novelty and diversity using the framework proposed in [33].

2. PRELIMINARIES

In this section we review the main concepts about evolutionary algorithms and multi-objective optimization. Finally, we discuss related work on hybrid and multi-objective recommender systems.

2.1 Evolutionary Algorithms

Evolutionary algorithms are meta-heuristic optimization techniques that use processes such as inheritance and evolution as key components in the design and implementation of computer-based problem solving systems [15, 20]. In evolutionary algorithms, a solution to a problem is represented as an individual in a population pool. The individuals may be represented as different data structures, such as vectors, trees, or stacks [26]. If the individual is represented as a vector, for example, each position in the vector is called a gene.

Typically, evolutionary algorithms employ a training and a validation set, as described in Algorithm 1. Initially, the population starts with individuals created randomly (line 6). The evolutionary process is composed of a sequence of solution generations. The process evolves generation by generation through genetic operations (lines 7-12). The goal of this process is to obtain better solutions after some generations. A fitness function is used to assign a fitness value to each individual (line 9), which represents its performance on the training set or on a cross-validation set. To produce a new generation, genetic operators are applied to individuals with the aim of creating more diverse and better individuals (line 12). Typical operators include reproduction, mutation, and crossover.
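
The generational loop described above can be sketched as follows. This is only an illustration and not Algorithm 1 itself: the function name `evolve`, the toy fitness function, truncation selection, one-point crossover and Gaussian mutation are all assumptions chosen for brevity.

```python
import random

def evolve(fitness, n_genes, pop_size=20, generations=30, seed=0):
    """Toy generational loop: random init, evaluate, select, recombine, mutate."""
    rng = random.Random(seed)
    # Initial population: individuals (gene vectors) created at random.
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank individuals by fitness and keep the better half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            g = rng.randrange(n_genes)
            # Mutation: perturb one gene and clip it back to [0, 1].
            child[g] = min(1.0, max(0.0, child[g] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children  # reproduction: parents survive unchanged
    return max(pop, key=fitness)

# Hypothetical fitness: prefer genes close to 0.5.
best = evolve(lambda ind: -sum((g - 0.5) ** 2 for g in ind), n_genes=3)
```

Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next in this sketch.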

2.2 Multi-Objective Optimization

Since we are interested in maximizing three different objectives for the sake of recommender systems (i.e., accuracy, novelty, and diversity), we use a multi-objective evolutionary algorithm. In multi-objective optimization problems there is a set of solutions that are superior to the remainder when all the objectives are considered together. In general, traditional approaches to multi-objective optimization problems are very limited because they become too expensive as the size of the problem grows [8]. Multi-objective evolutionary algorithms are a suitable option to overcome such an issue. Typically, multi-objective evolutionary algorithms are classified as Pareto or non-Pareto [37]. In the non-Pareto case, the objectives are combined into a single evaluation value that is used as the fitness value (e.g., the average of the objectives). In Pareto algorithms, on the other hand, a vector of objective values is used (i.e., the individual is given as an objective vector). The evaluation of Pareto approaches follows the Pareto dominance concept. An individual dominates another if it performs at least as well in all of the objectives considered and strictly better in at least one of them. Given two arbitrary individuals, the result of the dominance operation has two possibilities: (i) one individual dominates the other, or (ii) the two individuals do not dominate each other. An individual is denoted as non-dominated if it is not dominated by any other individual in the population, and the set of all non-dominated individuals composes the Pareto frontier.
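
To make the dominance relation concrete, the sketch below uses the standard definition for maximization (no worse in every objective, strictly better in at least one). The function names and the four objective vectors are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (accuracy, novelty, diversity) vectors for four hybrids.
hybrids = [(0.9, 0.2, 0.3), (0.6, 0.7, 0.5), (0.5, 0.6, 0.4), (0.3, 0.9, 0.8)]
frontier = pareto_frontier(hybrids)
```

Here the third hybrid is dominated by the second, so the frontier keeps the other three, none of which can be compared in favor of another across all objectives.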

In this work we use the second version of the Strength Pareto Evolutionary Algorithm (SPEA-2) [36, 37]. The aim is to find or approximate the Pareto-optimal set for multi-objective problems. The main features of this algorithm are: (i) the fitness assignment scheme takes into account how many individuals each individual dominates or is dominated by, (ii) it uses a nearest-neighbour density estimation technique to break ties between solutions with the same fitness, and (iii) the size of the population of non-dominated solutions is a fixed value η. Thus, we have two situations. First, when the actual number of non-dominated solutions is lower than η, the population is filled with dominated solutions; second, when the actual number of non-dominated solutions exceeds η, some of them are discarded by a truncation operator which preserves boundary conditions. Nevertheless, we always keep the current Pareto frontier in a list separate from the population, so that we can later retrieve the individuals in it.
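
Features (i) and (ii) can be illustrated with a simplified fitness assignment in the spirit of SPEA-2. The sketch below is an assumption-laden simplification: it evaluates a single pool of objective vectors rather than a population plus archive, and the function name `spea2_fitness` is hypothetical. Lower fitness is better, and non-dominated individuals always score below 1.

```python
import math

def dominates(a, b):
    """Maximization: a is no worse everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def spea2_fitness(objs):
    """Fitness per individual (lower is better): raw fitness plus density."""
    n = len(objs)
    # Strength: how many individuals each one dominates.
    strength = [sum(dominates(objs[i], objs[j]) for j in range(n)) for i in range(n)]
    k = max(1, int(math.sqrt(n)))
    fitness = []
    for i in range(n):
        # Raw fitness: summed strengths of everyone that dominates i.
        raw = sum(strength[j] for j in range(n) if dominates(objs[j], objs[i]))
        # Density: derived from the distance to the k-th nearest neighbour.
        dists = sorted(math.dist(objs[i], objs[j]) for j in range(n) if j != i)
        sigma_k = dists[min(k, len(dists)) - 1] if dists else 0.0
        fitness.append(raw + 1.0 / (sigma_k + 2.0))
    return fitness

# Hypothetical (accuracy, novelty, diversity) vectors.
scores = spea2_fitness([(0.9, 0.2, 0.3), (0.6, 0.7, 0.5), (0.5, 0.6, 0.4), (0.3, 0.9, 0.8)])
```

The density term is at most 0.5, so it only separates individuals whose raw fitness is equal, which is the tie-breaking role described in (ii).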

2.3 Related Work

Traditionally, hybrid recommender strategies are the combination of two different families of algorithms, namely, content-based and collaborative filtering [1]. In this work, we combine many (up to 8) recommendation algorithms - different content-based and collaborative filtering algorithms that deal with explicit and implicit feedback, etc. We treat each recommendation algorithm as a black-box, so adding or removing recommendation algorithms is easy. Different hybridization strategies have been proposed to combine recommender methods, such as weighted approaches [10], voting mechanisms [30], switching between different recommenders [6, 24], and re-ranking the results of one recommender with another [7].

A prominent use of hybridization in recommender systems is the BellKor system that won the Netflix Prize competition [4, 5]. Their method is a statically weighted linear combination of 107 collaborative filtering engines. There are important differences between their work and ours: (i) their solution is single-objective (accuracy), (ii) they combine only collaborative filtering information, and (iii) the recommendation task is rating prediction, focused on RMSE, which makes the aggregation simpler, since all of the ratings are on the same scale and consist of the same items.

There has been an increasing consensus in the recommender systems community about the importance of proposing algorithms and methods to enhance novelty and diversity [17, 33]. As shown in [35], user satisfaction does not always correlate with high recommender accuracy. Thus, different multi-objective algorithms have been proposed to improve user experience considering either diversity or novelty. For instance, in [35], the authors define a greedy re-ranking algorithm that diversifies baseline recommendations. Another approach to improve diversity is presented in [34], where the authors suggest an optimization method to improve two objective functions reflecting preference similarity and item diversity.

On the other hand, novelty has been understood as recommending long-tail items, i.e., those items which few users have accessed. In [33], the authors present hybrid strategies that combine collaborative filtering with graph spreading techniques to improve novelty. The authors in [9] take an alternative approach: instead of assessing novelty in terms of the long-tail items that are recommended, they follow the paths leading from recommendations to the long tail using similarity links. As far as we know, this is the first work that proposes a hybrid method that is multi-objective in terms of the three metrics, i.e., accuracy, diversity and novelty.

Extensive research has also been performed exploiting the robust characteristics of genetic algorithms in recommender systems. For instance, in [28] the authors build a content-based recommender system and use genetic algorithms to assign proper weights to the words. Such weights are combined using the traditional IR vector space model [2] to produce recommendations. In [23] the authors use a genetic algorithm to build a recommender method that considers the browsing history of users in real time. In contrast to our approach (which uses a GA to combine multiple recommender methods), they use a GA to build a single method.

In [22], the authors present an implementation of GA for optimal feature weighting in the multi-criteria scenario. Their application of GA consists in selecting features that represent users’ interest in a collaborative filtering context, in contrast to our method, which focuses on assigning weights to different recommendation algorithms in order to improve the overall performance in terms of accuracy, novelty and diversity.

3. PARETO-EFFICIENT HYBRIDIZATION

In this section we introduce our search approach for Pareto-Optimal hybrids. We start by discussing how different recommendation algorithms are combined, so that potential hybrids are created. Then we describe the evolutionary search for Pareto-Optimal hybrids. Finally, we discuss an approach to deal with the compromise between accuracy, novelty and diversity, so that the system is able to adjust itself for different user perspectives.

3.1 Weighted Hybridization

Our hybridization approach is based on assigning weights to each constituent algorithm. We denote the set of constituent algorithms as A, and the score given by algorithm $A_j$ to an item i is represented by $A_j(i)$. As the constituent algorithms may output scores in drastically different scales, a simple normalization procedure is necessary to ensure that all algorithms in A operate in the same scale. The aggregated score for each item i is calculated as:

$S(i) = \sum_{j=1}^{|A|} A_j(i) * W_j$ (1)

where W is a vector that represents the weight assigned to each constituent algorithm. The assignment of weights to each algorithm is formulated as a search problem which we discuss next.
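
A minimal sketch of Equation 1 follows. The text does not specify the normalization procedure, so per-algorithm min-max scaling is an assumption here, as are the function names and the two toy score tables.

```python
from typing import Dict, List

def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one algorithm's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {i: (s - lo) / span if span else 0.0 for i, s in scores.items()}

def hybrid_scores(algo_scores: List[Dict[str, float]], weights: List[float]) -> Dict[str, float]:
    """S(i) = sum_j A_j(i) * W_j over normalized scores; unscored items count as 0."""
    total: Dict[str, float] = {}
    for scores, w in zip(algo_scores, weights):
        for item, s in min_max(scores).items():
            total[item] = total.get(item, 0.0) + w * s
    return total

# Two hypothetical algorithms that score items on very different scales.
a1 = {"x": 5.0, "y": 3.0, "z": 1.0}
a2 = {"x": 0.1, "y": 0.9, "z": 0.5}
S = hybrid_scores([a1, a2], weights=[0.25, 0.75])
ranking = sorted(S, key=S.get, reverse=True)
```

With most of the weight on the second algorithm, item y moves to the top even though the first algorithm prefers x.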

3.2 Searching for Pareto-Optimal Hybrids

Finding a suitable vector of weights W can be viewed as a search problem in which possible solutions are given as a combination of weights { $w_1, w_2, ..., w_{|A|}$ }, such that each $w_i$ is selected in a way that optimizes an established criterion. We consider the application of evolutionary algorithms for searching for optimal solutions. These algorithms iteratively evolve a population of individuals towards optimal solutions by performing operations based on reproduction, mutation, recombination, and selection [18]. This approach is attractive because we have no prior knowledge of the search space, since any number of different algorithms may be used, in different domains. Next, we precisely define an individual.

Definition 1: An individual is a candidate solution, which is encoded as a sequence of |A| values [ $w_1, w_2, ..., w_{|A|}$ ], where each $w_j$ indicates the weight associated with algorithm $A_j \in A$.

Each algorithm assigns scores to items using a cross-validation set. Weights are then assigned to each algorithm and their scores are aggregated according to Equation 1, producing an individual. A fitness function is computed for each individual in order to make them directly comparable, so that the population can evolve towards optimal solutions.

Definition 2: An optimal solution is a sequence of weights W = { $w_1, w_2, ..., w_{|A|}$ }, satisfying:

$\text{maximize} \quad \phi (o_i) \quad \forall \quad o_i \in \left \{ \text{accuracy}, \text{novelty}, \text{diversity} \right \}$ (2)

where $\phi (o_i)$ is a metric used to measure an objective, which can be accuracy, novelty or diversity. These metrics are discussed in more detail in Section 4. For now it suffices to notice that the performance of each individual is given by a 3-dimensional objective vector, containing the average accuracy, novelty and diversity over the users in the cross-validation set (since different metrics may operate in different scales, we normalize each $\phi (o_i)$ to the 0-1 interval). Searching for optimal solutions, therefore, is a multi-objective optimization problem, in which the value of $\phi (o_i)$ must be maximized for each of the 3 objectives. Because these objectives may conflict, multiple optimal individuals are possible. It is worth noticing that different datasets and different combinations of algorithms in A will generate different optimal individuals.

A general strategy for solving a multi-objective optimization problem is to exploit the concept of Pareto dominance, which may be used to find solutions that are not dominated by others. These non-dominated solutions lie in the so-called Pareto frontier, and are optimal in the sense that no solution in the frontier can be improved upon without hurting at least one of its objectives. Therefore, the evolutionary algorithm evolves the population towards producing individuals that are located closer to the Pareto frontier, and then a linear search returns the individual that maximizes the average (or some other combination, as we see in the next section) of the three objectives. Under this strategy, we follow the well-known Strength Pareto Evolutionary Algorithm approach [36], which has been shown to be highly effective and to provide more diverse results than existing approaches [11, 13, 32] for many problems of interest. The Strength Pareto approach isolates individuals that achieve a compromise between maximizing the competing objectives by evolving individuals that are likely to be non-dominated by other individuals in the population.

It is worth noticing that our approach does not depend on which recommendation algorithms are being aggregated, nor does it depend on the data domain. This makes adding or removing algorithms trivial, and allows the data to determine how each algorithm contributes to each of the objectives - an algorithm may be the most accurate when ratings are available, but not so accurate when only implicit feedback is used.

The Pareto-Optimal search is computationally expensive. However, it can be performed in an offline manner, and with low frequency. After the Pareto-Optimal weights are discovered, there is no need to perform the search repeatedly, unless a recommendation algorithm is added or removed, or a lot of new feedback data enters the system. Therefore, using this approach would not hinder the system's online performance.

3.3 Adjusting the System Priority

It is well recognized that the role that a recommender system plays may vary depending on the target user. For instance, according to [19], the suggestions performed by a recommender system may fail to appear trustworthy to a new user when it does not recommend items the user is sure to enjoy but probably already knows about. Based on this, a recommender system might prioritize accuracy instead of novelty or diversity for new users, while prioritizing novelty for users that have already used the system for a while. This is made possible by our hybridization approach, by searching for the individual in the Pareto frontier that best solves the user's current needs.

The choice of an individual in the Pareto frontier is made by performing a linear search over all of its individuals, in order to find the one that maximizes a simple weighted mean of the three objectives in its objective vector, where the weights represent the priority given to each objective. It is worth noting that fitness values are always calculated using the cross-validation set. Therefore, considering a 3-dimensional priority vector Q, where $Q_j$ represents the importance of objective j, the individual in the Pareto frontier P is chosen as:

$\arg\max_{p \in P} \sum_{j=1}^{3} Q_j \, \phi_p(o_j)$ (3)

where $\phi_p(o_j)$ is the value of objective $o_j$ in the objective vector of individual p.
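
The linear search over the frontier can be sketched as below. The frontier contents, the priority vectors and the name `select_hybrid` are hypothetical; the objective vectors are assumed to be already normalized to the 0-1 interval.

```python
def select_hybrid(frontier, priority):
    """Pick the frontier individual maximizing the priority-weighted mean of its objectives."""
    total = sum(priority)

    def weighted_mean(ind):
        return sum(q * v for q, v in zip(priority, ind["objectives"])) / total

    return max(frontier, key=weighted_mean)

# Hypothetical frontier: algorithm weights W plus (accuracy, novelty, diversity).
frontier = [
    {"weights": [0.8, 0.1, 0.1], "objectives": (0.9, 0.2, 0.3)},
    {"weights": [0.4, 0.3, 0.3], "objectives": (0.6, 0.7, 0.5)},
    {"weights": [0.1, 0.5, 0.4], "objectives": (0.3, 0.9, 0.8)},
]
new_user = select_hybrid(frontier, priority=(0.8, 0.1, 0.1))  # accuracy first
old_user = select_hybrid(frontier, priority=(0.2, 0.5, 0.3))  # novelty first
```

Since the frontier is computed offline, changing the priority vector only requires repeating this cheap search, not the evolutionary process.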

4. EVALUATION METHODOLOGY

The testing methodology we adopted in this paper is similar to the one described in [12], which is appropriate for the top-N recommendation task. For each dataset, ratings are split into two subsets: the training set M and the test set T. The training set M can (if necessary) be split into two subsets: the cross-validation training set C and the cross-validation test set V, which is used in order to tune parameters or adjust models. The test set T and the cross-validation test set V only contain items that are considered relevant to the users in the set. For explicit feedback (i.e., MovieLens), this means that the sets T and V only contain 5-star ratings.

In the case of implicit feedback (i.e., Last.fm), we normalized the observed item access frequencies of each user to a common rating scale [0,5], as used in [33]. Namely, $r(u, i) = n * F(frec_{u, i})$, where n is the upper bound of the rating scale (here, 5), $frec_{u, i}$ is the number of times u has accessed i, and $F(frec_{u, i}) = |\{j \in \mathbf{u} \mid frec_{u, j} < frec_{u, i}\}| / |\mathbf{u}|$ is the cumulative distribution function of $frec_{u, i}$ over the set of items accessed by the user u, denoted as $\mathbf{u}$. In this case, the test set and the cross-validation test set only contain ratings such that r(u, i) >= 4, since the number of 5-star ratings is very small under this mapping of implicit feedback into ratings. It is worth noting that all the sets have a corresponding implicit feedback set, used by the recommendation algorithms that can deal with implicit feedback.
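
The mapping from access counts to ratings can be illustrated for a single user. This sketch assumes n = 5 and uses the strict inequality from the formula above; the function name and the play counts are hypothetical.

```python
def implicit_to_ratings(freqs, n=5):
    """Map one user's access counts to ratings via r = n * F(frec), F = empirical CDF."""
    counts = list(freqs.values())
    return {
        # Fraction of the user's items accessed strictly less often, scaled by n.
        item: n * sum(c < f for c in counts) / len(counts)
        for item, f in freqs.items()
    }

# Hypothetical play counts of one user for four artists.
plays = {"a": 1, "b": 10, "c": 50, "d": 200}
ratings = implicit_to_ratings(plays)
```

With the strict inequality, the most accessed item of a user with |u| items receives n(|u| - 1)/|u|, which is consistent with the observation that 5-star ratings are very rare under this mapping.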

The detailed procedure to create M and T is the same as that used in [12], in order to maintain compatibility with their results. Namely, for each dataset we randomly sub-sampled 1.4% of the ratings from the dataset in order to create a probe set. The training set M contains the remaining ratings, while the test set T contains all the 5-star ratings in the probe set (in the case of explicit feedback) or 4+ star ratings (in the case of implicit feedback mapped into explicit feedback). We further divided the training set in the same fashion, in order to create the cross-validation training and test sets C and V. The ratings in the probe sets were not used for training.

In order to evaluate the algorithms, we first train the models using M. Then, for each item i in T that is relevant to user u:

• We randomly select 1,000 additional items unrated by user u. The assumption is that most of them will not be interesting to u.
• The algorithm in question forms a ranked list by ordering all of the 1,001 items. The most accurate result corresponds to the case where the test item i is in the first position.

Since the task is top-N recommendation, we form a top-N list by picking the N items out of the 1,001 that have the highest rank. If the test item i is among the top-N items, we have a hit. Otherwise, we have a miss. Recall and precision are calculated as follows:

$recall(N) = \frac{\#hits}{|T|}$ (4)

$precision(N) = \frac{\#hits}{N*|T|} = \frac{recall(N)}{N}$ (5)
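
Equations 4 and 5 can be computed directly from the rank that each test item obtains among its 1,001 candidates. The helper names and the example ranks below are hypothetical.

```python
def hits_at_n(ranks, n):
    """Count test items whose 1-based rank among the candidates is within the top N."""
    return sum(1 for r in ranks if r <= n)

def recall_at_n(ranks, n):
    """recall(N) = #hits / |T|, with one rank per test item."""
    return hits_at_n(ranks, n) / len(ranks)

def precision_at_n(ranks, n):
    """precision(N) = recall(N) / N, since each list holds one relevant item."""
    return recall_at_n(ranks, n) / n

# Hypothetical ranks obtained by five test items.
ranks = [1, 4, 12, 30, 700]
r10 = recall_at_n(ranks, 10)
p10 = precision_at_n(ranks, 10)
```

Two of the five test items land in the top 10, so recall(10) is 0.4 and precision(10) is a tenth of that.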

In order to measure the novelty of the recommendations, we used the popularity-based item novelty model proposed in [33], so that the probability of an item i being seen is estimated as:

$p(seen|i) = \frac{|\{u \in U \mid r(u, i) \neq \emptyset\}|}{|U|}$ (6)

where U denotes the set of users. Since the testing methodology supposes that most of the 1,000 additional unrated items are not relevant, we used the metrics in the framework proposed in [33] without relevance awareness. The novelty of a top-N recommendation list R presented to user u is therefore given by:

$nov(R(N)) = EPC(N) = C\sum_{i_k\in R}^{i_N} rd(k) (1 - p(seen|i_k))$ (7)

where rd(k) is a rank discount given by $rd(k) = 0.85^{k-1}$ [33] and C is a normalizing constant given by $\frac{1}{\sum_{i_k\in R}^{i_N} rd(k)}$. Therefore, this metric is rank-sensitive (i.e., the novelty of the top-ranked items counts more than the novelty of other items). As is the case with precision and recall, we average the EPC@N value of the top-N recommendation lists over the test set.
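As a rough sketch of how EPC@N could be computed for one list, the code below applies the rank discount and the normalization described above. It assumes that p(seen|i) is estimated as the fraction of users who rated item i; the function names and the toy data are illustrative only.

```python
def rank_discount(k, base=0.85):
    """rd(k) = 0.85^(k-1), with k starting at 1 for the top position."""
    return base ** (k - 1)

def p_seen(item, raters_by_item, n_users):
    """Assumed popularity estimate: share of users who rated the item."""
    return len(raters_by_item.get(item, ())) / n_users

def epc(ranked_items, raters_by_item, n_users):
    """Rank-discounted, normalized novelty of one top-N list."""
    discounts = [rank_discount(k) for k in range(1, len(ranked_items) + 1)]
    weighted = sum(d * (1 - p_seen(i, raters_by_item, n_users))
                   for d, i in zip(discounts, ranked_items))
    return weighted / sum(discounts)  # dividing by the sum applies the constant C

# Hypothetical data: item "a" was rated by all 4 users, "b" by one, "c" by none.
raters = {"a": {1, 2, 3, 4}, "b": {1}, "c": set()}
popular_first = epc(["a", "b"], raters, n_users=4)
novel_first = epc(["b", "a"], raters, n_users=4)
```

Because of the discount, placing the less popular item first yields a higher value (`novel_first > popular_first`), which is what rank sensitivity means here.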
We used a distance based model in order to measure the diversity of the recommendation lists. Once again, we used the metrics from [33] without relevance-awareness. The recommendation diversity, therefore, is given by:

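$div(R(N)) = EILD(N) = \sum_{i_k\in R,\, i_l\in R,\, l\neq k} C_k\, rd(k)\, rd(l|k)\, d(i_k, i_l)$ (8)

where $rd(l|k) = rd(\max(1, l-k))$ reflects a relative rank discount between l and k, $C_k$ is a normalizing constant as defined in [33], and $d(i_k, i_l)$ is the cosine distance between two items (one minus their cosine similarity), given by:

$d(i, j) = 1 - \frac{|U_i \cap U_j|}{\sqrt{|U_i|}\sqrt{|U_j|}}$ (9)

such that $U_i$ denotes the users that liked item i, and $U_j$ denotes the users that liked item j.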
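The diversity metric described above can be sketched as follows. The normalization is an assumption made for this example (each inner sum is divided by the sum of its relative discounts, and the outer sum by the sum of rd(k)), in the spirit of [33]; the distance is taken to be one minus the cosine similarity of the sets of users who liked each item. Function names and toy data are illustrative.

```python
import math

def rd(k, base=0.85):
    """Rank discount rd(k) = 0.85^(k-1)."""
    return base ** (k - 1)

def cosine_distance(users_i, users_j):
    """1 - |Ui ∩ Uj| / (sqrt|Ui| * sqrt|Uj|) for two sets of users."""
    if not users_i or not users_j:
        return 1.0
    overlap = len(users_i & users_j)
    return 1.0 - overlap / math.sqrt(len(users_i) * len(users_j))

def eild(ranked_items, liked_by):
    """Rank-discounted expected intra-list distance of one top-N list."""
    n = len(ranked_items)
    if n < 2:
        return 0.0
    total, norm = 0.0, 0.0
    for k in range(1, n + 1):
        inner, inner_norm = 0.0, 0.0
        for l in range(1, n + 1):
            if l == k:
                continue
            w = rd(max(1, l - k))  # relative discount rd(l|k)
            inner += w * cosine_distance(liked_by[ranked_items[k - 1]],
                                         liked_by[ranked_items[l - 1]])
            inner_norm += w
        total += rd(k) * inner / inner_norm  # assumed normalization per k
        norm += rd(k)
    return total / norm

# Hypothetical data: "a" and "b" are liked by the same users, "c" by another.
liked_by = {"a": {1, 2}, "b": {1, 2}, "c": {3}}
same_audience = eild(["a", "b"], liked_by)
disjoint_audience = eild(["a", "c"], liked_by)
```

With this normalization the value stays between 0 (all items liked by identical user sets) and 1 (no overlap between any pair).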

由于是top-N推荐，我们从1001个物品中选出排名最高的N个物品，从而形成top-N列表。如果测试物品i在前N个物品中，则为hit。否则，为miss。召回率和精确度计算如下：

$recall(N) = \frac{\#hits}{|T|}$ (4)

$precision(N) = \frac{\#hits}{N*|T|} = \frac{recall(N)}{N}$ (5)

为了衡量这些推荐的新颖性，我们使用了[33]中提出的基于流行度的物品新颖性模型，物品i被看到的概率估计为：

$p(seen|i) = \frac{|\{u \in U : r(u,i) \neq \emptyset\}|}{|U|}$ (6)

其中U表示用户集，r(u, i)表示用户u对物品i的评分。由于测试方法假设1000个额外的未评分物品中的大多数都不相关，我们使用了[33]中提出的框架中不考虑相关性的度量。因此，向用户u呈现的R的top-N推荐列表的新颖性为：

$nov(R(N)) = EPC(N) = C\sum_{i_k\in R}^{i_N} rd(k) (1 - p(seen|i_k))$ (7)

其中rd(k)为排序衰减因子， $rd(k) = 0.85^{k-1}$ [33]，C是归一化常量 $\frac{1}{\sum_{i_k\in R}^{i_N} rd(k)}$ 。因此，这个指标是排序敏感的（即，排序最高的物品的新颖性远远高于其他物品）。与精确性和召回率一样，我们在测试集上对前N个推荐列表的EPC@N值做了平均。

我们使用基于距离的模型来衡量推荐列表的多样性。我们使用了[33]中的度量标准。推荐的多样性如下：

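$div(R(N)) = EILD(N) = \sum_{i_k\in R,\, i_l\in R,\, l\neq k} C_k\, rd(k)\, rd(l|k)\, d(i_k, i_l)$ (8)

其中 $rd(l|k) = rd(\max(1, l-k))$ 反映了l和k之间的相对排序衰减，$C_k$ 是[33]中定义的归一化常量，$d(i_k, i_l)$ 是两个物品之间的余弦距离（即1减去余弦相似度），由下式给出：

$d(i, j) = 1 - \frac{|U_i \cap U_j|}{\sqrt{|U_i|}\sqrt{|U_j|}}$ (9)

其中 $U_i$ 表示喜欢物品i的用户，$U_j$ 表示喜欢物品j的用户。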

5. EXPERIMENTAL EVALUATION 实验评估

We apply the methodology presented in Section 4 to two different scenarios, in order to evaluate our hybrid approach: movie and music recommendation. For movie recommendation, we used the MovieLens dataset [27]. This dataset contains 1,000,209 ratings from 6,040 users on 3,883 movies. For music recommendation, we used an implicit preference dataset from [9], which consists of 19,150,868 user accesses to music tracks on the website Last.fm. This dataset involves 176,948 artists and 992 users, and we considered the task of recommending artists to users. Mapping the implicit feedback into user-artist ratings yielded a total of 889,558 ratings. These ratings were used by the algorithms that cannot deal with implicit feedback, and also to split the dataset into the training and test sets M and T.

我们将第4节中介绍的方法应用于两个不同的场景，以评估我们的混合方法：电影和音乐推荐。对于电影推荐，我们使用MovieLens数据集[27]。这个数据集包含6040个用户对3883部电影的1000209个评分。对于音乐推荐，我们使用了来自[9]的隐式偏好数据集，该数据集由网站Last.fm上19150868次用户对音乐曲目的访问组成。这个数据集涉及176948个艺术家和992个用户，我们考虑的是向用户推荐艺术家的任务。将隐式反馈映射到用户-艺术家评分中，得到了889558个评分，这些评分被用于无法处理隐式反馈的算法，也用于将数据集划分为训练集M和测试集T。

5.1 Recommendation Algorithms 推荐算法

We selected eight recommendation algorithms to provide the base for our hybrids. To represent latent factor models, we selected PureSVD with 50 and 150 factors (PureSVD50 and PureSVD150), described in [12]. These were the only algorithms we used that are based on explicit feedback. To compute the scores for the items in the Last.fm dataset, we used the mappings of implicit feedback into ratings explained in Section 5.3.
As for recommendation algorithms that use implicit feedback, we used algorithms available in the MyMediaLite package [16]. We used WeightedItemKNN (WIKNN) and WeightedUserKNN (WUKNN) as representatives of neighbourhood models based on collaborative data [14]. We only used WeightedItemKNN on the MovieLens dataset, as MyMediaLite's implementation cannot yet handle datasets where the number of items is very large, which is the case in the Last.fm dataset. As a baseline, and to allow for comparison with [12], we used MyMediaLite's MostPopular implementation, which is the same as TopPop in [12]. We also used WRMF, a weighted matrix factorization method based on [21, 29], which is very effective for data with implicit feedback. In order to represent content-based algorithms, we used ItemAttributeKNN (IAKNN), a K-nearest neighbor item-based collaborative filtering method using cosine similarity over the movie genres for MovieLens (we could not use this method on the Last.fm dataset, because it does not contain content data). Finally, we used UserAttributeKNN (UAKNN), a K-nearest neighbor user-based collaborative filtering method using cosine similarity over user attributes such as sex and age (which both datasets provide).

在混合推荐中，我们选择了8种推荐算法。为了表示潜在因素模型，我们选择了具有50和150个因素（PureSVD50和PureSVD150）的PureSVD，如[12]所述。这些是我们使用的唯一基于显式反馈的算法。为了计算Last.fm数据集中各项的得分，我们使用了第5.3节中解释的隐式反馈到评分的映射。

对于使用隐式反馈的推荐算法，我们使用MyMediaLite包中提供的算法[16]。我们使用WeightedItemKNN（WIKNN）和WeightedUserKNN（WUKNN）两个基于协作数据[14]的邻域模型（我们只在MovieLens数据集上使用WeightedItemKNN，因为MyMediaLite的实现还不能处理物品数非常大的数据集，而Last.fm数据集正是这种情况）。作为基线，为了与[12]进行比较，我们使用MyMediaLite的MostPopular实现，它与[12]中的TopPop相同。我们还使用了基于[21，29]的加权矩阵分解方法WRMF，它对隐式反馈的数据非常有效。为了表示基于内容的算法，我们使用ItemAttributeKNN（IAKNN），这是一种K近邻基于item的协同过滤，在MovieLens的电影类型上使用余弦相似性（在Last.fm数据集中不能使用此方法，因为它不包含内容数据）。最后，我们使用了UserAttributeKNN（UAKNN），这是一种K近邻基于user的协同过滤，对用户属性（如性别、年龄等）使用余弦相似度（两个数据集都提供这些属性）。

5.2 Hybrid Approaches 混合方法

As a baseline, we used a voting-based hybrid based on Borda Count (BC), similar to [30], where each constituent algorithm gives n points to each item i such that $n = |R| - p_i$, where |R| is the size of the recommendation list and $p_i$ is the position of i in R. We also used STREAM as a baseline, a stacking-based approach with additional meta-features [3]. We used the same additional meta-features as [3], namely, the number of items that a certain user has rated and the number of users that have rated a certain item. We tried the learning algorithms proposed in [3], and Linear Regression yielded the best results, so the results presented for STREAM are generated using Linear Regression as the meta-learning algorithm. Our last baseline is the weighted hybrid we proposed in Section 3.1, using equal weights for each constituent algorithm. We called this baseline Equal Weights (EW).
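To make the Borda Count baseline concrete, here is a small sketch. It assumes that an item at 1-based position p in a list of size |R| receives |R| - p points, and that ties are broken alphabetically; both choices, like the function name and the toy rankings, are assumptions for illustration.

```python
from collections import defaultdict

def borda_count(rankings):
    """Merge several ranked lists by summing positional points per item."""
    points = defaultdict(int)
    for ranking in rankings:
        size = len(ranking)
        for position, item in enumerate(ranking, start=1):
            points[item] += size - position  # assumed scoring: |R| - position
    # Highest total first; ties broken by item name for determinism.
    return sorted(points, key=lambda item: (-points[item], str(item)))

# Three hypothetical constituent algorithms ranking the same three items.
merged = borda_count([["a", "b", "c"],
                      ["b", "a", "c"],
                      ["b", "c", "a"]])
```

Here item "b" collects 5 points, "a" 3 and "c" 1, so the merged list is b, a, c. Note that only positions matter, not the scores the constituent algorithms produced.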
As for our genetic approach, we combined all of the recommendation algorithms cited in the last subsection. We used an open-source implementation of SPEA2 [36, 37] from DEAP [31]. We used a two-point crossover operator [20], and a uniform random mutation operator with probability 0.05. SPEA2 was configured with the following parameters:

作为基线，我们使用基于Borda-Count (BC)的基于投票的混合算法，类似于[30]，其中每个组成算法给每个物品i赋n个点，使得 $n = |R| - p_i$，其中|R|是推荐列表的大小，$p_i$ 是i在R中的位置。我们还使用STREAM作为基线，一种附加元特征的基于叠加(stacking-based)的方法[3]。我们使用了与[3]相同的附加元特性，即，某个用户评分的物品数和对某个物品进行了评分的用户数。我们尝试了文献[3]中提出的学习算法，线性回归得到了最好的结果，因此STREAM的结果是使用线性回归作为元学习算法生成的。我们的最后一个基线是我们在第3.1节中提出的加权混合，对每个组成算法使用相等的权重。我们称这个基线为等权重（Equal Weights，EW）。

至于我们的遗传方法，我们结合了上一小节中引用的所有推荐算法。我们使用了来自DEAP[31]的SPEA2[36，37]的开源实现，我们使用了两点交叉算子[20]和概率为0.05的均匀随机变异算子。SPEA-2的参数配置如下：

5.3 Results and Discussion 结果和讨论

The results achieved by each of the constituent recommendation algorithms can be seen in Tables 1 and 2. We show the accuracy results (recall and precision) over different values of N. Since both EPC (novelty) and EILD (diversity) are rank-sensitive metrics, we only present their values for N = 20. There is a clear compromise between the accuracy, novelty and diversity of these algorithms. For the MovieLens dataset (Table 1), the constituent algorithm that provides the most accurate recommendations is PureSVD50. The constituent algorithm that provides the most novel recommendations with an acceptable degree of accuracy is PureSVD150, but its accuracy is much worse than that of PureSVD50, and its diversity is much worse than that of the other algorithms. TopPop provided the most diverse recommendations, although it performs significantly worse in accuracy and novelty. It is worth noting that ItemAttributeKNN is based only on genres, which explains its poor accuracy results.

各组成推荐算法的结果见表1和表2。我们给出了N为不同值时的准确性结果（召回率、准确率）。由于EPC（新颖性）和EILD（多样性）都是排序敏感的度量，我们只给出了N=20的新颖性和多样性结果。这些算法的准确性、新颖性和多样性之间存在着明显的折衷。对于MovieLens数据集（表1），提供最准确推荐的组成算法是PureSVD50。PureSVD150算法是新颖度最好的推荐算法，其推荐精度可以接受，但其推荐精度远低于PureSVD50算法，其多样性远低于其他算法。TopPop提供了最多样化的推荐，但是它在准确性和新颖性方面的表现要差得多。值得注意的是，ItemAttributeKNN只基于类别进行推荐，这就解释了其准确性差的原因。

On the Last.fm dataset, the constituent algorithm that provides the most accurate recommendations is WRMF. This is expected, as Last.fm is originally an implicit feedback dataset, to which WRMF is more suitable. Once again, PureSVD150 proved its capacity to suggest novel items, being the algorithm with the most novel recommendations. WeightedUserKNN proved to be the algorithm that provided the most diverse recommendations, while maintaining a reasonable accuracy degree. In this dataset the compromise between the three objectives is once again illustrated by the fact that there is no algorithm that dominates the others in every objective.
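The notion of dominance used here can be made concrete with a short sketch: one algorithm dominates another if it is at least as good on every objective and strictly better on at least one. The objective values below are invented purely for illustration and are not the results from Tables 1 and 2.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep the candidates that are not dominated by any other candidate."""
    return {
        name: objectives
        for name, objectives in candidates.items()
        if not any(dominates(other, objectives)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

# Hypothetical (accuracy, novelty, diversity) triples, higher is better.
toy = {
    "A": (0.5, 0.2, 0.3),
    "B": (0.3, 0.6, 0.2),
    "C": (0.2, 0.1, 0.9),
    "D": (0.2, 0.1, 0.2),  # dominated by A
}
front = pareto_front(toy)
```

In this toy case A, B and C each win on a different objective, so none dominates the others and all three stay on the front, while D is removed.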

在Last.fm数据集上，提供最准确推荐的组成算法是WRMF。这是符合预期的，因为Last.fm最初是一个隐式反馈数据集，WRMF更适合它。PureSVD150再一次证明了它提出新颖物品的能力，它是新颖性最好的推荐算法。WeightedUserKNN在保持合理精度的前提下，提供了最多样化的推荐。在这个数据集中，这三个目标之间的折衷再一次被这样一个事实所说明：没有一个算法能在所有目标上支配其他算法。

Regarding the performance of the baselines in the MovieLens dataset, STREAM performs worse than PureSVD50 on accuracy, maintaining the same level of novelty and performing better in terms of diversity. Borda Count performed poorly on accuracy and reasonably well in terms of novelty and diversity. Equal Weights performed poorly on accuracy and novelty and well on diversity. On the Last.fm dataset, STREAM performed slightly worse than WRMF in accuracy, while maintaining the same level of diversity and improving slightly on novelty. Once again, Borda Count performed poorly on accuracy and reasonably well on novelty and diversity. Finally, Equal Weights performed poorly on accuracy and novelty, while performing well on diversity.

关于MovieLens数据集中基线的性能，STREAM在精度上比PureSVD50差，新颖性表现相同，在多样性方面表现更好。Borda Count在准确性方面表现不佳，在新颖性和多样性方面表现相当不错。Equal Weights在准确性、新颖性方面表现不佳，在多样性上表现很好。在Last.fm数据集上，STREAM的准确度略低于WRMF，多样性表现相当，在新颖性上略有提高。同样，Borda Count在准确性上表现不佳，在新颖性和多样性方面表现良好。最后，Equal Weights在准确性和新颖性方面表现不佳，而在多样性方面表现良好。

Now, with our evolutionary approach, we could reach any of the individuals in Figure 1, which plots the accuracy (in this case, Recall@10) and novelty (EPC@20) of the recommendations on the x and y axes, and diversity (EILD@20) with a color scale. These plots show the results in the test set for the individuals that represented the Pareto frontier in the cross-validation. It is clear that there is a compromise between the three objectives: the individuals with the most novel recommendations provide less accurate and less diverse lists, and so on. This compromise can be adjusted dynamically with little extra cost, since the cost of reaching these individuals is as low as a linear search (for the individual that maximizes a weighted mean, as described in Section 3.2) over the Pareto frontier individuals' scores on the cross-validation set. The Pareto frontier consists of 1,418 individuals in the MovieLens dataset and of 1,995 individuals in the Last.fm dataset, so a linear search can be done very quickly. We chose to demonstrate a few of these individuals in Tables 1 and 2. First, Pareto-Optimal-mean (PO-mean) represents the individual that optimizes the mean of the three normalized objectives, assuming each of them is equally important. This would be an option if personalization was not desired, or if the designers of the recommender system did not know which combination of the three objectives would result in higher user satisfaction. However, in a more realistic situation, the recommender system would most likely want to select different individuals for different users. We selected as examples the following individuals, which were found by the process explained in Section 3.2 with the associated weight vectors shown:

PO-acc: [Accuracy:0.85, Novelty:0.1, Diversity:0.05]
PO-acc2: [Accuracy:0.7, Novelty:0.3, Diversity:0]
PO-nov: [Accuracy:0.15, Novelty:0.85, Diversity:0]
PO-div: [Accuracy:0.15, Novelty:0.15, Diversity:0.7]
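The selection step can be sketched as a linear search over the frontier for the individual that maximizes a weighted mean of normalized objectives. The min-max normalization used below is an assumption made for this example (the exact normalization belongs to Section 3.2), and the three frontier entries are hypothetical values, not results from the paper; only the weight vectors are taken from the list above.

```python
def normalize(frontier):
    """Min-max scale each objective to [0, 1] across the frontier (assumed scheme)."""
    n_obj = len(next(iter(frontier.values())))
    lows = [min(v[d] for v in frontier.values()) for d in range(n_obj)]
    highs = [max(v[d] for v in frontier.values()) for d in range(n_obj)]
    return {
        name: tuple((v[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
                    for d in range(n_obj))
        for name, v in frontier.items()
    }

def select_individual(frontier, weights):
    """Linear search for the individual with the best weighted mean."""
    scaled = normalize(frontier)
    return max(scaled, key=lambda name: sum(w * x for w, x in zip(weights, scaled[name])))

# Hypothetical cross-validation scores: (accuracy, novelty, diversity).
frontier = {
    "h1": (0.50, 0.70, 0.60),
    "h2": (0.40, 0.90, 0.55),
    "h3": (0.35, 0.75, 0.80),
}
po_acc = select_individual(frontier, (0.85, 0.10, 0.05))
po_nov = select_individual(frontier, (0.15, 0.85, 0.00))
po_div = select_individual(frontier, (0.15, 0.15, 0.70))
```

Each call is a single pass over the frontier, which is why changing the weights for a different user is cheap compared with rerunning the evolutionary search.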

现在，通过我们的进化方法，我们可以达到到图1中的任何个体，这些个体代表了x和y轴上推荐的准确性（在本例中，Recall@10）和新颖性（EPC@20），以及用各种颜色表示的多样性（EILD@20）。这些图形显示了交叉验证中代表Pareto前沿的个体的测试集结果。很明显，这三个目标之间存在一种折衷：推荐最新颖的个体提供的列表不够准确和多样，等等。这种折衷可以动态地进行调整，而不需要太多额外的成本，因为达到这些个体的成本比较低，在交叉验证集中的帕累托前沿个体的得分上进行线性搜索（对于加权平均值最大的个体，如第3.2节所述）即可。帕累托边界由MovieLens数据集中的1418个个体和Last.fm数据集中的1995个个体组成，因此可以非常快速地进行线性搜索。我们在表1和表2中显示了其中一些个体。首先，帕累托最优均值（PO-mean）表示优化三个标准化目标均值的个体，假设每个目标都同等重要。如果不需要个性化，或者如果推荐系统的设计者不知道这三个目标的哪一个组合将导致更高的用户满意度，可以这样来做。然而，在更现实的情况下，推荐系统很可能希望为不同的用户选择不同的个体。我们选择了以下个体作为例子，通过第3.2节中所述的过程和所代表的相关加权向量发现这些个体：

PO-acc: [Accuracy:0.85, Novelty:0.1, Diversity:0.05]
PO-acc2: [Accuracy:0.7, Novelty:0.3, Diversity:0]
PO-nov: [Accuracy:0.15, Novelty:0.85, Diversity:0]
PO-div: [Accuracy:0.15, Novelty:0.15, Diversity:0.7]

We compared PO-acc and PO-acc2 with PureSVD50, which is the standalone algorithm with the most accurate recommendations. Both perform as well as PureSVD50 on accuracy, but PO-acc performs much better on diversity (and equally well on novelty), and PO-acc2 performs better on novelty while maintaining the diversity level. We compared PO-nov with PureSVD150, which presented the most novel recommendations to the users, with reasonable accuracy. PO-nov performs slightly better on novelty than PureSVD150, much better in terms of accuracy, and slightly better on diversity. Finally, we compared PO-div with MostPopular, the algorithm with the most diverse recommendations. PO-div loses very slightly on diversity, while improving on accuracy and novelty. We were able, therefore, to find individuals in the Pareto frontier that performed close to or better than the best algorithms in each individual objective, and better on the other objectives. Once again, we could have chosen to compromise more accuracy and diversity if we desired more novelty, as is shown by Figure 1 (left).
As for the Last.fm dataset, we selected the following individuals:
• PO-acc: [Accuracy:0.7, Novelty:0.3, Diversity:0]
• PO-nov: [Accuracy:0.15, Novelty:0.85, Diversity:0]
• PO-div: [Accuracy:0.45, Novelty:0.05, Diversity:0.5]

我们将PO-acc和PO-acc2与PureSVD50进行了比较，PureSVD50是具有最精确推荐的独立算法。两者在准确性上的表现都和PureSVD50一样好，但是PO-acc在多样性上表现得更好（在新颖性上也同样好），PO-acc2在保持多样性水平的同时在新颖性上表现得更好。我们将PO-nov与Pure-SVD150进行了比较，后者给用户的推荐最新颖，并且具有合理的准确性。与PureSVD150相比，PO-nov在新颖性方面表现稍好，但在准确性方面表现更佳，在多样性方面表现稍好。最后，我们比较了PO-div和MostPopular算法，该算法的推荐多样性最好。PO-div在多样性方面稍差，在准确性和新颖性方面有所提高。因此，我们能够在帕累托前沿找到在每个目标上表现接近或优于最佳算法的个体，但在其他目标上表现更好。如果我们想要更多的新颖性，我们可以选择牺牲更多的准确性和多样性，如图1（左）所示。

对于Last.fm数据集，我们选择了以下个体：

• PO-acc: [Accuracy:0.7, Novelty:0.3, Diversity:0]
• PO-nov: [Accuracy:0.15, Novelty:0.85, Diversity:0]
• PO-div: [Accuracy:0.45, Novelty:0.05, Diversity:0.5]

For the Last.fm dataset, we compared PO-acc with WRMF, which is the most accurate standalone algorithm on this dataset. PO-acc is much more accurate than WRMF, while also improving on novelty and maintaining the diversity level. The individual PO-nov was compared with PureSVD150, and it performed equally well on accuracy, while delivering a much higher novelty, and only a slightly worse diversity. PO-div was compared against WeightedUserKNN, and it fared equally well on diversity and novelty, while slightly improving on accuracy. It is worth noting that the individual represented by PO-div is the same individual that maximizes the mean with equal weights (PO-mean). Once again, we were able to find interesting individuals in the Pareto frontier, but we could have reached any of the individuals in Figure 1 (right) by tweaking the weight value for each objective.

对于Last.fm数据集，我们比较了PO-acc和WRMF，后者是该数据集上最精确的独立算法。PO-acc比WRMF精确得多，在新颖性方面有改进，多样性相当。我们将PO-nov与PureSVD150进行了比较，它在准确性方面同样表现良好，同时提供了更高的新颖性，多样性只稍微差一点。将PO-div与WeightedUserKNN进行了比较，它在多样性和新颖性方面表现得同样出色，在准确性方面略有提高。值得注意的是，PO-div所代表的个体是用等权最大化平均值（PO-mean）的同一个体。我们在帕累托边界找到了有趣的个体，但是我们可以通过调整每个目标的权重值来达到图1（右）中的任何个体。

6. CONCLUSIONS

In this paper, we propose a hybridization technique for combining different recommendation algorithms, following the Strength Pareto approach. We show that different recommendation algorithms do not perform uniformly well when evaluated in accuracy, novelty and diversity, but our technique allows for the dynamic adjustment of the compromise between these three aspects of user satisfaction. This can be very useful in different scenarios, one example being the personalization of recommendations according to the users. According to [25], “New users have different needs from experienced users in a recommender. New users may benefit from an algorithm which generates highly ratable items, as they need to establish trust and rapport with a recommender before taking advantage of the recommendations it offers.” Therefore, our approach could be used to provide new users with the most accurate recommendations possible, even if the recommendations are not novel at all, so that the users would have items to rate and could build trust in the system. The costly part of our technique (the evolutionary algorithm) is performed offline, and the online cost of choosing an individual in the Pareto frontier and weighting the results for different algorithms is very small, since the Pareto frontier is comprised of few individuals.
We performed highly reproducible experiments on public datasets of implicit and explicit feedback, using open-source implementations. In our experiments, we demonstrated our technique’s ability to balance each of the objectives according to the desired compromise, and we showed some examples of reached solutions that are competitive with the best algorithms according to each objective and almost always better on the other objectives.

在本文中，遵循强度帕累托方法，我们提出了一种混合技术，来组合不同的推荐算法。我们发现不同的推荐算法在准确性、新颖性和多样性方面的表现并不一致，但是我们的技术允许动态调整用户满意度这三个方面之间的折衷。这在不同的场景中非常有用，例如根据用户个性化推荐。根据[25]，“新用户和老用户对推荐有不同的需求。新用户可能更喜欢高点击率的物品，因为他们需要在使用推荐系统之前与推荐系统建立信任和关系。”因此，我们的方法可以用于为新用户提供尽可能准确的推荐，即使推荐一点都不新颖-这样用户就可以对物品进行评分，并在系统中建立信任。我们技术中耗费资源的部分（进化算法）是离线执行的，而在帕累托前沿选择个体并为不同算法的结果加权的在线成本非常小，因为帕累托前沿由很少的个体组成。

我们使用开放源代码实现，对隐式和显式反馈的公共数据集进行了高度可重复的实验。我们的技术可以根据期望的折衷来平衡每个目标，并且给出了一些解决方案的例子，这些解决方案在单目标上与最佳算法性能相当，在其他目标上，表现更好。
