Mastering the game of Go without human knowledge

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Much progress towards artificial intelligence has been made using supervised learning systems that are trained to replicate the decisions of human experts. However, expert data sets are often expensive, unreliable or simply unavailable. Even when reliable data sets are available, they may impose a ceiling on the performance of systems trained in this manner. By contrast, reinforcement learning systems are trained from their own experience, in principle allowing them to exceed human capabilities, and to operate in domains where human expertise is lacking. Recently, there has been rapid progress towards this goal, using deep neural networks trained by reinforcement learning. These systems have outperformed humans in computer games, such as Atari and 3D virtual environments. However, the most challenging domains in terms of human intellect—such as the game of Go, widely viewed as a grand challenge for artificial intelligence—require a precise and sophisticated lookahead in vast search spaces. Fully general methods have not previously achieved human-level performance in these domains.

AlphaGo was the first program to achieve superhuman performance in Go. The published version, which we refer to as AlphaGo Fan, defeated the European champion Fan Hui in October 2015. AlphaGo Fan used two deep neural networks: a policy network that outputs move probabilities and a value network that outputs a position evaluation. The policy network was trained initially by supervised learning to accurately predict human expert moves, and was subsequently refined by policy-gradient reinforcement learning. The value network was trained to predict the winner of games played by the policy network against itself. Once trained, these networks were combined with a Monte Carlo tree search (MCTS) to provide a lookahead search, using the policy network to narrow down the search to high-probability moves, and using the value network (in conjunction with Monte Carlo rollouts using a fast rollout policy) to evaluate positions in the tree. A subsequent version, which we refer to as AlphaGo Lee, used a similar approach (see Methods), and defeated Lee Sedol, the winner of 18 international titles, in March 2016.

Our program, AlphaGo Zero, differs from AlphaGo Fan and AlphaGo Lee in several important aspects. First and foremost, it is trained solely by self-play reinforcement learning, starting from random play, without any supervision or use of human data. Second, it uses only the black and white stones from the board as input features. Third, it uses a single neural network, rather than separate policy and value networks. Finally, it uses a simpler tree search that relies upon this single neural network to evaluate positions and sample moves, without performing any Monte Carlo rollouts. To achieve these results, we introduce a new reinforcement learning algorithm that incorporates lookahead search inside the training loop, resulting in rapid improvement and precise and stable learning. Further technical differences in the search algorithm, training procedure and network architecture are described in Methods.

Reinforcement learning in AlphaGo Zero

Our new method uses a deep neural network $f_\theta$ with parameters $\theta$. This neural network takes as an input the raw board representation $s$ of the position and its history, and outputs both move probabilities and a value, $(\mathbf{p}, v) = f_\theta(s)$. The vector of move probabilities $\mathbf{p}$ represents the probability of selecting each move $a$ (including pass), $p_a = \Pr(a \mid s)$. The value $v$ is a scalar evaluation, estimating the probability of the current player winning from position $s$. This neural network combines the roles of both policy network and value network into a single architecture. The neural network consists of many residual blocks of convolutional layers with batch normalization and rectifier nonlinearities (see Methods).
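
To make the architecture concrete, the following is a minimal PyTorch sketch of such a dual-headed residual network. The sizes are illustrative only (a handful of residual blocks and 64 filters, far smaller than the networks described in Methods), and the 17-plane input encoding is an assumption about the stone and colour feature planes rather than a quotation of the exact specification.

```python
import torch.nn as nn
import torch.nn.functional as F

BOARD = 19
IN_PLANES = 17               # assumption: stone-history planes plus a colour plane
FILTERS = 64                 # illustrative; the paper's networks are much wider
N_MOVES = BOARD * BOARD + 1  # every intersection plus pass

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.b1 = nn.BatchNorm2d(ch)
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.b2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = F.relu(self.b1(self.c1(x)))
        y = self.b2(self.c2(y))
        return F.relu(x + y)                     # skip connection, then rectifier

class DualNet(nn.Module):
    """Single network producing move probabilities p (as logits) and a value v in [-1, 1]."""
    def __init__(self, blocks=4):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(IN_PLANES, FILTERS, 3, padding=1, bias=False),
            nn.BatchNorm2d(FILTERS), nn.ReLU())
        self.tower = nn.Sequential(*[ResidualBlock(FILTERS) for _ in range(blocks)])
        self.policy_head = nn.Sequential(
            nn.Conv2d(FILTERS, 2, 1), nn.BatchNorm2d(2), nn.ReLU(),
            nn.Flatten(), nn.Linear(2 * BOARD * BOARD, N_MOVES))
        self.value_head = nn.Sequential(
            nn.Conv2d(FILTERS, 1, 1), nn.BatchNorm2d(1), nn.ReLU(),
            nn.Flatten(), nn.Linear(BOARD * BOARD, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Tanh())

    def forward(self, s):
        h = self.tower(self.stem(s))
        return self.policy_head(h), self.value_head(h)   # (policy logits, scalar value)
```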

The neural network in AlphaGo Zero is trained from games of self-play by a novel reinforcement learning algorithm. In each position $s$, an MCTS search is executed, guided by the neural network $f_\theta$. The MCTS search outputs probabilities $\pi$ of playing each move. These search probabilities usually select much stronger moves than the raw move probabilities $\mathbf{p}$ of the neural network $f_\theta(s)$; MCTS may therefore be viewed as a powerful policy improvement operator. Self-play with search—using the improved MCTS-based policy to select each move, then using the game winner $z$ as a sample of the value—may be viewed as a powerful policy evaluation operator. The main idea of our reinforcement learning algorithm is to use these search operators repeatedly in a policy iteration procedure: the neural network's parameters are updated to make the move probabilities and value $(\mathbf{p}, v) = f_\theta(s)$ more closely match the improved search probabilities and self-play winner $(\pi, z)$; these new parameters are used in the next iteration of self-play to make the search even stronger. Figure 1 illustrates the self-play training pipeline.
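
The resulting policy-iteration loop can be sketched in a few lines of Python. Every game-specific name here (run_mcts, initial_state, next_state, is_terminal, winner) is a hypothetical callable standing in for the real search and Go engine; the sketch only illustrates the shape of the procedure and is not published AlphaGo code.

```python
import random

def self_play_game(run_mcts, initial_state, next_state, is_terminal, winner):
    """Play one self-play game and label every position with the final outcome.

    run_mcts(s) must return a dict {move: probability} (the improved policy pi);
    the other arguments are placeholder game helpers.
    """
    history, s, player = [], initial_state, +1           # +1 / -1 = side to move
    while not is_terminal(s):
        pi = run_mcts(s)                                  # policy improvement by search
        history.append((s, pi, player))
        move = random.choices(list(pi), weights=list(pi.values()))[0]
        s, player = next_state(s, move), -player
    z = winner(s)                                         # +1 if the +1 player won, else -1
    # Policy evaluation: each stored position is labelled with the game outcome
    # seen from the perspective of the player who was to move there.
    return [(s_t, pi_t, z * p) for (s_t, pi_t, p) in history]
```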

The MCTS uses the neural network $f_\theta$ to guide its simulations (see Fig. 2). Each edge $(s, a)$ in the search tree stores a prior probability $P(s, a)$, a visit count $N(s, a)$, and an action value $Q(s, a)$. Each simulation starts from the root state and iteratively selects moves that maximize an upper confidence bound $Q(s, a) + U(s, a)$, where $U(s, a) \propto P(s, a) / (1 + N(s, a))$, until a leaf node $s'$ is encountered. This leaf position is expanded and evaluated only once by the network to generate both prior probabilities and evaluation, $(P(s', \cdot), V(s')) = f_\theta(s')$. Each edge $(s, a)$ traversed in the simulation is updated to increment its visit count $N(s, a)$, and to update its action value to the mean evaluation over these simulations, $Q(s, a) = \frac{1}{N(s, a)} \sum_{s' \mid s, a \to s'} V(s')$, where $s, a \to s'$ indicates that a simulation eventually reached $s'$ after taking move $a$ from position $s$.
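
A minimal sketch of one such simulation is given below, assuming a transposition-table style tree keyed by hashable states and a net(s) callable that returns (prior probabilities, value); legal_moves and next_state are hypothetical game helpers, and the exploration constant and exact form of U are illustrative choices rather than the paper's precise formula.

```python
import math

class Edge:
    """Statistics for one edge (s, a): prior P, visit count N, total value W."""
    def __init__(self, prior):
        self.P, self.N, self.W = prior, 0, 0.0

    @property
    def Q(self):                                   # mean action value
        return self.W / self.N if self.N else 0.0

def simulate(root, tree, net, legal_moves, next_state, c_puct=1.0):
    """One simulation: select by Q + U, expand/evaluate the leaf once, back up V.

    `tree` maps hashable states to {move: Edge}; `net(s)` returns a
    (priors dict, value) pair. Terminal-state handling is omitted for brevity;
    this is an illustrative sketch, not the paper's code.
    """
    path, s = [], root
    while s in tree and tree[s]:                   # selection down to a leaf
        edges = tree[s]
        total = sum(e.N for e in edges.values())
        # U(s, a) grows with the prior P and shrinks as N(s, a) grows;
        # the sqrt(total + 1) scaling is a common practical choice.
        a = max(edges, key=lambda m: edges[m].Q +
                c_puct * edges[m].P * math.sqrt(total + 1) / (1 + edges[m].N))
        path.append((edges, a))
        s = next_state(s, a)
    priors, v = net(s)                             # expansion + evaluation (once per leaf)
    tree[s] = {m: Edge(priors.get(m, 0.0)) for m in legal_moves(s)}
    for edges, a in reversed(path):                # backup along the traversed path
        v = -v                                     # value alternates with the side to move
        edges[a].N += 1
        edges[a].W += v
```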

MCTS may be viewed as a self-play algorithm that, given neural network parameters $\theta$ and a root position $s$, computes a vector of search probabilities recommending moves to play, $\pi = \alpha_\theta(s)$, proportional to the exponentiated visit count for each move, $\pi_a \propto N(s, a)^{1/\tau}$, where $\tau$ is a temperature parameter.
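
As a small worked example, the conversion from visit counts to search probabilities might look like the following sketch (the actual schedule for $\tau$ is given in Methods):

```python
import numpy as np

def search_probabilities(visit_counts, tau=1.0):
    """pi_a proportional to N(s, a)^(1/tau); tau -> 0 plays the most-visited move."""
    n = np.asarray(visit_counts, dtype=float)
    if tau == 0:                                   # deterministic limit for competitive play
        pi = np.zeros_like(n)
        pi[n.argmax()] = 1.0
        return pi
    x = n ** (1.0 / tau)
    return x / x.sum()

# Example: visit counts [400, 100, 25] give pi ~ [0.76, 0.19, 0.05] at tau = 1,
# but ~ [0.94, 0.06, 0.00] at tau = 0.5, concentrating on the most-visited move.
```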

The neural network is trained by a self-play reinforcement learning algorithm that uses MCTS to play each move. First, the neural network is initialized to random weights $\theta_0$. At each subsequent iteration $i \ge 1$, games of self-play are generated (Fig. 1a). At each time-step $t$, an MCTS search $\pi_t = \alpha_{\theta_{i-1}}(s_t)$ is executed using the previous iteration of neural network $f_{\theta_{i-1}}$ and a move is played by sampling the search probabilities $\pi_t$. A game terminates at step $T$ when both players pass, when the search value drops below a resignation threshold or when the game exceeds a maximum length; the game is then scored to give a final reward of $r_T \in \{-1, +1\}$ (see Methods for details). The data for each time-step $t$ is stored as $(s_t, \pi_t, z_t)$, where $z_t = \pm r_T$ is the game winner from the perspective of the current player at step $t$. In parallel (Fig. 1b), new network parameters $\theta_i$ are trained from data $(s, \pi, z)$ sampled uniformly among all time-steps of the last iteration(s) of self-play. The neural network $(\mathbf{p}, v) = f_{\theta_i}(s)$ is adjusted to minimize the error between the predicted value $v$ and the self-play winner $z$, and to maximize the similarity of the neural network move probabilities $\mathbf{p}$ to the search probabilities $\pi$. Specifically, the parameters $\theta$ are adjusted by gradient descent on a loss function $l$ that sums over the mean-squared error and cross-entropy losses, respectively:

$$(\mathbf{p}, v) = f_{\theta_i}(s)$$

and

$$l = (z - v)^2 - \pi^{\mathrm{T}} \log \mathbf{p} + c \|\theta\|^2 \qquad (1)$$

where c is a parameter controlling the level of L2 weight regularization (to prevent overfitting).
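
A possible PyTorch rendering of loss (1) is shown below. It assumes the policy head outputs logits and the value head a scalar in [-1, 1], as in the earlier network sketch; the value of c is illustrative, and in practice the L2 term is typically delegated to the optimiser's weight-decay setting.

```python
import torch.nn.functional as F

def alphazero_loss(policy_logits, value, pi_target, z_target, params, c=1e-4):
    """l = (z - v)^2 - pi^T log p + c * ||theta||^2, averaged over the mini-batch.

    Assumes `policy_logits` has shape (batch, moves), `value` shape (batch, 1),
    `pi_target` holds the MCTS search probabilities and `z_target` the game
    outcomes; `params` is an iterable of weight tensors (e.g. net.parameters()).
    The default c is illustrative, not a quoted hyperparameter.
    """
    value_loss = F.mse_loss(value.squeeze(-1), z_target)
    log_p = F.log_softmax(policy_logits, dim=1)
    policy_loss = -(pi_target * log_p).sum(dim=1).mean()
    l2 = c * sum((w * w).sum() for w in params)
    return value_loss + policy_loss + l2

# In practice the regularization term is usually handled by the optimiser's
# weight_decay argument rather than added to the loss explicitly.
```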

Empirical analysis of AlphaGo Zero training

We applied our reinforcement learning pipeline to train our program AlphaGo Zero. Training started from completely random behaviour and continued without human intervention for approximately three days.

Over the course of training, 4.9 million games of self-play were generated, using 1,600 simulations for each MCTS, which corresponds to approximately 0.4 s thinking time per move. Parameters were updated from 700,000 mini-batches of 2,048 positions. The neural network contained 20 residual blocks (see Methods for further details).

Figure 3a shows the performance of AlphaGo Zero during self-play reinforcement learning, as a function of training time, on an Elo scale. Learning progressed smoothly throughout training, and did not suffer from the oscillations or catastrophic forgetting that have been suggested in previous literature. Surprisingly, AlphaGo Zero outperformed AlphaGo Lee after just 36 h. In comparison, AlphaGo Lee was trained over several months. After 72 h, we evaluated AlphaGo Zero against the exact version of AlphaGo Lee that defeated Lee Sedol, under the same 2 h time controls and match conditions that were used in the man–machine match in Seoul (see Methods). AlphaGo Zero used a single machine with 4 tensor processing units (TPUs), whereas AlphaGo Lee was distributed over many machines and used 48 TPUs. AlphaGo Zero defeated AlphaGo Lee by 100 games to 0 (see Extended Data Fig. 1 and Supplementary Information).

To assess the merits of self-play reinforcement learning, compared to learning from human data, we trained a second neural network (using the same architecture) to predict expert moves in the KGS Server dataset; this achieved state-of-the-art prediction accuracy compared to previous work (see Extended Data Tables 1 and 2 for current and previous results, respectively). Supervised learning achieved a better initial performance, and was better at predicting human professional moves (Fig. 3). Notably, although supervised learning achieved higher move prediction accuracy, the self-learned player performed much better overall, defeating the human-trained player within the first 24 h of training. This suggests that AlphaGo Zero may be learning a strategy that is qualitatively different to human play.

To separate the contributions of architecture and algorithm, we compared the performance of the neural network architecture in AlphaGo Zero with the previous neural network architecture used in AlphaGo Lee (see Fig. 4). Four neural networks were created, using either separate policy and value networks, as were used in AlphaGo Lee, or combined policy and value networks, as used in AlphaGo Zero; and using either the convolutional network architecture from AlphaGo Lee or the residual network architecture from AlphaGo Zero. Each network was trained to minimize the same loss function (equation (1)), using a fixed dataset of self-play games generated by AlphaGo Zero after 72 h of self-play training. Using a residual network was more accurate, achieved lower error and improved performance in AlphaGo by over 600 Elo. Combining policy and value together into a single network slightly reduced the move prediction accuracy, but reduced the value error and boosted playing performance in AlphaGo by around another 600 Elo. This is partly due to improved computational efficiency, but more importantly the dual objective regularizes the network to a common representation that supports multiple use cases.

Knowledge learned by AlphaGo Zero

AlphaGo Zero discovered a remarkable level of Go knowledge during its self-play training process. This included not only fundamental elements of human Go knowledge, but also non-standard strategies beyond the scope of traditional Go knowledge.

Figure 5 shows a timeline indicating when professional joseki (corner sequences) were discovered (Fig. 5a and Extended Data Fig. 2); ultimately AlphaGo Zero preferred new joseki variants that were previously unknown (Fig. 5b and Extended Data Fig. 3). Figure 5c shows several fast self-play games played at different stages of training (see Supplementary Information). Tournament length games played at regular intervals throughout training are shown in Extended Data Fig. 4 and in the Supplementary Information. AlphaGo Zero rapidly progressed from entirely random moves towards a sophisticated understanding of Go concepts, including fuseki (opening), tesuji (tactics), life-and-death, ko (repeated board situations), yose (endgame), capturing races, sente (initiative), shape, influence and territory, all discovered from first principles. Surprisingly, shicho (‘ladder’ capture sequences that may span the whole board)—one of the first elements of Go knowledge learned by humans—were only understood by AlphaGo Zero much later in training.

Final performance of AlphaGo Zero

We subsequently applied our reinforcement learning pipeline to a second instance of AlphaGo Zero using a larger neural network and over a longer duration. Training again started from completely random behaviour and continued for approximately 40 days.

Over the course of training, 29 million games of self-play were generated. Parameters were updated from 3.1 million mini-batches of 2,048 positions each. The neural network contained 40 residual blocks. The learning curve is shown in Fig. 6a. Games played at regular intervals throughout training are shown in Extended Data Fig. 5 and in the Supplementary Information.

We evaluated the fully trained AlphaGo Zero using an internal tournament against AlphaGo Fan, AlphaGo Lee and several previous Go programs. We also played games against the strongest existing program, AlphaGo Master—a program based on the algorithm and architecture presented in this paper but using human data and features (see Methods)—which defeated the strongest human professional players 60–0 in online games in January 2017. In our evaluation, all programs were allowed 5 s of thinking time per move; AlphaGo Zero and AlphaGo Master each played on a single machine with 4 TPUs; AlphaGo Fan and AlphaGo Lee were distributed over 176 GPUs and 48 TPUs, respectively. We also included a player based solely on the raw neural network of AlphaGo Zero; this player simply selected the move with maximum probability.

Figure 6b shows the performance of each program on an Elo scale. The raw neural network, without using any lookahead, achieved an Elo rating of 3,055. AlphaGo Zero achieved a rating of 5,185, compared to 4,858 for AlphaGo Master, 3,739 for AlphaGo Lee and 3,144 for AlphaGo Fan.

Finally, we evaluated AlphaGo Zero head to head against AlphaGo Master in a 100-game match with 2-h time controls. AlphaGo Zero won by 89 games to 11 (see Extended Data Fig. 6 and Supplementary Information).
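
As a rough sanity check relating the Elo ratings above to expected results, the standard Elo logistic can be applied (the paper computes its ratings with its own calibration, described in Methods, so this is only an approximation):

```python
def elo_expected_score(r_a, r_b):
    """Expected score of player A against player B under the usual Elo logistic."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# The reported 5,185 vs 4,858 ratings (a 327-point gap) imply an expected score
# of about 0.87 per game for AlphaGo Zero, in the same ballpark as the 89-11 result.
print(round(elo_expected_score(5185, 4858), 2))    # 0.87
```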

Conclusion

Our results comprehensively demonstrate that a pure reinforcement learning approach is fully feasible, even in the most challenging of domains: it is possible to train to superhuman level, without human examples or guidance, given no knowledge of the domain beyond basic rules. Furthermore, a pure reinforcement learning approach requires just a few more hours to train, and achieves much better asymptotic performance, compared to training on human expert data. Using this approach, AlphaGo Zero defeated the strongest previous versions of AlphaGo, which were trained from human data using handcrafted features, by a large margin.

Humankind has accumulated Go knowledge from millions of games played over thousands of years, collectively distilled into patterns, proverbs and books. In the space of a few days, starting tabula rasa, AlphaGo Zero was able to rediscover much of this Go knowledge, as well as novel strategies that provide new insights into the oldest of games.

Further reading: batch normalization and rectifier nonlinearities.
