Probability Theory


by Inexorably

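Approximating a system makes its general rules, lines of logic, and their effects far easier to see than they are before the approximation. To this end, let us look at Yu-Gi-Oh! after making some approximations of this kind.
(The author's goal appears to be a programmable algorithm that simulates match win rates between meta decks.
My progress so far: I can calculate match win rates between decks that decide the game within 3 turns.
This article is Part 1.)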


Part1 - “Dice Theory”

What does it mean to win? To win simply means that the combination of your decision making, deck building, and luck was greater than that of your opponent. In a way, a game can be described or approximated as the two players each pulling out their [heavily modded, sleeved, 1st edition hobby league ready-microwaved] dice and rolling them. Whoever rolls higher is the winner.

In this comparison, the numbers on the dice are the results of a formula accounting for the three qualities stated above (decision making, deck building, and luck). Note that your opening hand is a product of both deck building and luck, so the variance in your opening hands shows up in the dice's face values. A dice can have any number of sides, just as decks have different numbers of standard plays (for example, +1 Fire Fist vs Dragon Rulers). In the following examples I set the number of sides to 10 for every dice, because it makes my point easier to illustrate.

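Note 1

Face-value function: P = F(decision making, deck building, luck)
I read the face value as opening-game action capacity; see below for details.

  • Decision making: the player's control and ability to design tactical lines, i.e. the same hand is worth different amounts in different players' hands
  • Deck building: determines the probability distribution over opening hands
  • Luck: the value the random variable actually takes; it cannot be controlled, but it can be understood through probability

The rest of the text mainly discusses deck building.
Assume that players' decisions are locally rational (Rational Actor Models), i.e. they value the resource change of each move accurately.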

Player one has a dice with not only holographic, astral pack edges but also an 8 on each face: 10 faces with 8 on each (8888888888). However many times he rolls, his average value will be 8. This kind of dice, consistent and with decent values, can be likened to +1 Fire Fist.

Player two is going for maximum yolo (You Only Live Once!). He just got back from 360 n0sc0ping some pleb on COD, and is using a minimum scope maximum skill dice, on which 6 sides are 9 and 4 sides are 1 (9999991111). The average value of this dice's faces is 5.8.
{A no-scope shot is lethal when it lands (9) and does no damage at all when it misses (1), whereas the all-8 dice above is like a spraying AK-47 that always deals some damage whether or not the aim is good (8). The metaphor is quite vivid.}

So, between player one and player two, who is more likely to win? The simple answer, which new players may mistakenly pick, is player one, because on average his roll is an 8, versus player two, who is rolling not only his swagged out tricycle but also a 5.8. However, at no point does player two's dice land on 5.8: it always lands on either a 9 or a 1. If it lands on a 9 player two wins, and if it lands on a 1 he loses (9 > 8 > 1). In other words, player two has the advantage, with a 60% win rate (as the number of games approaches infinity).

Let's swap player one out for player three. Player three is using a consistent dice from a past meta, so it has a 6 on each of its 10 sides (6666666666). Against player two's dice, player three will have the same win-loss rate as player one did, but will have a 0% win rate against player one. So, let's look at this:

(1). Player one(8888888888) vs Player two(9999991111): 40-60

(2). Player one(8888888888) vs Player three(6666666666): 100-0

(3). Player three(6666666666) vs Player two(9999991111): 40-60
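The three matchups above can be checked by enumerating every pair of faces. This is a minimal sketch, assuming every face is equally likely and the two rolls are independent; the function name `win_rate` and the choice to count ties as non-wins are illustrative assumptions, not something the article specifies.

```python
from fractions import Fraction
from itertools import product

def win_rate(dice_a, dice_b):
    """Exact probability that one roll of dice_a is strictly higher than one roll of dice_b."""
    pairs = list(product(dice_a, dice_b))          # every equally likely (a, b) outcome
    wins = sum(1 for a, b in pairs if a > b)       # ties are not counted as wins
    return Fraction(wins, len(pairs))

player_one = [8] * 10               # 8888888888
player_two = [9] * 6 + [1] * 4      # 9999991111
player_three = [6] * 10             # 6666666666

matchups = [
    ("(1) one vs two", player_one, player_two),
    ("(2) one vs three", player_one, player_three),
    ("(3) three vs two", player_three, player_two),
]
for label, a, b in matchups:
    print(label, float(win_rate(a, b)), float(win_rate(b, a)))
```

Because none of these matchups can tie, the two printed numbers in each row sum to 1.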

Note that player one and player three seem to be equal based on our win rates in (1) and (3).

However, they are obviously not equal when we look at (2), where player one has a 100% win rate. So, what does this mean?

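Note 2

Player1(8888888888): high ceiling, high consistency
Player2(9999991111): high ceiling, low consistency
Player3(6666666666): low ceiling, high consistency

Low consistency → [Variance Based]
High consistency → [Consistency Based]
(These terms appear later in the text.)

One evident conclusion:
Player2 vs Player3 tends to produce a clear advantage or disadvantage at the opening (the first roll), i.e. an autowin (difference of +3) or an autoloss (difference of -5), so the game is decided quickly.
The win-rate ratio can be determined from the probability distribution of opening hands.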

Traditionally, a difference between OCG and TCG deck building has been noted: the OCG seems to run builds with more variance between hands (sacrificing consistency for higher power). The TCG traditionally (not saying this will stay static forever) runs builds with lower variance between hands, at the cost of some of the higher power (and inconsistent) hands. This does not only extend to variance in the consistency sense: 'win-more' has become a buzzword in the TCG, but is that with good reason?
{Consistency (of opening action capacity) serves win rate, and the ceiling on action capacity is another important factor when building for win rate.}
When we look at the game in the approximated dice form above, we see that consistency is not the deciding factor, and neither is a card like Dark Hole. In the past, Dark Hole was a very dualistic card: good for baiting your opponent into overcommitting and good when you were in a losing position, but suboptimal when you had an established field. The argument here is generally 'if I'm winning it's fine to have a suboptimal card, because if I start to lose the card becomes optimal'. However, is this correct? {Is this description precise enough?}

Let's say that player two adds a Dark Hole, or a similar card, to his deck. What happens is that the card's value at suboptimal times and at optimal times is averaged over all faces, but applied differently to different faces of the dice. For example, two of the faces with 1s on them may become 4s, while two of the faces with 9s on them (when he is in a winning position / has an established field) may become 7s (he drew Dark Hole in place of a combo piece). His dice now has two faces with 1s, two faces with 4s, two faces with 7s, and four faces with 9s (1144779999).
{Cards such as Snatch Steal and The Beginning of the End can be analysed in the same way.}

His average dice roll is now a 6, up from 5.8, and his rolls are less spread out: a more consistent deck. Let us examine his win rates.

  1. Player two(1144779999) vs Player one(8888888888): 40-60

  2. Player two(1144779999) vs Player three(6666666666): 60-40
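The effect of the Dark Hole swap can be reproduced by comparing player two's dice before and after the change. This is a sketch under the same assumptions as the article's model (equally likely faces, one roll decides the game); the helper `win_rate` and the variable names are illustrative, not the author's.

```python
from statistics import mean

def win_rate(dice_a, dice_b):
    """Share of face pairs in which dice_a rolls strictly higher than dice_b."""
    wins = sum(a > b for a in dice_a for b in dice_b)
    return wins / (len(dice_a) * len(dice_b))

before = [9] * 6 + [1] * 4                 # 9999991111, no Dark Hole
after = [1, 1, 4, 4, 7, 7, 9, 9, 9, 9]     # 1144779999, with Dark Hole type card(s)
opponents = {"player one": [8] * 10, "player three": [6] * 10}

for label, dice in (("before", before), ("after", after)):
    rates = {name: win_rate(dice, opp) for name, opp in opponents.items()}
    print(label, "average:", mean(dice), "win rates:", rates)
```

The average goes up while the win rate against player one goes down, which is the point of the example: the swap moved two faces from above player one's 8 to below it.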

While his win rate vs player three is still 60%, his win rate versus player one has dropped by 20 percentage points (from 60% to 40%) because he sacrificed power for consistency. Power is a broad word, so how are we defining it? Power is generally used to describe how strong a field can be thrown up (for example, Abyss-Teus + Aqua Spirit {= 锁链 + Rank 7} used to be a common power play), as well as how much damage (both life point and card advantage) something can do (Judgment Dragon is powerful). However, speed / tempo is also built into this definition, and is often overlooked. There is a winning threshold in this game: if you are above your opponent by a certain amount (dependent on the point in the game and the match up), you win.

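Tactical tempo advantage
Holding a certain strategic lead: being able to predict the opponent's behaviour to some degree and adjust your resource structure to counter it, with enough resources and action capacity to correct for most board states, so that you maintain the lead and ultimately keep the board state within the sweet spot. For example, in the HAT format, Sahabi's Infernity strategy out-tempoed the other strategies in the meta.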

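Note 3

As I understand it,
this model is really fitting a curve of [action capacity] per turn.
[Action capacity] is the upper bound on the ability to change tempo (gaining or losing tempo, trading off or converting between different tempos).
[Tempo]: life-point tempo, resource tempo (production efficiency, exchange efficiency), tactical tempo.
(The scope of the tempo concept comes up later in the text, so I mention it here.)
Combined with the text above, "ceiling" means the magnitude of tempo change that [action capacity] can support, while "consistency" means how dispersed the opening [action capacity] is (reflected in the statistical standard deviation, ideally over an infinite sample).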

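Action capacity has two main parts, offensive and defensive, and the two are described in different ways.

  • Offensive action capacity
    Shows up as an increase in your own tempo
  • Defensive action capacity
    Shows up as a loss in the opponent's tempo (the gain to the opponent's action capacity is negative, which converts to a positive for you)
    The return on this investment is collected reactively and the initiative in the exchange lies with the opponent, so it is a pre-investment; turning passive into active requires [guiding the exchange]
    Defensive action capacity is more variable than offensive action capacity; it has to be valued from known information and is only priced once the actual exchange has happened
    The opponent's defensive action capacity is the key input when valuing your own offensive action capacity

Action capacity
= (offensive action capacity - opponent's defensive action capacity) + defensive action capacity
= system action capacity + staple-card action capacity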

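Because of the nature of functional units and how often they are drawn, we generally divide resources into system (engine) cards and staple cards; both fit the framework above.
System components are generally drawn more consistently than staples, so the ceiling on system action capacity usually gets attention first and is also the easiest to analyse.
When system components accumulate faster than they are consumed (usually production efficiency > exchange efficiency), the ceiling on system action capacity rises step by step towards its limit; from player two's point of view, the face value goes 1→9.
System action capacity is mainly about system resources and the mechanisms that activate them.
Opening-hand quality (the stock of resources and whether the activation mechanism is in hand) determines the initial system action capacity. The higher the consistency, i.e. the easier it is to assemble the system's functional components, the easier it is to reach the limit of the system action capacity ceiling early.
Combined with the definition of system tempo, the system tempo of 901 Qliphort (tempo 0, tempo 3) can be set as face values 0 and 3.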

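Big Deck versus Small Deck is not the essential dividing line for the number of traps.
If offensive action capacity is hard to raise (+1 Fire Fist) or has a trough (Qliphort before it reaches tempo 3), the defensive side is expected to compensate; the end goal is always to maximize the efficiency of the resources in hand.
Optimizing the resource structure is also a way to raise overall action capacity. Within the system there are ways to convert offense into defense, for example Fire Lake of the Burning Abyss for BA, or Re-qliate and Divine Dragon Knight Felgrand for Qliphort; beyond that it comes down to build precision and card-holding strategy.
A Combo Deck, because its functional units are large and need ample system resources, cuts the defensive side in deck building, i.e. it runs fewer traps.

When resources cannot support action capacity reaching the preset ceiling, resource accumulation can be made the core goal, i.e. action capacity is invested in resource production efficiency (within resource tempo).

Boundary: once action capacity reaches the preset ceiling, further accumulation is resource overflow, and the focus should shift to advancing tempo, i.e. life-point tempo (trading LP) and resource exchange efficiency within resource tempo (cutting the opponent's resources to lower the opponent's action capacity).

(What follows uses the Dice Model to classify meta decks and explain matchups.)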

To better describe this, let's give each player three hearts. The player with the lower roll loses hearts equal to the difference in rolls, and the players keep rolling until one of them dies (reaches zero hearts). What does this do? If one player (player two, 1144779999) rolls a 9 and the other (player three, 6666666666) a 6, the player who rolled a 6 will lose right on the first roll. However, if one player (player two) rolls a 9 and the other (player one, 8888888888) an 8, and the next roll is 4 vs 8, the player rolling the 8s will have won, as the variance based player failed to reach the winning threshold {the tactical win position} in the allotted time. (A better way to describe this would be to have players gain hearts each turn, with the amount given by a non-linear function of the number of turns.)
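The three-hearts rule can be evaluated exactly with a small recursion over the remaining hearts. This is a sketch of the simple version only (a fixed three hearts, no hearts gained per turn); the function name, the decision to re-roll ties, and the use of the original 9999991111 dice for player two are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache

def heart_game_win_probability(dice_a, dice_b, hearts=3):
    """Exact probability that the owner of dice_a wins the hearts game against dice_b."""
    diffs = Counter(a - b for a in dice_a for b in dice_b)
    total = sum(diffs.values())
    tie = Fraction(diffs.pop(0, 0), total)   # assumption: a tied roll changes nothing
    if tie == 1:
        raise ValueError("identical constant dice never finish the game")

    @lru_cache(maxsize=None)
    def win(hearts_a, hearts_b):
        if hearts_b <= 0:
            return Fraction(1)
        if hearts_a <= 0:
            return Fraction(0)
        p = Fraction(0)
        for diff, count in diffs.items():
            weight = Fraction(count, total)
            if diff > 0:      # dice_a rolled higher: the opponent loses `diff` hearts
                p += weight * win(hearts_a, hearts_b - diff)
            else:             # dice_a rolled lower: its owner loses `-diff` hearts
                p += weight * win(hearts_a + diff, hearts_b)
        return p / (1 - tie)  # condition on the roll not being a tie

    return win(hearts, hearts)

player_one = [8] * 10
player_two = [9] * 6 + [1] * 4
player_three = [6] * 10

print(float(heart_game_win_probability(player_two, player_one)))
print(float(heart_game_win_probability(player_two, player_three)))
```

Under these assumptions player two still beats player three 60% of the time, because a 9 against a 6 ends the game on the spot, but against player one he needs three 9s in a row before a single 1 appears, since each 9 only removes one heart.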

At this point we realize that if we are the variance player we have to simply forgo the lost hands (because if we roll a 1 or a 4 in Yu-Gi-Oh! it is not likely to turn into a 9) and find the winning threshold. Once we have done this, we shift to maximizing the likelihood that we reach that threshold, since there is no difference between rolling an 11 against an 8 and rolling a 17 against an 8. Of course, due to the sheer number of hands possible in a 37 card deck, our plot will not simply be a threshold with a line for two units at four, two units at one, four units at nine, and two units at seven. If we were to graph it holistically, with 'true' values assigned to each hand, the number of points would make it look much more like a traditional, continuous graph. We would then graph different builds and see which has the largest unit distance above the threshold: in effect, which build clears the threshold for the largest percentage of hands. This would of course be extremely tedious to do by hand, and thus would be done by computer (if we generalize hands to a degree this is actually not hard to code).
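As one hedged illustration of "generalizing hands", suppose a hand counts as reaching the threshold whenever it contains at least one copy of some key card. That rule, the 5-card opening hand, and the function name `chance_to_open` are all assumptions made for this sketch; the article does not define how hands are valued. Under that simplification the share of qualifying hands is a hypergeometric calculation.

```python
from math import comb

def chance_to_open(copies, deck_size=37, hand_size=5):
    """Probability that an opening hand holds at least one of `copies` key cards."""
    if not 0 <= copies <= deck_size:
        raise ValueError("copies must be between 0 and deck_size")
    if not 0 <= hand_size <= deck_size:
        raise ValueError("hand_size must be between 0 and deck_size")
    # Hands that contain none of the key cards, out of all possible hands.
    misses = comb(deck_size - copies, hand_size)
    return 1 - misses / comb(deck_size, hand_size)

# Compare hypothetical builds that run 1 to 6 copies of threshold-reaching cards.
for copies in range(1, 7):
    print(copies, round(chance_to_open(copies), 3))
```

A fuller version would replace the "at least one key card" rule with a value assigned to each generalized hand and count the hands at or above the threshold, as the paragraph above describes.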

Returning from this tangent to the concept of assigning different values to different faces, we have four dice, with the following faces:

  1. Ten 8s (Tier One Consistency)

  2. Ten 6s (Lower Tier Consistency)

  3. Four 1s, Six 9s (Variance)

  4. Two 1s, Two 4s, Two 7s, Four 9s (Variance with Dark Hole type card(s))
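The four dice listed above can be summarized by their average and by how spread out their faces are. This is a small sketch using the population standard deviation as the measure of spread; that choice of measure and the helper name `summarize` are illustrative assumptions.

```python
from statistics import mean, pstdev

dice = {
    "(1) Tier One Consistency": [8] * 10,
    "(2) Lower Tier Consistency": [6] * 10,
    "(3) Variance": [1] * 4 + [9] * 6,
    "(4) Variance with Dark Hole": [1, 1, 4, 4, 7, 7, 9, 9, 9, 9],
}

def summarize(faces):
    """Return (average face value, population standard deviation of the faces)."""
    return mean(faces), pstdev(faces)

for name, faces in dice.items():
    average, spread = summarize(faces)
    print(f"{name}: average {average}, spread {spread:.2f}")
```

The consistency dice have zero spread, and the Dark Hole version of the variance dice has a smaller spread than the original variance dice.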

Our averages are 8, 6, 5.8, and 6, respectively. We previously examined the variance vs consistency match up, and saw that it depends on the winning threshold being reached as fast as possible (using the model with a non-linear number of hearts gained per turn to represent the grind game). In Yu-Gi-Oh!, this is seen in how a combo heavy, aggressive deck will generally either win with the superior power it generates over a limited number of turns, compared to the relatively linear power generation of a deck such as +1 Fire Fist or HAT, or simply fail to win early and then get out-ground as its average power decreases over the following turns. This is due to Yu-Gi-Oh!'s lack of a restraining mana system: the only thing limiting the combo deck's aggression is its ability (how long it takes) to convert hand advantage into field advantage. While we are on the Magic related note, dice interactions describe deck interactions between aggro, combo, and control relatively well, while allowing us to compare these more easily to Yu-Gi-Oh! despite the mana system.

Let us examine the consistency mirror and the variance mirror. In the consistency mirror, the win will almost always go to the player with the higher average roll. For example, going back to January 2014 we see decks like Bujins, Blackwings, and Hunders doing abysmally. This reflects the concept of fairness that has been invoked increasingly in recent times: these decks are consistency based in their rolls, and for the most part do not have high power variance rolls. Because of this, the winning threshold between these decks and a tier one consistency deck (+1 Fire Fist, represented by dice (1) above) is very rarely reached in the early game (the first couple of rolls), and almost without fail the lower tier consistency deck falls. From this, we see that there is almost no benefit in playing lower tier consistency based decks.

In contrast, the variance mirror involves two dice with wide bounds on their roll values. This allows the winning threshold to be reached extremely easily: if one person rolls an 8 and the other a 2, the game immediately ends. This concept is exemplified in a match up like Quickdraw Quasar vs Karakuri. Essentially, whoever bricks will lose immediately unless the other player bricks equivalently (for example, rolling a 3 vs a 2). However, we then notice that with high variance rolls there is little to be gained by increasing your roll power: if your opponent rolls a 2, there is no difference between rolling a 6 and rolling a 9. So, if the meta is mostly composed of variance based decks or dice, you should seek to restrict your high rolls to the minimum winning threshold, and raise your low rolls enough to increase your win rate slightly (for example, by adding Dark Hole type cards).

Referring to a post by Hoban in the ban list thread, we see him note that the format generally goes consistency -> variance in terms of tier one decks / dice. Of course, over time builds become refined and more consistent at hitting their goal (whatever their win condition is). For a consistency deck, which generally has a stable, non-combo based core, we might see slight adjustments to the monster line up (such as the addition of Card Car Ds to Fire Fist, though Card Car is not a true monster) and standardizations to the main deck, such as Fiendish Chain. This is because in a consistency-based meta there is only one 'best dice', and so this dice changes to combat itself better. This represents the first half of January 2014's format.

Then variance is introduced in the form of Mermail. Contrary to +1 Fire Fist, Mermail did in fact have the ability to brick, which resulted in something resembling dice (3) or dice (4) (this was further accentuated in games two and three, where the Fire Fist player could draw a Macro and win unless Mermail could quickly out it). However, as noted in the consistency vs variance section, there is no difference between losing by 10 cards of card advantage and losing by 1 card: both mean a loss. Because of this, Mermail's variance based style allowed it to accept the losses to Macro / D-Fissure and focus on the six or so faces of very high value rolls, which were enough to reach the winning threshold over Fire Fist without progressing to the late game, or at least to reach the late game with the correct set-up (Waters for Tidal in grave, and preferably a controller in hand).
