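[Reference] Calder, Brad, Dirk Grunwald, Michael Jones, Donald Lindsay, James Martin, Michael Mozer, and Benjamin Zorn. “Evidence-Based Static Branch Prediction Using Machine Learning.” ACM Trans. Program. Lang. Syst. 19, no. 1 (January 1997): 188–222. doi:10.1145/239912.239923.
[Key points]
1. Surveys program-based methods, which use program structure to predict branches.
2. This paper uses a corpus of existing programs as evidence to predict branch behavior in new programs.
3. Advantage: because the technique is based on program structure, it remains effective across programming languages and styles, and it does not need expert-defined heuristics.
4. ESP branch prediction results in a miss rate of 20%, as compared with the 25% miss rate obtained using the best existing program-based heuristics.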
This is a standard neural network architecture. We also use a fairly standard neural network dynamics in which the activity of hidden unit i, denoted $h_i$, is computed as
$$h_i = \tanh\Big(\sum_j w_{ij} x_j + b_i\Big)$$
where $x_j$ is the activity of input unit j; $w_{ij}$ is the connection weight from input unit j to hidden unit i; $b_i$ is a bias weight associated with the unit; and tanh is the hyperbolic tangent function.
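As a small illustration of the hidden-unit computation quoted above, the sketch below evaluates the tanh of the weighted input sum plus bias for every hidden unit of one layer. The function name `hidden_activations` and the example weights are made up for illustration and do not come from the paper.

```python
import math

def hidden_activations(x, w, b):
    """Return tanh(sum_j w[i][j] * x[j] + b[i]) for every hidden unit i."""
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
        for w_i, b_i in zip(w, b)
    ]

# Hypothetical example: 3 input units feeding 2 hidden units.
x = [1.0, 0.0, -1.0]
w = [[0.5, -0.2, 0.1],   # weights into hidden unit 0
     [0.0, 0.3, 0.0]]    # weights into hidden unit 1
b = [0.0, 0.0]
h = hidden_activations(x, w, b)
```

Because tanh squashes its argument into (-1, 1), every hidden activity stays bounded regardless of how large the weighted sum becomes.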
[Reference] Reches, S., and Shlomo Weiss. “Implementation and Analysis of Path History in Dynamic Branch Prediction Schemes.” IEEE Transactions on Computers 47, no. 8 (August 1998): 907–12. doi:10.1109/12.707596.
[Key points]
1. Uses addresses as the basis for prediction: branch prediction depends on the actual program execution path.
2. Combines several kinds of information with XOR operations.
3. We have observed that the lower address bits carry higher weight in every branch address or path history component (ADDR, HIST, and LAST)
4. In essence, it still relies on the GHR (global branch history).
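The XOR-based combination noted above can be sketched as follows. This is a generic gshare-style hash, not the exact scheme from the paper: the parameter names `addr`, `hist`, and `last` only mirror the ADDR, HIST, and LAST components in the quote, and their interpretation, the bit widths, and the way they are combined are all assumptions made for illustration.

```python
def table_index(addr, hist, last, index_bits=10):
    """XOR the low-order bits of three components into a table index.

    addr: address of the branch being predicted
    hist: global branch history register (one bit per past outcome)
    last: an address taken from the recent execution path (assumed meaning)
    Only the low index_bits bits of each component are used.
    """
    mask = (1 << index_bits) - 1
    return (addr & mask) ^ (hist & mask) ^ (last & mask)

def update_history(hist, taken, hist_bits=10):
    """Shift the newest outcome (1 = taken, 0 = not taken) into the register."""
    return ((hist << 1) | int(taken)) & ((1 << hist_bits) - 1)
```

Keeping only the low-order bits matches the observation in the notes that the lower address bits carry the most weight.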
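[Reference] Chang, M.-C., and Y.-W. Chou. “Branch Prediction Using Both Global and Local Branch History Information.” Computers and Digital Techniques, IEE Proceedings - 149, no. 2 (March 2002): 33–38. doi:10.1049/ip-cdt:20020273.
[Key points]
1. A fairly comprehensive survey; useful as a reference.
2. Makes use of both local and global history.
3. Gives examples showing that the global and the local method each have shortcomings.
4. Strength: analyzes the pros and cons of G (global) and L (local) theoretically, and proposes a scheme that combines the two.
5. Weakness: in simulation, not every result is better than G and L on their own.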
[Reference] Young, Cliff, and Michael D. Smith. “Static Correlated Branch Prediction.” ACM Trans. Program. Lang. Syst. 21, no. 5 (September 1999): 1028–75. doi:10.1145/330249.330255.
[Key points]
1. Definition of BP: Branch prediction, whether dynamic (hardware-based) or static (software-based), makes good guesses about likely branch targets and allows the instruction unit to fetch instructions early. When predictions are accurate, the execution engine can operate at full speed and performance improves.
2. Correlation: Perhaps the most promising branch-related discovery of the last 10 years is that branches exhibit correlation: the outcome of a conditional branch is often determined by the branch's historical pattern of outcomes or the historical pattern of outcomes of its neighboring branches.
3. Some compilers can also discover correlation within a program.
4. For compiler optimization, it is best to view a program as a graph rather than as a linear stream of instructions.
5. Performs control-flow graph (CFG) analysis on program segments to determine the frequency and correlation of "flows" (paths).
6. If we can collect statistics about path frequencies, then some optimizations can exploit these differences in behavior to produce more efficient versions of the final program.
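7. Uses the minimum amount of history information needed to determine correlation.
Finally, a control tree is derived programmatically to determine the global prediction scheme.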
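[Reference] Jiménez, Daniel A., and Calvin Lin. “Neural Methods for Dynamic Branch Prediction.” ACM Trans. Comput. Syst. 20, no. 4 (November 2002): 369–97. doi:10.1145/571637.571639.
[Key points]
1. Using general neural network methods alone would consume a lot of resources.
2. Uses the simplest method, the perceptron.
3. Also mentions the overriding method: two predictions are made, an early one and a later one, at the cost of some delay penalty.
4. Advantages: They are easier to understand, they are simpler to implement and tune, they train faster, and they are computationally much less expensive.
5. Points out that PHT-based methods grow exponentially with history length, whereas the perceptron grows only linearly.
6. A positive weight means positive correlation; a negative weight means negative correlation.
7. Surveys dynamic branch prediction methods, including two-level adaptive prediction. Drawback: aliasing.
8. We show that for a 4 KB hardware budget, a simple version of our method that uses a global history achieves a misprediction rate of 4.6% on the SPEC 2000 integer benchmarks, an improvement of 26% over gshare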
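The perceptron idea summarized in this entry can be sketched in a few lines. This is a minimal software model under stated assumptions, not the paper's hardware design: the class name `PerceptronPredictor`, the table size, and the modulo indexing are illustrative, and weight saturation is omitted for brevity. The training threshold uses the floor(1.93h + 14) rule reported by Jiménez and Lin for history length h.

```python
class PerceptronPredictor:
    """Table of perceptrons indexed by branch address, using global history."""

    def __init__(self, num_perceptrons=64, history_len=12):
        self.n = num_perceptrons
        self.theta = int(1.93 * history_len + 14)   # training threshold
        # weights[i][0] is the bias weight; weights[i][1:] pair with history bits.
        self.weights = [[0] * (history_len + 1) for _ in range(num_perceptrons)]
        self.history = [-1] * history_len           # +1 = taken, -1 = not taken

    def _output(self, pc):
        w = self.weights[pc % self.n]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        """Predict taken when the perceptron output is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self.weights[pc % self.n]
            w[0] += t
            for i, xi in enumerate(self.history, start=1):
                w[i] += t * xi   # agreement raises the weight, disagreement lowers it
        self.history = self.history[1:] + [t]
```

Each perceptron stores history_len + 1 weights, so storage grows linearly with history length, whereas a pattern history table indexed by the same history needs 2 ** history_len counters. The sign of a trained weight shows whether the corresponding history bit is positively or negatively correlated with the branch outcome.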
[Reference] Parikh, D., K. Skadron, Yan Zhang, and M. Stan. “Power-Aware Branch Prediction: Characterization and Design.” IEEE Transactions on Computers 53, no. 2 (February 2004): 168–86. doi:10.1109/TC.2004.1261827.
[Key points]
1. Yet, the branch predictor, including the BTB, is the size of a small cache and dissipates a nontrivial amount of power, typically about 7 percent and as much as 10 percent of the total processor’s power dissipation. (Points out that this cache-like storage consumes a lot of power.)
2. Lists several approaches to reducing power:
Accuracy: For a given predictor size, better prediction accuracy will not change the power in the predictor, but will make the program run faster, hence reducing total energy.
Configuration: Changing the table size(s) can reduce power within the predictor, but may affect accuracy.
Number of Lookups: Reducing the number of lookups into the predictor is an obvious source of power savings, but it must come with no performance penalty.
Number of Updates: Reducing the number of predictor updates is another obvious way to reduce power, but is less efficient because misspeculated computation means that there are many more lookups than updates; because of this aspect, we do not further consider updates in this paper.
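[Reference] Falcon, A, J. Stark, A Ramirez, Konrad Lai, and M. Valero. “Better Branch Prediction through Prophet/critic Hybrids.” IEEE Micro 25, no. 1 (January 2005): 80–89. doi:10.1109/MM.2005.5.
[Key points]
*** Important
1. Exploits the results of subsequent predictions.
2. Although multi-predictor designs already exist, they all use information from the same point in time.
3. Explains that the later predictor should be able to detect a percentage of the earlier predictor's mispredictions.
4. Drawbacks: needs two sets of predictors, which unavoidably requires memory and lengthens the pipeline; performance is tied to the amount of hardware. It does not break away from a RAM-centered predictor architecture.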
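[Reference] Jiménez, Daniel A. “Improved Latency and Accuracy for Neural Branch Prediction.” ACM Trans. Comput. Syst. 23, no. 2 (May 2005): 197–218. doi:10.1145/1062247.1062250.
[Key points]
1. States that neural methods have the highest accuracy.
2. States that neural methods are hard to put into practice because of their high prediction latency.
3. Points out that a perceptron can only learn linearly separable branches, and that linearly inseparable branches also exist.
4. Some branches are linearly inseparable, yet become separable once the program path is taken into account.
5. Combines path history and pattern history.
6. Points out that perceptrons need long histories; this method can work with a shorter history.
7. Calder et al. [1995] proposed using neural networks for static analysis.
8. Evers et al. 1998: perceptrons are more accurate than two-level predictors because their long-history analysis extracts correlation information effectively.
9. Path: uses the path leading to the branch rather than the branch address itself.
10. Points out that the main source of slow prediction is reading the cache.
11. Rosenblatt 1962: origin of the perceptron.
12. Jiménez and Lin 2001, 2002: use of perceptrons.
Improves the timing of weight selection so that the weight values are available sooner.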
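The linear separability limit mentioned in this entry can be seen in a toy experiment. The sketch below trains a two-input perceptron on an AND-like outcome (linearly separable) and on an XOR-like outcome (not linearly separable). The helper names `train_perceptron` and `accuracy` and the toy data are made up for illustration; this is not the path-based predictor from the paper.

```python
from itertools import product

def classify(w, x1, x2):
    """Return +1 (taken) or -1 (not taken) for inputs in {-1, +1}."""
    return 1 if w[0] + w[1] * x1 + w[2] * x2 >= 0 else -1

def train_perceptron(samples, epochs=100):
    """Classic perceptron rule: adjust the weights only on a mistake."""
    w = [0, 0, 0]
    for _ in range(epochs):
        for (x1, x2), t in samples:
            if classify(w, x1, x2) != t:
                w[0] += t
                w[1] += t * x1
                w[2] += t * x2
    return w

def accuracy(w, samples):
    """Fraction of samples the weight vector classifies correctly."""
    hits = sum(classify(w, x1, x2) == t for (x1, x2), t in samples)
    return hits / len(samples)

inputs = list(product([-1, 1], repeat=2))
# Outcome is taken only if both earlier branches were taken (separable).
and_like = [((a, b), 1 if (a == 1 and b == 1) else -1) for a, b in inputs]
# Outcome is taken only if the two earlier branches differed (not separable).
xor_like = [((a, b), 1 if a != b else -1) for a, b in inputs]

and_acc = accuracy(train_perceptron(and_like), and_like)
xor_acc = accuracy(train_perceptron(xor_like), xor_like)
```

No choice of three weights classifies all four XOR-like cases correctly, so `xor_acc` stays below 1.0 however long training runs, while the AND-like case is learned exactly. Adding path information, as the notes describe, is one way to make such branches separable again.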