Memory Hierarchy

  • Reference: Computer Architecture (6th Edition)

Contents

  • The Principle of Locality
  • Memory Hierarchy
    • Terminology
    • Performance Parameters of a Memory Hierarchy
    • Four Questions for Memory Hierarchy
      • Q1: Where can a block be placed in the upper level?
      • Q2: How is a block found if it is in the upper level?
      • Q3: Which block should be replaced on a miss?
      • Q4: What happens on a write?
  • Virtual Memory Address Space
    • Four Memory Hierarchy Questions Revisited
      • Q1: Where Can a Block Be Placed in Main Memory?
      • Q2: How Is a Block Found If It Is in Main Memory?
      • Q3: Which Block Should Be Replaced on a Virtual Memory Miss?
      • Q4: What Happens on a Write?
    • Caching vs. Demand Paging
  • Cache Design
    • Six Basic Cache Optimizations
    • What causes a MISS?
    • 1. Larger Block Size to Reduce Miss Rate
    • 2. Larger Caches to Reduce Miss Rate
    • 3. Higher Associativity to Reduce Miss Rate
    • 4. Multilevel Caches to Reduce Miss Penalty
    • 5. Giving Priority to Read Misses over Writes to Reduce Miss Penalty
    • 6. Avoiding Address Translation during Indexing of the Cache to Reduce Hit Time
    • 7. Victim Cache to Reduce Miss Rate
    • Summary of Basic Cache Optimization

The Principle of Locality


  • The Principle of Locality:
    • Programs access a relatively small portion of the address space at any instant of time.
    • It is a property of programs which is exploited in machine design.
  • Two Different Types of Locality:
    • Temporal Locality (Locality in Time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
    • Spatial Locality (Locality in Space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)

Memory Hierarchy


  • Goal: Illusion of large, fast, cheap memory



Memory Hierarchy: Apple iMac G5


Terminology

  • Hit: data appears in some block in the upper level (say, block X)
    • Hit Rate: the fraction of memory accesses found in the upper level
      • Usually so high that we talk about the miss rate instead
    • Hit Time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
  • Miss: data must be retrieved from a block in the lower level (block Y)
    • Miss Rate = 1 − (Hit Rate)
      • What MIPS is to CPU performance, miss rate is to average memory access time: a convenient but incomplete measure
    • Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
  • Hit Time << Miss Penalty

Performance Parameters of a Memory Hierarchy

  • Let $S$ denote capacity, $T_A$ access time, and $C$ cost per bit.
  • Consider a two-level memory hierarchy built from $M_1$ and $M_2$:
    • Parameters of $M_1$: $S_1$, $T_{A_1}$, $C_1$
    • Parameters of $M_2$: $S_2$, $T_{A_2}$, $C_2$
  • Average cost per bit $C$ (the capacity-weighted average of the two levels):
    $C = (C_1 S_1 + C_2 S_2) / (S_1 + S_2)$
  • Hit rate $H$ and miss rate $F$:
    $H = N_1 / (N_1 + N_2)$, $F = 1 - H$
    • $N_1$: number of accesses satisfied by $M_1$; $N_2$: number of accesses that must go to $M_2$
  • Average memory access time = Hit time + Miss rate × Miss penalty (see the sketch after this list):
    $T_A = H\,T_{A_1} + (1-H)(T_{A_1} + T_M) = T_{A_1} + (1-H)\,T_M = T_{A_1} + F\,T_M$
    • Miss penalty $T_M$: time to fetch a block from the lower level, including the time to deliver it to the CPU
      • access time: time to reach the lower level = $f$(latency of the lower level)
      • transfer time: time to transfer the block = $f$(bandwidth between the upper and lower levels)
      • The time from issuing a request to $M_2$ until the whole block resides in $M_1$ is $T_M = T_{A_2} + T_B$, where $T_B$ is the time to transfer one block (it varies with the amount of data and the bus width)
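A minimal sketch of these formulas in Python (the function names and example numbers are mine, not from the text):

```python
def avg_cost_per_bit(c1, s1, c2, s2):
    """Average cost per bit: C = (C1*S1 + C2*S2) / (S1 + S2)."""
    return (c1 * s1 + c2 * s2) / (s1 + s2)

def avg_access_time(t_a1, t_m, miss_rate):
    """T_A = T_A1 + F * T_M: hit time plus miss rate times miss penalty."""
    return t_a1 + miss_rate * t_m

# A 1-cycle upper level backed by a 50-cycle miss penalty, at a 2% miss rate:
print(avg_access_time(t_a1=1, t_m=50, miss_rate=0.02))  # 2.0 cycles
```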

Program Execution Time

  • CPU time = (CPU execution cycles + memory stall cycles) × clock cycle time
    • Memory stall cycles = memory accesses × miss rate × miss penalty

  • Example: suppose the cache miss penalty is 50 clock cycles; ignoring memory stalls, every instruction takes 2.0 clock cycles to execute; the cache miss rate is 2%; and each instruction makes 1.33 memory accesses on average (every instruction must access the instruction cache, but not necessarily the data cache). Analyze the impact of the cache on performance.

  • CPU time = IC × (2.0 + 1.33 × 2% × 50) × clock cycle time = IC × 3.33 × clock cycle time
  • Effective CPI: 3.33
  • Without a cache, however, every access pays the full penalty: CPI = 2.0 + 1.33 × 50 = 68.5 (both cases are scripted below)
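The same arithmetic as a quick check, using the numbers from the example:

```python
miss_penalty = 50              # clock cycles
base_cpi = 2.0                 # CPI with no memory stalls
miss_rate = 0.02
accesses_per_instruction = 1.33

cpi_with_cache = base_cpi + accesses_per_instruction * miss_rate * miss_penalty
cpi_without_cache = base_cpi + accesses_per_instruction * miss_penalty

print(cpi_with_cache)          # 3.33 (up to float rounding)
print(cpi_without_cache)       # 68.5
```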

Four Questions for Memory Hierarchy

Q1: Where can a block be placed in the upper level?

  • Block placement: the mapping rule

  • Fully associative: any block of main memory may be placed in any position in the cache.
    • Highest space utilization, lowest conflict probability, most complex implementation.
  • Direct mapped: each block of main memory can be placed in exactly one position in the cache. Block $i$ of main memory maps to cache block $j = i \bmod M$ (where $M$ is the number of cache blocks). If $M = 2^m$, then in binary $j$ is simply the low $m$ bits of $i$.
    • Lowest space utilization, highest conflict probability, simplest implementation.
  • Set associative: each block of main memory maps to exactly one set in the cache, but may be placed in any position within that set. If block $i$ maps to set $k$, then $k = i \bmod G$ (where $G$ is the number of sets). If $G = 2^g$, then in binary $k$ is the low $g$ bits of $i$.
    • A compromise between direct mapping and full associativity: the higher the associativity, the higher the space utilization, the lower the conflict probability, and the lower the miss rate (see the sketch below).
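The two index computations as code (a toy illustration; the names are mine):

```python
def direct_mapped_block(i, num_blocks):
    """Direct mapped: block i may live only in cache block i mod M."""
    return i % num_blocks

def set_associative_set(i, num_sets):
    """Set associative: block i maps to set i mod G, any way within the set.
    Fully associative is the num_sets == 1 special case: any block, anywhere."""
    return i % num_sets

print(direct_mapped_block(77, 64))  # 13
print(set_associative_set(77, 16))  # 13
```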

Q2: How is a block found if it is in the upper level?

  • Block identification: the lookup algorithm


  • Tag is the block identifier; Index is the set address (the address split is sketched below)
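A sketch of how an address might be split into tag / index / block-offset fields (the field widths below are hypothetical, chosen only for illustration):

```python
def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# e.g. 32-byte blocks (5 offset bits) and 128 sets (7 index bits)
print(split_address(0x12A74, offset_bits=5, index_bits=7))  # (18, 83, 20)
```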

Fully Associative Cache


2-Way Set-Associative Cache


Direct-Mapped Cache


Q3: Which block should be replaced on a miss?

  • Block replacement: the replacement policy

  • Easy for Direct Mapped, no choice
  • Set Associative or Fully Associative:
    • Random
    • First in, first out (FIFO)
    • LRU (Least Recently Used); a toy victim picker is sketched below
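A minimal sketch of LRU victim selection for one set, assuming a per-way timestamp of the most recent access (an assumption for illustration; real hardware uses cheaper approximations):

```python
def lru_victim(last_used):
    """Pick the way whose most recent access is oldest.
    last_used[w] = timestamp of the latest access to way w."""
    return min(range(len(last_used)), key=lambda w: last_used[w])

print(lru_victim([105, 42, 99, 87]))  # way 1 was touched longest ago
```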

Q4: What happens on a write?

  • Write strategy

  • Write through: the information is written to both the block in the cache and the block in the lower-level memory.
  • Write back: the information is written only to the block in the cache; the modified block is written to the lower level only when it is replaced.


  • Additional option – let writes to an un-cached address allocate a new cache line (“write-allocate”), i.e., a write to data that is not currently in the cache.

Write allocate and No-write allocate

  • Write allocate (fetch on write)
    • The block is allocated on a write miss: the block containing the written word is first fetched into the cache, and then the write is performed.
    • Write-back caches generally use this, so the data being written is always in the cache.
  • No-write allocate (write around)
    • In this apparently unusual alternative, write misses do not affect the cache. Instead, the block is modified only in the lower-level memory.
    • Write-through caches often use this.

Write Policy Choices

  • Cache hit: write through / write back
  • Cache miss: no write allocate / write allocate
  • Common combinations (a toy sketch of both follows this list):
    • write through & no write allocate
    • write back & write allocate
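A minimal model of the two combinations, assuming dict-backed cache and memory (an illustration only, not a real simulator):

```python
def write_through_no_allocate(cache, memory, addr, value):
    """Write through + no-write allocate: memory is always updated;
    the cached copy is updated only if the block is already present."""
    memory[addr] = value
    if addr in cache:
        cache[addr] = value

def write_back_allocate(cache, dirty, memory, addr, value):
    """Write back + write allocate: a write miss first fetches the block,
    then the write dirties only the cached copy."""
    if addr not in cache:
        cache[addr] = memory.get(addr, 0)  # allocate: fetch on write
    cache[addr] = value
    dirty.add(addr)                        # written back to memory on eviction
```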

Write Buffers for Write-Through Caches


  • Q. Why a write buffer?
    • So CPU doesn’t stall
  • Q. Why a buffer, why not just one register?
    • Bursts of writes are common.
  • Q. Are Read After Write (RAW) hazards an issue for write buffer?
    • Yes! Either drain the buffer before the next read, or check the write buffer and send the read first when there is no conflict.

Virtual Memory Address Space


  • User programs run in a standardized virtual address space (each process has its own virtual address space)
  • Address-translation hardware, managed by the operating system (OS), maps virtual addresses to physical memory
  • The hardware supports “modern” OS features: protection, translation, sharing

Use virtual addresses for cache?

  • A. The synonym problem: if two address spaces share a physical frame, data may be in the cache twice. Maintaining consistency is a nightmare.

Four Memory Hierarchy Questions Revisited

Q1: Where Can a Block Be Placed in Main Memory?

  • Operating systems allow blocks to be placed anywhere in main memory. The strategy is fully associative.

Q2: How Is a Block Found If It Is in Main Memory?

  • Both paging and segmentation rely on a data structure that is indexed by the page or segment number.
    • For paging, this data structure is a page table, which contains the physical page address. It is indexed by the virtual page number, and its size is the number of pages in the virtual address space.
      • Given a 32-bit virtual address, 4 KB pages, and 4 bytes per Page Table Entry (PTE), the size of the page table would be $(2^{32}/2^{12}) \times 2^2 = 2^{22}$ bytes, or 4 MB (checked in the snippet below).
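The same size computation, spelled out:

```python
virtual_address_bits = 32
page_size = 4 * 1024      # 4 KB pages -> 12 offset bits
pte_bytes = 4             # bytes per page table entry

entries = 2**virtual_address_bits // page_size   # 2^20 pages
table_bytes = entries * pte_bytes                # 2^22 bytes
print(entries, table_bytes, table_bytes // 2**20, "MB")  # 1048576 4194304 4 MB
```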

Q3: Which Block Should Be Replaced on a Virtual Memory Miss?

  • Replace the least-recently used (LRU) page.
    • A use bit or reference bit is provided. The operating system periodically clears the bits and later records them so it can determine which pages were touched during a particular time period.

Q4: What Happens on a Write?

  • Because of the great discrepancy in access time, the write strategy is always write back.
    • Virtual memory systems usually include a dirty bit. It allows blocks to be written to disk only if they have been altered since being read from the disk.

Page replacement policy


Caching vs. Demand Paging


Cache Design

Six Basic Cache Optimizations

  • Average memory access time = Hit time + Miss rate x Miss penalty

Reducing Miss Rate

  • (1) Larger Block size (compulsory misses)
  • (2) Larger Cache size (capacity misses)
  • (3) Higher Associativity (conflict misses)

Reducing Miss Penalty

  • (4) Multilevel Caches
  • (5) Giving Reads Priority over Writes
    • E.g., let reads complete before earlier writes in the write buffer

Reducing hit time

  • (6) Avoiding address translation when indexing the cache

What causes a MISS?

Three Major Categories of Cache Misses:

  • Compulsory — the very first access to a block cannot be in the cache, so the block must be brought into the cache. These are also called cold-start misses or first-reference misses.
  • Capacity — if the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur because of blocks being discarded and later retrieved.
  • Conflict — if the block placement strategy is set associative or direct mapped, conflict misses will occur because a block may be discarded and later retrieved if too many blocks map to its set. These misses are also called collision misses.

1. Larger Block Size to Reduce Miss Rate

  • Larger block sizes will reduce compulsory misses, because larger blocks take advantage of spatial locality.

Larger Blocks may Increase Conflict Misses

  • Since they reduce the number of blocks in the cache, larger blocks may increase conflict misses and even capacity misses if the cache is small.

Note that for the 16 KB and 4 KB caches, an overly large block size actually drives the miss rate back up: with limited capacity, very large blocks mean the cache can hold only a few of them, which may induce the other kinds of misses.


Larger Blocks may Increase the Miss Penalty

  • Assume the memory system takes 80 clock cycles of overhead and then delivers 16 bytes every 2 clock cycles. Thus, it can supply 16 bytes in 82 clock cycles, 32 bytes in 84 clock cycles, and so on.
    • 16-byte block → miss penalty: 82; 32-byte block → miss penalty: 84 (the arithmetic is scripted below)
    • The selection of block size depends on both the latency and bandwidth of the lower-level memory.
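The miss-penalty arithmetic from this example as a function:

```python
def miss_penalty(block_bytes, overhead=80, bytes_per_beat=16, cycles_per_beat=2):
    """Cycles to fetch one block: fixed overhead plus one beat per 16 bytes."""
    return overhead + (block_bytes // bytes_per_beat) * cycles_per_beat

for size in (16, 32, 64, 128):
    print(size, miss_penalty(size))  # 82, 84, 88, 96
```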

2. Larger Caches to Reduce Miss Rate

  • The obvious way to reduce capacity misses is to increase the capacity of the cache.
    • The obvious drawback is potentially longer hit time and higher cost and power.
    • This technique has been especially popular in off-chip caches.

3. Higher Associativity to Reduce Miss Rate

  • Figure shows how miss rates improve with higher associativity

Higher Associativity Increase the Clock Cycle Time

  • Greater associativity can come at the cost of increased hit time (the trade-off is sketched after this list):
    • Clock cycle time 2-way = 1.36 × Clock cycle time 1-way
    • Clock cycle time 4-way = 1.44 × Clock cycle time 1-way
    • Clock cycle time 8-way = 1.52 × Clock cycle time 1-way
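A back-of-the-envelope AMAT comparison of that trade-off. The clock-cycle factors are from the list above; the miss rates and the 25-cycle miss penalty are hypothetical, made up purely for illustration:

```python
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

cycle_factor = {1: 1.00, 2: 1.36, 4: 1.44, 8: 1.52}   # from the list above
miss_rate = {1: 0.050, 2: 0.041, 4: 0.038, 8: 0.037}  # hypothetical values

for ways in (1, 2, 4, 8):
    # hit time scales with the clock cycle; miss penalty fixed at 25 cycles
    print(ways, round(amat(cycle_factor[ways], miss_rate[ways], 25), 3))
```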

  • The hardware complexity is too high; this is also why machines today have at most about three levels of cache

4. Multilevel Caches to Reduce Miss Penalty

  • The performance gap between processors and memory leads the architect to this question: Should I make the cache faster to keep pace with the speed of processors, or make the cache larger to overcome the widening gap between the processor and main memory?
    • One answer is, do both. Adding another level of cache between the original cache and memory simplifies the decision.
    • The first-level cache can be small enough to match the clock cycle time of the fast processor. Yet the second-level cache can be large enough to capture many accesses that would go to main memory, thereby lessening the effective miss penalty.

A Typical Memory Hierarchy


Multilevel Caches in Multicore Architectures


This gives rise to the cache-coherence problem, which is covered later.


It Complicates Performance Analysis

  • Average memory access time = $\textrm{Hit time}_{L1} + \textrm{Miss rate}_{L1} \times (\textrm{Hit time}_{L2} + \textrm{Miss rate}_{L2} \times \textrm{Miss penalty}_{L2})$ (see the snippet below)
    • Local miss rate: the miss rate measured relative to the accesses that reach a given cache, e.g., $\textrm{Miss rate}_{L1}$ and $\textrm{Miss rate}_{L2}$.
    • Global miss rate: for the first-level cache it is still just $\textrm{Miss rate}_{L1}$, but for the second-level cache it is $\textrm{Miss rate}_{L1} \times \textrm{Miss rate}_{L2}$.
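The two-level formula in code, with hypothetical numbers (the 1/10/200-cycle latencies and the miss rates below are illustrative, not from the text):

```python
def two_level_amat(hit_l1, local_miss_l1, hit_l2, local_miss_l2, penalty_l2):
    """AMAT = HitTime_L1 + MissRate_L1 * (HitTime_L2 + MissRate_L2 * MissPenalty_L2)."""
    return hit_l1 + local_miss_l1 * (hit_l2 + local_miss_l2 * penalty_l2)

# 1-cycle L1 with 4% local misses; 10-cycle L2 with 50% local misses; 200-cycle memory.
# Global L2 miss rate = 0.04 * 0.5 = 2% of all accesses.
print(two_level_amat(1, 0.04, 10, 0.5, 200))  # 1 + 0.04 * (10 + 100) = 5.4
```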

5. Giving Priority to Read Misses over Writes to Reduce Miss Penalty

e.g., let reads complete before earlier writes in the write buffer

write-through cache

  • With a write-through cache the most important improvement is a write buffer of the proper size. Write buffers do complicate memory accesses because they might hold the updated value of a location needed on a read miss.
  • Assume a direct-mapped, write-through cache that maps 512 and 1024 to the same block, and a four-word write buffer that is not checked on a read miss. Will the value in R2 always be equal to the value in R3?
    • Not necessarily; it depends on when the pending data in the write buffer is written back to memory:
SW R3, 512(R0)		; M[512] ← R3 (cache index 0) (the write goes into the write buffer)
LW R1, 1024(R0)		; R1 ← M[1024] (cache index 0) (block 1024 replaces block 512 in the cache)
LW R2, 512(R0)		; R2 ← M[512] (cache index 0) (may see stale memory if the buffer has not drained)

Solve the Problem

  • The simplest way is for the read miss to wait until the write buffer is empty.
  • The alternative is to check the contents of the write buffer on a read miss, and if there are no conflicts and the memory system is available, let the read miss continue. (If there is a conflict, a more aggressive option is to forward the data directly from the write buffer; see the sketch below.)
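A minimal sketch of the checking approach (dict-based write buffer; the names are mine):

```python
def read_on_miss(addr, write_buffer, read_memory):
    """On a read miss, check the write buffer first: if the address has a
    pending write, forward that value; otherwise read from memory."""
    if addr in write_buffer:
        return write_buffer[addr]   # aggressive option: forward the pending write
    return read_memory(addr)

buf = {512: 0xAB}                   # SW R3, 512(R0) is still buffered
print(read_on_miss(512, buf, lambda a: 0))    # 171 (0xAB), from the buffer
print(read_on_miss(1024, buf, lambda a: 0))   # 0, from memory
```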

write-back cache

  • The cost of writes by the processor in a write-back cache can also be reduced. Suppose a read miss will replace a dirty memory block. Instead of writing the dirty block to memory, and then reading memory, we could copy the dirty block to a buffer, then read memory, and then write memory. This way the processor read, for which the processor is probably waiting, will finish sooner.

6. Avoiding Address Translation during Indexing of the Cache to Reduce Hit Time

  • Cache must cope with the translation of a virtual address from the processor to a physical address to access memory.

virtual caches vs. physical caches

  • Using virtual addresses for the cache eliminates address translation time from a cache hit. However, each process has its own virtual address space, so different virtual addresses may refer to the same physical address. There are two solutions:
    • (1) One solution: when a process is switched, the virtual addresses now refer to different physical addresses, requiring the cache to be flushed.
    • (2) The alternative is to increase the width of the cache address tag with a process-identifier tag (PID).

In the end, virtual caches never became popular, because of the cache-consistency problems they create.

7. Victim Cache to Reduce Miss Rate

  • 基本思想:在 Cache 和它从下一级存储器调数据的通路之间设置一个全相联的小 Cache,用于存放被替换出去的块(称为 Victim),以备重用
    • 对于减小冲突失效很有效,特别是对于小容量的直接映象数据 Cache,作用尤其明显
    • 例如,项数为 4 的 Victim Cache: 使 4KB Cache 的冲突失效减少 20%~90%
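A toy victim cache to make the flow concrete (FIFO replacement inside the buffer is my simplification; the class and method names are mine):

```python
from collections import deque

class VictimCache:
    """Small fully associative buffer holding recently evicted blocks."""
    def __init__(self, entries=4):
        self.entries = entries
        self.blocks = deque()           # (tag, data) pairs

    def insert(self, tag, data):
        """Called when the main cache evicts a block."""
        if len(self.blocks) == self.entries:
            self.blocks.popleft()       # oldest victim falls out
        self.blocks.append((tag, data))

    def lookup(self, tag):
        """Checked on a main-cache miss; a hit lets the block be swapped back."""
        for t, data in self.blocks:
            if t == tag:
                return data
        return None
```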

Summary of Basic Cache Optimization

  • No optimization in this figure helps more than one category.

+ meaning that the technique improves the factor, – meaning that it hurts that factor

