Storage System

Storage hierarchy

Cache, memory -> hard disks, SSD, Tape, Optical Disk
(the levels differ in read/write speed and cost)

Access time

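Time taken before the drive is ready to transfer data
(the time a physical device such as a hard disk or memory spends locating the target position before the data transfer begins)
Typically:
Memory: nanoseconds
SSD: microseconds
HDD: milliseconds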

Access times.png

Storage device information

  • Characteristics of a storage device:

    • Capacity (bytes)
    • Cost (price per byte of storage)
    • Bandwidth (number of bytes that can be transferred per second; read bandwidth and write bandwidth generally differ)
    • Latency (waiting time for the response/delivery of data)
  • Basic operations: CRUD (create, read, update, delete)

  • Time to complete an operation depends on both bandwidth and latency:
    CompletionTime = Latency + Size / Bandwidth
    Influencing factors:
    technology (HDD or SSD); operation type (read or write); number of operations in the workload; access pattern (sequential or random)

  • Access pattern:

    1. Sequential: the data to be accessed are located next to each other (in sequence) on the device
    2. Random: the data to be accessed are scattered across the storage device
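
To make the CompletionTime formula above concrete, here is a minimal sketch. The device figures (5 ms latency, 100 MB/s bandwidth) are hypothetical values chosen for illustration, not measurements from these notes, and 1 MB is taken as 10^6 bytes.

```python
def completion_time_s(size_bytes: float, latency_s: float,
                      bandwidth_bytes_per_s: float) -> float:
    """CompletionTime = Latency + Size / Bandwidth (all in seconds/bytes)."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Hypothetical device: 5 ms latency, 100 MB/s bandwidth (assumed values).
LATENCY_S = 0.005
BANDWIDTH = 100e6

small = completion_time_s(4_000, LATENCY_S, BANDWIDTH)          # 4 KB request
large = completion_time_s(1_000_000_000, LATENCY_S, BANDWIDTH)  # 1 GB request

print(f"4 KB request: {small * 1000:.2f} ms")
print(f"1 GB request: {large:.3f} s")
```

For the small request almost all of the time is latency, while for the large one the Size/Bandwidth term dominates. This is why many small random operations are so much slower than one large sequential operation.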

Hard Disk Drive

HDD structure.png
  • One or more spinning magnetic platters
    • Typically two surfaces per platter
  • The disk arm positions the head over the radial position (track) where the data are stored
    • It swings across tracks (but does not extend or shrink)
  • Data are read/written by a read/write head as the platter spins

Hard disk head movement while copying files between two folders: https://www.youtube.com/watch?v=BlB49F6ExkQ

  • Physical characteristics:
    form factor: 2.5" in laptops, 3.5" common in desktops
    rotational speed: 4,800/5,400/7,200/10,000 RPM (rotations per minute)
    number of platters: 5~7
    current capacity: 10 TB (Western Digital)

  • Disk organization: platter -> tracks -> sectors
    Each platter consists of a number of tracks;
    Each track is divided into N fixed-size sectors (sector size: 4 KB)

CHS (cylinder-head-sector)

An early way to address a sector (Logical Block Addressing, LBA, is more common now)


CHS structure.png
Example:
number of cylinders: 256
number of heads: 16 (i.e., 8 platters, 2 heads/platter)
number of sectors per track: 64
sector size: 4 KB

Capacity of the drive:
C * H * S * sector size = 2^8 * 2^4 * 2^6 * 2^12 = 2^30 bytes = 1 GB
In general: capacity = C * H * S * sector size
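
The capacity calculation above can be checked with a few lines of Python. The function name `chs_capacity_bytes` is just an illustrative helper, and 4 KB is taken as 4 * 1024 bytes so that the result matches 2^30.

```python
def chs_capacity_bytes(cylinders: int, heads: int,
                       sectors_per_track: int, sector_size: int) -> int:
    """capacity = C * H * S * sector size."""
    return cylinders * heads * sectors_per_track * sector_size


# Values from the example: 256 cylinders, 16 heads, 64 sectors/track, 4 KB sectors.
capacity = chs_capacity_bytes(256, 16, 64, 4 * 1024)

print(capacity)            # total number of bytes
print(capacity == 2**30)   # True: exactly 1 GB (2^30 bytes)
```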

The CHS address tells the drive where the data are; the data must be located first, and only then can they be transferred. The total time is therefore:

T = Tseek + Trotation + Ttransfer
Tseek: time to get the disk head onto the right track
Trotation: time to wait for the right sector to rotate under the head
Ttransfer: time to actually transfer the data

  1. Rotational latency: waiting for the right sector to rotate under the head
    On average: about 1/2 of the time of a full rotation


    rotation.png
    Example:
    Assume 10,000 RPM (rotations per minute)
    60,000 ms / 10,000 rotations = 6 ms per rotation, so the average rotational latency is about 3 ms
  2. Seek time (across multiple tracks): waiting for the head to reach the right track
    On average, the seek time is about 1/3 of the maximum seek time


    seek the track.png

  3. Transfer time (determined by the transfer bandwidth)

    Example: 512 KB of data to transfer at a transfer bandwidth of 128 MB/s
    Transfer time: 512 KB / (128 MB/s) = 0.5 MB / (128 MB/s) ≈ 3.9 ms ≈ 4 ms
  4. Actual bandwidth
    Actual bandwidth = amount of data / actual time
    Actual time = Tseek + Trotation + Ttransfer
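
The three components and the actual bandwidth can be put together in a short sketch. The rotation speed, data size, and bandwidth come from the examples above; the 7 ms seek time is an assumed value for illustration, and 1 MB is taken as 1024 KB here.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time of one full rotation in milliseconds."""
    return 60_000 / rpm


def avg_rotational_latency_ms(rpm: float) -> float:
    """On average the head waits for half a rotation."""
    return rotation_time_ms(rpm) / 2


def transfer_time_ms(size_mb: float, bandwidth_mb_per_s: float) -> float:
    """Time to move size_mb at the given transfer bandwidth."""
    return size_mb / bandwidth_mb_per_s * 1000


def access_time_ms(seek_ms: float, rpm: float,
                   size_mb: float, bandwidth_mb_per_s: float) -> float:
    """T = Tseek + Trotation + Ttransfer."""
    return (seek_ms + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(size_mb, bandwidth_mb_per_s))


def actual_bandwidth_mb_per_s(size_mb: float, total_ms: float) -> float:
    """Amount of data divided by the actual time."""
    return size_mb / (total_ms / 1000)


SIZE_MB = 512 / 1024      # 512 KB
SEEK_MS = 7.0             # assumed average seek time

total = access_time_ms(SEEK_MS, 10_000, SIZE_MB, 128)
print(f"rotation latency: {avg_rotational_latency_ms(10_000):.1f} ms")
print(f"transfer time:    {transfer_time_ms(SIZE_MB, 128):.2f} ms")
print(f"total time:       {total:.2f} ms")
print(f"actual bandwidth: {actual_bandwidth_mb_per_s(SIZE_MB, total):.1f} MB/s")
```

The actual bandwidth comes out well below the nominal 128 MB/s, because seek and rotation take longer than the transfer itself for a request this small.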

Sector vs. Block

  • A block is the smallest unit of the file system
  • A sector is the smallest unit of the hard disk
  • A block consists of one or more sectors

Sequential vs. Random

Sequential operation:

  • May assume all sectors involved are on the same track
    -- Need to seek to the right track and rotate to the first sector once
    -- But no seeking/rotation is needed afterward

Random operation: may assume the sectors involved are all on different tracks and at different rotational positions, so every access pays a seek and a rotational delay

Example: 7 ms average seek, 10,000 RPM, 50 MB/s transfer rate, 4 KB per block (taking 1 MB = 1000 KB)
Sequential access of 10 MB:
– Completion time = 7 ms + (60 * 1000 / 10000) / 2 ms + (10 / 50) * 1000 ms = 7 + 3 + 200 = 210 ms
– Actual bandwidth = 10 MB / 210 ms ≈ 47.62 MB/s

Random access of 10 MB:
– Number of blocks: 10 * 1000 / 4 = 2500 (assume 1 block = 1 sector)
– Completion time = 2500 * (7 + 3 + 4/50) ms = 25,200 ms = 25.2 s
– Actual bandwidth = 10 MB / 25.2 s ≈ 0.397 MB/s
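
The sequential and random calculations above can be reproduced with the following sketch. It uses the same figures as the example and the same convention of 1 MB = 1000 KB; the function and constant names are only illustrative.

```python
SEEK_MS = 7.0            # average seek time
RPM = 10_000             # rotational speed
BANDWIDTH_MB_S = 50.0    # transfer rate
BLOCK_KB = 4             # block size (1 block = 1 sector assumed)
KB_PER_MB = 1000         # convention used in the example above

ROTATION_MS = 60_000 / RPM / 2   # average rotational latency


def sequential_ms(size_mb: float) -> float:
    """One seek and one rotational delay, then a continuous transfer."""
    return SEEK_MS + ROTATION_MS + size_mb / BANDWIDTH_MB_S * 1000


def random_ms(size_mb: float) -> float:
    """Every block pays seek + rotation + its own small transfer."""
    blocks = size_mb * KB_PER_MB / BLOCK_KB
    block_transfer_ms = (BLOCK_KB / KB_PER_MB) / BANDWIDTH_MB_S * 1000
    return blocks * (SEEK_MS + ROTATION_MS + block_transfer_ms)


def bandwidth_mb_s(size_mb: float, time_ms: float) -> float:
    """Actual bandwidth = amount of data / actual time."""
    return size_mb / (time_ms / 1000)


seq = sequential_ms(10)
rnd = random_ms(10)
print(f"sequential: {seq:.0f} ms, {bandwidth_mb_s(10, seq):.2f} MB/s")
print(f"random:     {rnd / 1000:.1f} s, {bandwidth_mb_s(10, rnd):.3f} MB/s")
```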

Solid State Drive

SSD.png
  • All electronic, made from flash memory
  • Limited lifetime: the cells can only be written a limited number of times
  • Significantly better latency: no seek or rotational delay
  • Much better performance on random access (however, writes have much higher latency than reads)
Speed comparison between read and write.png

Structure of SSD

  • An SSD contains a number of flash memory chips
    chip -> dies -> planes -> blocks -> pages (rows) -> cells
    • Typically, a chip may have 1, 2, or 4 dies
    • A die may have 1 or 2 planes
    • A plane has a number of blocks
    • A block has a number of pages
    • A page has a number of cells
Die Layout.png
  • A page is the smallest unit of data transfer between the SSD and main memory
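
As a rough illustration of how the hierarchy multiplies up to the drive capacity, here is a small sketch. Every count in it (4 chips, 2 dies per chip, 1024 blocks per plane, and so on) is a made-up value for illustration only and does not describe any real SSD.

```python
from dataclasses import dataclass


@dataclass
class SsdGeometry:
    """Hypothetical SSD layout: chip -> dies -> planes -> blocks -> pages."""
    chips: int
    dies_per_chip: int
    planes_per_die: int
    blocks_per_plane: int
    pages_per_block: int
    page_size_bytes: int

    def total_pages(self) -> int:
        return (self.chips * self.dies_per_chip * self.planes_per_die
                * self.blocks_per_plane * self.pages_per_block)

    def capacity_bytes(self) -> int:
        return self.total_pages() * self.page_size_bytes


# All numbers below are assumptions chosen to give a round result.
geometry = SsdGeometry(chips=4, dies_per_chip=2, planes_per_die=2,
                       blocks_per_plane=1024, pages_per_block=128,
                       page_size_bytes=4096)

print(geometry.total_pages())
print(geometry.capacity_bytes() / 2**30, "GiB")
```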

How data is stored in SSD

  • Cells are made of floating-gate transistors: by applying a high positive/negative voltage to the control gate, electrons can be attracted to or repelled from the floating gate
    • State = 1 if there are no electrons in the floating gate
    • State = 0 if there are electrons (negative charges)
      – The electrons stay trapped there even when the power is off
      – So the state is retained
  • Data on an SSD are thus stored as bit patterns such as '101010...', where each bit is the charge state of a cell
floating-gate transistor.png

Read Operations

  • Electrons on the floating gate affect the threshold voltage for the floating gate transistor to conduct
  • Higher voltage needed when gate has electrons


    Read operation.png
Steps:
  • Apply Vint (an intermediate voltage) to the control gate
  • If a current is detected, the gate has no electrons => bit = 1
  • If there is no current, the gate must have electrons => bit = 0
  • A page is the smallest unit that can be read (I skip the further details here.)
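
The read steps above can be modelled in a few lines. This is only a toy model: the voltage values are hypothetical numbers picked so that Vint lies between the two threshold voltages, and a cell is represented simply by whether its floating gate holds electrons.

```python
# Hypothetical voltages (arbitrary units), chosen only for illustration.
V_THRESHOLD_EMPTY = 1.0    # threshold when the floating gate has no electrons
V_THRESHOLD_CHARGED = 3.0  # higher threshold when electrons are present
V_INT = 2.0                # intermediate read voltage between the two


def conducts(has_electrons: bool, gate_voltage: float) -> bool:
    """The transistor conducts only if the gate voltage exceeds its threshold."""
    threshold = V_THRESHOLD_CHARGED if has_electrons else V_THRESHOLD_EMPTY
    return gate_voltage > threshold


def read_bit(has_electrons: bool) -> int:
    """Apply Vint: current detected => bit 1, no current => bit 0."""
    return 1 if conducts(has_electrons, V_INT) else 0


def read_page(cells: list[bool]) -> list[int]:
    """A page is read as a whole; each entry says whether the cell is charged."""
    return [read_bit(cell) for cell in cells]


print(read_page([False, True, False, True]))
```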

Write and erase

  • Write: 1 => 0
    – Apply high positive voltage (>> voltage for read) to the control gate
    – Attract electrons from channel to floating gate (through quantum tunneling)
    – Page is the smallest unit for write

  • Erase: 0 => 1 (empty the floating gate of electrons)
    – Need to apply a much higher negative voltage to the control gate
    – Removes the electrons from the floating gate
    – May stress surrounding cells (dangerous to do on individual pages)
    – Block is the smallest unit for erase
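
The asymmetry between write and erase can be sketched with a toy model of one flash block. The class name `FlashBlock` and the tiny page sizes are hypothetical; the model only enforces the two rules above (a write can change bits from 1 to 0 only, and only a whole-block erase brings them back to 1) and ignores everything else a real controller does.

```python
class FlashBlock:
    """Toy model of one flash block: page-level writes, block-level erase."""

    def __init__(self, pages: int = 4, cells_per_page: int = 8):
        # Erased cells hold no electrons, so every bit starts as 1.
        self.pages = [[1] * cells_per_page for _ in range(pages)]
        self.erase_count = 0

    def program_page(self, page_index: int, bits: list[int]) -> None:
        """Write a whole page; only 1 -> 0 transitions are possible."""
        page = self.pages[page_index]
        if len(bits) != len(page):
            raise ValueError("a page must be written as a whole")
        for old, new in zip(page, bits):
            if old == 0 and new == 1:
                raise ValueError("0 -> 1 requires erasing the whole block")
        self.pages[page_index] = list(bits)

    def erase(self) -> None:
        """Erase the whole block: every cell in every page goes back to 1."""
        self.pages = [[1] * len(page) for page in self.pages]
        self.erase_count += 1


block = FlashBlock(pages=2, cells_per_page=4)
block.program_page(0, [1, 0, 1, 0])
try:
    block.program_page(0, [1, 1, 1, 0])   # would need a 0 -> 1 change
except ValueError as err:
    print("write rejected:", err)
block.erase()
print(block.pages)
```

Updating data in place is therefore not possible: changing a 0 back to a 1 means erasing the whole block that contains the page.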

P/E cycle (1->0->1->0...)

P: program/write;
E: erase

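  • What is a P/E cycle?
    Data are written to cells (P): the cell value goes from 1 -> 0
    Then the cells are erased (E): 0 -> 1
  • Why does the P/E cycle matter?
    Every write and erase damages the oxide layer surrounding the floating gate to some extent, so a cell survives only a limited number of P/E cycles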
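
A minimal sketch of what a limited number of P/E cycles means in practice: a block counts its cycles and refuses further use once a limit is reached. The class name and the limit of 3 cycles are purely hypothetical, chosen to keep the example short.

```python
class WearTrackedBlock:
    """Toy block that wears out after a fixed number of P/E cycles."""

    def __init__(self, max_pe_cycles: int):
        self.max_pe_cycles = max_pe_cycles  # hypothetical endurance limit
        self.pe_cycles = 0

    @property
    def worn_out(self) -> bool:
        return self.pe_cycles >= self.max_pe_cycles

    def program_erase(self) -> None:
        """One full cycle: program (1 -> 0), then erase (0 -> 1)."""
        if self.worn_out:
            raise RuntimeError("block is worn out")
        self.pe_cycles += 1


worn = WearTrackedBlock(max_pe_cycles=3)
for _ in range(3):
    worn.program_erase()
print(worn.pe_cycles, worn.worn_out)
```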


    P/E cycle.png

latency: read < write < erase

latency.png

MLC (Multi-level cell)

  • The floating gate can hold different amounts of electrons to represent different states, so one cell can store more than one bit

  • SLC (single-level cell) compared with MLC:
    – Less complex
    – Faster
    – More reliable
    – Less storage
    – More costly


    MLC example.png
2 bits, 3 intermediate voltages.png
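
Reading a 2-bit cell with 3 intermediate voltages can be sketched as follows. The reference voltages and the assignment of bit patterns to charge levels are assumptions made for illustration; real devices choose their own values and encodings.

```python
import bisect

# Three hypothetical intermediate voltages split the range into four levels.
REFERENCE_VOLTAGES = [1.0, 2.0, 3.0]

# Assumed mapping: level 0 (fewest electrons, lowest threshold) reads as "11",
# level 3 (most electrons, highest threshold) reads as "00".
LEVEL_TO_BITS = ["11", "10", "01", "00"]


def read_mlc_cell(threshold_voltage: float) -> str:
    """Find which of the four levels the cell's threshold voltage falls in."""
    level = bisect.bisect_right(REFERENCE_VOLTAGES, threshold_voltage)
    return LEVEL_TO_BITS[level]


for voltage in (0.5, 1.5, 2.5, 3.5):
    print(voltage, "->", read_mlc_cell(voltage))
```

Because four levels have to fit into the same voltage range, the gaps between them are narrower than with a single-level cell, which fits the list above: SLC is less complex and more reliable but stores less per cell.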

An example of writing a page on an SSD

P/E/P.png
