Programming Pearls Notes (1): Requirements Analysis

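  As the opening chapter, its main point is how to state a problem precisely, which is also the first step of any program development: requirements analysis.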

1  Precise problem statement

1) input

  A file containing at most 10^7 positive integers (each < 10^7); it is an error if any integer occurs twice; no other data is associated with an integer.

2) output

  a sorted list of the input integers in increasing order

3) constraints

  at most roughly 1MB is available in main memory; ample disk storage is available; the run time can be at most several minutes, and a run time of 10 seconds need not be decreased

 

2  Program design

1) merge sort with work files

  read the file once from the input, sort it with the aid of work files that are read and written many times, and then write it once
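  The idea of sorting with work files can be sketched roughly as below. This is only a minimal illustration, not the book's program: the function name `external_merge_sort`, the one-integer-per-line file format, and the `chunk_size` parameter are all assumptions, and it merges every sorted run in a single k-way pass rather than reading and writing the work files many times.

```python
import heapq
import os
import tempfile


def external_merge_sort(input_path, output_path, chunk_size=250_000):
    """Sort a file of one integer per line using sorted work files (runs)."""
    run_paths = []
    with open(input_path) as src:
        while True:
            # Read at most chunk_size integers so one chunk fits in memory.
            chunk = [int(line) for _, line in zip(range(chunk_size), src)]
            if not chunk:
                break
            chunk.sort()
            # Write the sorted chunk to its own temporary work file.
            fd, path = tempfile.mkstemp(suffix=".run", text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{x}\n" for x in chunk)
            run_paths.append(path)

    runs = [open(p) for p in run_paths]
    try:
        with open(output_path, "w") as out:
            # heapq.merge lazily merges the already-sorted runs.
            merged = heapq.merge(*(map(int, r) for r in runs))
            out.writelines(f"{x}\n" for x in merged)
    finally:
        for r in runs:
            r.close()
        for p in run_paths:
            os.remove(p)
```

  With 10^7 input integers and the assumed default `chunk_size`, this would create 40 work files and keep them all open during the merge.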

 

2) 40-pass algorithm

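  if we store each number in 4 bytes (32-bit int), we can store 250,000 numbers in 1MB (1,000,000 bytes / 4 bytes per number).

  we use a program that makes 40 passes over the input file. The first pass reads into memory every integer between 0 and 249,999, sorts them, and writes them to the output; the second pass handles 250,000 ~ 499,999, and so on, until the 40th pass handles 9,750,000 ~ 9,999,999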
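  A rough sketch of the multi-pass idea follows. The name `multipass_sort`, the one-integer-per-line file format, and the parameters `max_value` and `window` are assumptions made for illustration; with the defaults the loop runs 40 times and re-reads the input on every pass.

```python
def multipass_sort(input_path, output_path, max_value=10_000_000, window=250_000):
    """Sort integers in [0, max_value) using one pass over the input per window."""
    with open(output_path, "w") as out:
        for low in range(0, max_value, window):
            high = low + window
            # Re-read the whole input, keeping only values in [low, high).
            with open(input_path) as src:
                kept = [x for x in map(int, src) if low <= x < high]
            # At most `window` distinct values survive, so they fit in memory.
            kept.sort()
            out.writelines(f"{x}\n" for x in kept)
```

  No work files are needed, at the cost of reading the input file 40 times: a time-space tradeoff in the opposite direction from the merge sort.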

 

3) read once without intermediate files

  possible only if we could represent all the integers in the input file in 1MB of main memory

  In fact, even with the bitmap data structure described below, representing 10^7 integers takes 10^7 bits, i.e. 1.25MB (> 1MB)
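  The memory arithmetic behind these designs can be checked in a few lines. The variable names are illustrative, and 1MB is taken here as 1,000,000 bytes (an assumption; with 2^20 bytes the bitmap still does not fit).

```python
MB = 1_000_000            # assumed: 1MB = 10^6 bytes
N = 10_000_000            # number of possible integers (10^7)

ints_per_mb = MB // 4                 # 4-byte integers that fit in 1MB
passes = -(-N // ints_per_mb)         # ceiling division: passes needed
bitmap_bytes = (N + 7) // 8           # one bit per integer, rounded up to bytes

print(ints_per_mb)    # 250000
print(passes)         # 40
print(bitmap_bytes)   # 1250000, i.e. 1.25MB
```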

 

 

3  Implementation sketch

1) bitmap

  we use a bitmap data structure to represent the file as a string of ten million bits in which the ith bit is on if and only if the integer i is in the file

  E.g. store the set {1, 2, 3, 5, 8, 13} in a string of 20 bits

  0  1  1  1  0  1  0  0  1  0  0  0  0  1  0  0  0  0  0  0 

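  In plain words: the values 0 ~ 19 map to the 1st ~ 20th positions of the string; a position holds 1 if its value is present and 0 otherwise. For example, if 12 were present, we would put a 1 in its position (i.e. the 13th position of the string)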
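  The 20-bit example can be reproduced with a small sketch. The helper name `to_bit_string` is hypothetical, and a plain list of 0/1 values stands in for real bits to keep it readable.

```python
def to_bit_string(values, n_bits):
    """Return the bitmap of `values` as a space-separated string of 0s and 1s."""
    bits = [0] * n_bits
    for v in values:
        bits[v] = 1          # value v lives at index v (the (v+1)th position)
    return " ".join(map(str, bits))


print(to_bit_string({1, 2, 3, 5, 8, 13}, 20))
# 0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0
```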

2) pseudocode

// n is the number of bits in the vector (in this case 10,000,000)
// 1) initialize set to empty
for i = [0, n)
  bit[i] = 0

// 2) insert present elements into the set
for each i in the input file 
  bit[i] = 1

// 3) write sorted output
for i = [0, n)
  if bit[i] == 1
    write i on the output file
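  The three phases of the pseudocode might look like this in runnable form. This is one possible sketch, not the book's code: the function name `bitmap_sort` is an assumption, the bits are packed eight per byte in a `bytearray`, and the duplicate check is added here because the problem statement treats a repeated integer as an error.

```python
def bitmap_sort(values, n=10_000_000):
    """Return the distinct integers in `values` (each in [0, n)) in increasing order."""
    # 1) initialize set to empty: n bits packed into (n + 7) // 8 zero bytes
    bits = bytearray((n + 7) // 8)

    # 2) insert present elements into the set
    for i in values:
        if not 0 <= i < n:
            raise ValueError(f"{i} is outside [0, {n})")
        byte, mask = i >> 3, 1 << (i & 7)
        if bits[byte] & mask:
            raise ValueError(f"{i} occurs twice")
        bits[byte] |= mask

    # 3) produce sorted output by scanning the bits in index order
    return [i for i in range(n) if bits[i >> 3] & (1 << (i & 7))]
```

  For example, `bitmap_sort([8, 1, 13, 3, 5, 2], n=20)` returns `[1, 2, 3, 5, 8, 13]`.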

 

4  Principles

  a simple design = the right problem + bitmap data structure + multiple-pass algorithm + time-space tradeoff

 

5  Merge sort

  it works as follows (from Wikipedia)

1)  divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).

2)  repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining.
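  The two steps above correspond to a bottom-up merge sort, which can be sketched as follows. The function names `merge` and `merge_sort` are illustrative; this in-memory version only shows the idea and is not the disk-based variant from section 2.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Sort `items` by repeatedly merging adjacent sorted sublists."""
    # 1) divide into sublists of one element each
    sublists = [[x] for x in items]
    # 2) merge adjacent pairs until only one sublist remains
    while len(sublists) > 1:
        merged = [merge(sublists[k], sublists[k + 1])
                  for k in range(0, len(sublists) - 1, 2)]
        if len(sublists) % 2:
            merged.append(sublists[-1])   # odd one out is carried over unmerged
        sublists = merged
    return sublists[0] if sublists else []
```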

 


 
