The Practice of Programming
Simplicity,
clarity, and
generality form the bedrock of good software.
Chp 1 Style
The purpose of style is to make the code
easy to read for yourself and others.
1.1 Names
A name should be
informative,
concise,
memorable, and
pronounceable if possible.
Much information comes from context and scope; the broader the scope of a variable, the more information should be conveyed by its name.
Use descriptive names for globals, short names for locals.
Programmers are often encouraged to use long variable names regardless of context. That is a mistake:
clarity is often achieved through brevity.
The longer the program, the more important is the choice of good, descriptive, systematic names
Function names should be based on
active verbs, perhaps followed by nouns
Functions that return a boolean value should be named so that the return value is unambiguous.
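A minimal sketch in C (the names are invented): a true return from isoctal() can only mean one thing, while a name like checkoctal() would leave the reader wondering which way the answer goes.

    #include <stdio.h>

    /* A true result clearly means "c is an octal digit"; a name like */
    /* checkoctal() would leave the sense of the return value open.   */
    int isoctal(char c)
    {
        return c >= '0' && c <= '7';
    }

    int main(void)
    {
        printf("%d %d\n", isoctal('7'), isoctal('9'));   /* prints 1 0 */
        return 0;
    }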
1.2 Expressions and Statements
Write expressions as you might speak them aloud. Conditional expressions that include negations are always hard to understand.
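For example (variable names invented), distributing the negations turns a puzzle into something that reads the way it is spoken:

    #include <stdio.h>

    int out_of_range(int block_id, int actblks, int unblocks)
    {
        /* hard to read: each test is negated                          */
        /*   return !(block_id < actblks) || !(block_id >= unblocks);  */

        /* easier: say it positively                                   */
        return block_id >= actblks || block_id < unblocks;
    }

    int main(void)
    {
        printf("%d\n", out_of_range(5, 10, 3));   /* prints 0: in range */
        return 0;
    }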
1.3 Consistency and Idioms
Specific style is much less important than its consistent application. Pick one style, preferably ours, use it consistently, and don't waste time arguing.
The program's consistency is more important than your own, because it makes life easier for those who follow.
A central part of learning any language is developing a familiarity with its idioms.
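A couple of familiar C idioms, written the way most C programmers expect to see them (a small sketch; nothing here is prescribed by the text above):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        /* the idiomatic C loop: start at 0, test with <, increment    */
        double array[10];
        int i, n = 10;
        for (i = 0; i < n; i++)
            array[i] = 1.0;

        /* the idiomatic string copy: allocate, check, copy            */
        const char *buf = "hello";
        char *p = malloc(strlen(buf) + 1);
        if (p == NULL)
            return 1;
        strcpy(p, buf);

        printf("%s %g\n", p, array[0]);
        free(p);
        return 0;
    }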
1.5 Magic Numbers
As a guideline, any number other than 0 or 1 is likely to be magic and should have a name of its own
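A sketch of the usual remedy in C; the particular values 24 and 80 are just an assumed terminal size, not something taken from the text above:

    /* give the magic numbers names instead of scattering 24 and 80    */
    enum {
        MINROW = 1,      /* top edge                                   */
        MINCOL = 1,      /* left edge                                  */
        MAXROW = 24,     /* bottom edge (assumed screen size)          */
        MAXCOL = 80      /* right edge                                 */
    };
    /* usage: for (row = MINROW; row <= MAXROW; row++) ...             */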
1.6 Comments
Comments are meant to help the reader of a program. They do not help by saying things the code already plainly says, or by contradicting the code, or by distracting the reader with elaborate typographical displays.
Comments shouldn't report self-evident information.
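A tiny made-up illustration: the first comment repeats the code, the second says something the code cannot:

    /* invented example */
    void count_line(int *nitems)
    {
        /* bad:     *nitems += 1;      add one to nitems               */
        *nitems += 1;      /* the header line counts as an item too    */
    }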
Global variables have a tendency to crop up intermittently throughout a program; a comment serves as a reminder to be referred to as needed.
When you change code, make sure the comments are still accurate.
Good code needs fewer comments than bad code.
1.7 Why Bother?
The key observation is that good style should be a matter of habit.
If you think about style as you write code originally, and if you take the time to revise and improve it, you will develop good habits.
Chp 2 Algorithms and Data Structures
Even within an intricate program like a compiler or a web browser, most of the data structures are arrays, lists, trees, and hash tables.
Chp 3 Design and Implementation
As the quotation from Brooks's classic book suggests,
the design of the data structures is the central decision in the creation of a program. Once the data structures are laid out, the algorithms tend to fall into place, and the coding is comparatively easy.
The design of a program is rooted in the layout of its data. The data structures don't define every detail, but they do shape the overall solution.
This point of view is oversimplified but not misleading.
As a rule, try to handle irregularities and exceptions and special cases in data. Code is harder to get right, so the control flow should be as simple and regular as possible.
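One way this plays out in C (an invented example): put the irregular cases in a table, and the code that uses the table stays a single plain loop:

    #include <stdio.h>

    /* the special cases live in data; the escape set is illustrative  */
    static const struct {
        char c;
        const char *escaped;
    } escapes[] = {
        { '<', "&lt;"   },
        { '>', "&gt;"   },
        { '&', "&amp;"  },
        { '"', "&quot;" },
    };

    void put_escaped(char c)
    {
        size_t i;
        for (i = 0; i < sizeof(escapes) / sizeof(escapes[0]); i++)
            if (escapes[i].c == c) {
                fputs(escapes[i].escaped, stdout);
                return;
            }
        putchar(c);          /* the regular case needs no special code */
    }

    int main(void)
    {
        const char *s = "a < b & c";
        while (*s)
            put_escaped(*s++);
        putchar('\n');
        return 0;
    }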
The great strengths of C are that it gives the programmer complete control over implementation, and programs written in it tend to be fast. The cost, however, is that the C programmer must do more of the work, allocating and reclaiming memory, creating hash tables and linked lists, and the like.
C is a razor-sharp tool, with which one can create an elegant and efficient program or a bloody mess.
Less clear, however, is how to assess the loss of control and insight when the pile of system-supplied code gets so big that one no longer knows what's going on underneath.
This is the case with the STL version; its performance is unpredictable and there is no easy way to address that.
Chp 4 Interfaces
It's not usually until you've built and used a version of the program that you understand the issues well enough to get the design right.
As a principle,
library routines should not just die when an error occurs; error status should be returned to the caller for appropriate action.
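A hypothetical sketch of the idea; the table, names, and status codes are made up, and the point is only that the routine reports failure and lets the caller decide:

    #include <stdio.h>

    enum { TABLE_OK = 0, TABLE_FULL = -1 };
    enum { TABLE_SIZE = 128 };

    static int table[TABLE_SIZE];
    static int nitems = 0;

    int table_add(int value)
    {
        if (nitems >= TABLE_SIZE)
            return TABLE_FULL;       /* report; let the caller decide  */
        table[nitems++] = value;
        return TABLE_OK;
    }

    int main(void)
    {
        if (table_add(42) != TABLE_OK)
            fprintf(stderr, "table full, value dropped\n");
        return 0;
    }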
Expansion of size and complexity is a typical result of moving from prototype to production.
4.5 Interface Principles
Hide implementation details
Choose a small orthogonal set of primitives.
Having lots of functions may make the library easier to use: whatever one needs is there for the taking. But a large interface is harder to write and maintain, and sheer size may make it hard to learn and use as well.
In the interest of convenience, some interfaces provide multiple ways of doing the same thing,
a tendency that should be resisted.
Narrow interfaces are to be preferred to wide ones, at least until one has strong evidence that more functions are needed.
Don't reach behind the user's back
Do the same thing the same way everywhere.
The basic strxxx functions in the C library are easy to use without documentation because they all behave about the same: data flows from right to left, the same direction as in an assignment statement, and they all return the resulting string
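For instance, with the standard strcpy and strcat:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char dst[32];

        /* data flows right to left, as in an assignment: dst = "hello" */
        strcpy(dst, "hello");

        /* strcat appends to dst and, like strcpy, returns the result,  */
        /* so calls can be nested or chained                            */
        printf("%s\n", strcat(dst, ", world"));
        return 0;
    }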
4.6 Resource Management
Free a resource in the same layer that allocated it.
To avoid problems, it is necessary to write code that is
reentrant
Detect errors at a low level, handle them at a high level.
In most cases, the caller should determine how to handle an error, not the callee.
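A small invented example: the low-level routine merely reports the failure, and the caller, which knows the context, chooses the policy:

    #include <stdio.h>

    /* low level: detect and report, no exit() here                    */
    FILE *open_config(const char *name)
    {
        return fopen(name, "r");     /* NULL on failure                */
    }

    int main(void)
    {
        /* high level: only the caller knows whether this is fatal     */
        FILE *fp = open_config("settings.conf");
        if (fp == NULL) {
            fprintf(stderr, "no settings.conf, using defaults\n");
            return 0;                /* the policy decision lives here */
        }
        /* ... read the settings ... */
        fclose(fp);
        return 0;
    }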
Use exceptions only for exceptional situations.
Chp 5 Debugging
5.1 Debuggers
As a personal choice, we
tend not to use debuggers beyond getting a stack trace or the value of a variable or two.
Debuggers can be arcane and difficult programs, and especially for beginners may provide more confusion than help.
5.2 Good Clues, Easy Bugs
Debugging involves
backwards reasoning, like solving murder mysteries.
Look for familiar patterns
Examine the most recent change
Debug it now, not later
Get a stack trace: The source line number of the failure, often part of a stack trace, is the most useful single piece of debugging information.
Read before typing: Resist the urge to start typing; thinking is a worthwhile alternative
Explain your code to someone else
5.3 No Clues, Hard Bugs
Make the bug reproducible
Divide and conquer
Study the numerology of failures
Display output to localize your search.
Write self-checking code
Write a logfile (see the sketch after this list)
Draw a picture
Use tools
Keep records.
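A rough sketch combining two of the tips above, self-checking code and a logfile; the invariant and the file name are invented:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int data[5] = { 1, 2, 3, 4, 5 };
        int i, sum = 0;
        FILE *log = fopen("debug.log", "a");

        for (i = 0; i < 5; i++)
            sum += data[i];

        if (log != NULL)
            fprintf(log, "sum=%d\n", sum);       /* write a logfile    */

        if (sum != 15) {                         /* self-checking code */
            fprintf(stderr, "consistency check failed: sum=%d\n", sum);
            abort();
        }
        if (log != NULL)
            fclose(log);
        return 0;
    }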
5.4 Last Resorts
These "mental model" bugs are among the hardest to find; the mechanical aid of a debugger is invaluable.
A debugger is a help, since it forces you to go in a different direction, to follow what the program is doing, not what you think it is doing
5.5 Non-reproducible Bugs
The very fact that the behavior is nondeterministic is itself information, however; it means that the error is not likely to be a flaw in your algorithm but that in some way your code is using information that changes each time the program runs.
5.8 Summary
Once a bug has been seen, the first thing to do is to
think hard about the clues it presents.
If there aren't good clues, hard thinking is still the best first step, to be followed by systematic attempts to narrow down the location of the problem.
Chp 6 Testing
Edsger Dijkstra made the famous observation that
testing can demonstrate the presence of bugs, but not their absence.
One way to write bug-free code is to generate it by a program. If some programming task is understood so well that writing the code seems mechanical, then it should be mechanized.
6.1 Test as You Write the Code
Test code at its boundaries
The idea is that
most bugs occur at boundaries. If a piece of code is going to fail, it will likely fail at a boundary. Conversely, if it works at its boundaries, it's likely to work elsewhere too.
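A small example of the idea, using an invented helper: the tests probe the empty string, a lone newline, and a string with no newline at all:

    #include <assert.h>
    #include <string.h>

    void chomp(char *s)              /* remove a trailing newline, if any */
    {
        size_t n = strlen(s);
        if (n > 0 && s[n-1] == '\n')
            s[n-1] = '\0';
    }

    int main(void)
    {
        char empty[]   = "";         /* boundary: nothing there           */
        char only_nl[] = "\n";       /* boundary: newline and nothing else */
        char no_nl[]   = "abc";      /* boundary: no newline to remove    */

        chomp(empty);    assert(strcmp(empty,   "")    == 0);
        chomp(only_nl);  assert(strcmp(only_nl, "")    == 0);
        chomp(no_nl);    assert(strcmp(no_nl,   "abc") == 0);
        return 0;
    }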
Test pre- and post-conditions
Use assertions
Assertions are particularly helpful for validating properties of interfaces because they draw attention to inconsistencies between caller and callee and may even indicate who's at fault.
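A sketch (names invented) of asserting the contract between caller and callee:

    #include <assert.h>
    #include <string.h>

    void copy_string(char *dst, size_t dstsize, const char *src)
    {
        assert(dst != NULL && src != NULL);     /* caller broke the deal */
        assert(strlen(src) < dstsize);          /* destination too small */
        strcpy(dst, src);
    }

    int main(void)
    {
        char buf[16];
        copy_string(buf, sizeof(buf), "ok");
        return 0;
    }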
Program defensively
Check error returns
6.2 Systematic Testing
Test incrementally
Test simple parts first
Know what output to expect
Compare independent implementations.
Measure test coverage
Complete coverage is often quite difficult to achieve
6.3 Test Automation
Automate regression testing
The most basic form of automation is
regression testing, which performs a sequence of tests that compare the new version of something with the previous version.
It's easy to overlook the possibility that the fix broke something else.
Create self-contained tests.
What should you do when you discover an error? If it was not found by an existing test, create a new test that does uncover the problem, and verify the test by running it with the broken version of the code.
Keep a record of bugs, changes, and fixes; it will help you identify old problems and fix new ones
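A minimal self-contained test in the spirit of the above; the function under test and its expected values are invented, and everything the test needs is carried in its own table:

    #include <stdio.h>

    int area(int w, int h) { return w * h; }    /* the code under test */

    int main(void)
    {
        static const struct { int w, h, want; } tests[] = {
            { 0, 5, 0 }, { 1, 1, 1 }, { 3, 4, 12 },
        };
        int i, failed = 0;

        for (i = 0; i < (int)(sizeof(tests) / sizeof(tests[0])); i++)
            if (area(tests[i].w, tests[i].h) != tests[i].want) {
                printf("FAIL: area(%d,%d)\n", tests[i].w, tests[i].h);
                failed++;
            }
        printf("%s\n", failed ? "some tests failed" : "all tests passed");
        return failed != 0;
    }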
6.5 Stress Tests
A high volume of machine-generated input in itself tends to break things, because very large inputs overflow input buffers, arrays, and counters, and such inputs are effective at finding unchecked fixed-size storage within a program.
Some testing is based on explicitly malicious inputs.
Any routine that might receive values from outside the program, directly or indirectly,
should validate its input values before using them.
6.6 Tips for Testing
Test on multiple machines, compilers, and operating systems. Each combination potentially reveals errors that won't be seen on others
6.7 Who Does the Testing?
It is important to test your own code: don't assume that some testing organization or user will find things for you.
The reason for testing is to find bugs, not to declare the program working.
It's hard to test interactive programs, especially if they involve mouse input.
Interactive programs should be controllable from scripts that simulate user behaviors so they can be tested by programs
6.9 Summary
The single most important rule of testing is to
do it!
Chp 7 Performance
The first principle of optimization is
don't!
7.1 A Bottleneck
When solving problems, it's important to ask the right question.
7.2 Timing and Profiling
Knuth's guideline is right: a small part of the program consumes most of the run-time
When a single function is so overwhelmingly the bottleneck, there are only two ways to go: improve the function to use a better algorithm, or eliminate the function altogether by rewriting the surrounding program.
7.3 Strategies for Speed
Use a better algorithm or data structure.
Enable compiler optimizations
One thing to be aware of is that the more aggressively the compiler optimizes, the more likely it is to introduce bugs into the compiled program. After enabling the optimizer, re-run your regression test suite, as you should for any other modification.
Tuning
This is typical of tuning: some things help, some things don't, and one must measure to find out which.
Don't optimize what doesn't matter
Optimizing public services like the spam filter or a library is almost always worthwhile; speeding up test programs is almost never worthwhile.
7.4 Tuning the Code
Bear in mind that good compilers will do some of these for you, and in fact you may impede their efforts by complicating the program.
Collect common subexpressions (see the sketch after this list)
Replace expensive operations by cheap ones
Unroll or eliminate loops.
Cache frequently-used values.
Write a special-purpose allocator.
Buffer input and output
When a C program calls printf, for example, the characters are stored in a buffer but not passed to the operating system until the buffer is full or flushed explicitly. The operating system itself may in turn delay writing the data to disk.
Handle special cases separately
Precompute results, trading space for time
Use approximate values
Rewrite in a lower-level language
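The sketch promised above, for collecting a common subexpression (values invented):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double dx = 3.0, dy = 4.0;

        /* before: the same subexpression computed twice               */
        /*   printf("%g %g\n", sqrt(dx*dx + dy*dy),                    */
        /*                     1.0 / sqrt(dx*dx + dy*dy));             */

        /* after: compute it once and reuse the result                 */
        double dist = sqrt(dx*dx + dy*dy);
        printf("%g %g\n", dist, 1.0 / dist);
        return 0;
    }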
7.5 Space Efficiency
In general, it is best to
store information as text wherever feasible rather than in some binary representation. Text is
portable,
easy to read, and
amenable to processing by all kinds of tools; binary representations have none of these advantages.
7.7 Summary
By the way, it's extremely difficult to do good benchmarking, and it is not unknown for companies to tune their products to show up well on benchmarks, so
it is wise to take all benchmark results with a grain of salt.
Chp 8 Portability
8.1 Language
Binaries don't port well, but source code does.
Program in the mainstream.
It's hard to know just where the mainstream is, but it's easy to recognize constructions that are well outside it.
By definition, all side effects and function calls must be completed at each semicolon, or when a function is called.
Bitfields are so machine-dependent that no one should use them.
8.2 Headers and Libraries
Use standard libraries
8.3 Program Organization
There are two major approaches to portability, which we will call
union and
intersection. The approach we recommend is
intersection: use only those features that exist in all target systems
Union code is by design unportable.
Avoid conditional compilation.
Conditional compilation with #ifdef and similar preprocessor directives is
hard to manage, because information tends to get sprinkled throughout the source.
Mixing compile-time control flow (determined by #ifdef statements) with runtime control flow is much worse, since it is very
difficult to read.
The nastiest problem with conditional compilation is one we haven't mentioned: it is
almost impossible to test
8.4 Isolation
Localize system dependencies in separate files.
When different code is needed for different systems, the differences should be localized in separate files, one file for each system.
Hide system dependencies behind interfaces
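A sketch of the separate-files approach; the interface name msleep() and the file layout are invented for illustration:

    /* sysdep.h -- the interface every target system implements        */
    void msleep(int millisec);       /* pause for the given milliseconds */

    /* sysdep_unix.c -- one file per system holds the dependent code   */
    #include <unistd.h>
    void msleep(int millisec)
    {
        usleep((useconds_t)millisec * 1000);
    }

    /* sysdep_win32.c would define the same msleep() in terms of Sleep() */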
8.5 Data Exchange
Textual data moves readily from one system to another and is the
simplest portable way to exchange arbitrary information between systems.
Good example: SMTP uses
MIME encoding for transferring binary data in mail messages
8.6 Byte Order
Still, the best solution is often to
convert information to text format, which (except for the CRLF problem) is
completely portable.
8.7 Portability and Upgrade
Change the name if you change the specification (behavior).
8.8 Internationalization
Unicode documents are usually translated into a byte-stream encoding called UTF-8 before being sent between programs or over a network.
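A minimal sketch of the UTF-8 rules for code points below U+10000 (surrogates and error handling omitted; a simplification, not a full encoder):

    #include <stdio.h>

    int to_utf8(unsigned int cp, unsigned char *out)
    {
        if (cp < 0x80) {                     /* 1 byte:  0xxxxxxx      */
            out[0] = (unsigned char)cp;
            return 1;
        } else if (cp < 0x800) {             /* 2 bytes: 110xxxxx 10xxxxxx */
            out[0] = (unsigned char)(0xC0 | (cp >> 6));
            out[1] = (unsigned char)(0x80 | (cp & 0x3F));
            return 2;
        } else {                             /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
            out[0] = (unsigned char)(0xE0 | (cp >> 12));
            out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[2] = (unsigned char)(0x80 | (cp & 0x3F));
            return 3;
        }
    }

    int main(void)
    {
        unsigned char buf[4];
        int i, n = to_utf8(0x00E9, buf);     /* U+00E9, e with acute   */
        for (i = 0; i < n; i++)
            printf("%02X ", buf[i]);         /* prints C3 A9           */
        printf("\n");
        return 0;
    }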
Chp 9 Notation
"Perhaps of all the creations of man language is the most astonishing"
The right language can make all the difference in how easy it is to write a program. This is why a practicing programmer's arsenal holds not only general purpose languages like C and its relatives, but also programmable shells, scripting languages, and lots of application-specific languages.
9.1 Formatting Data
There is always a gap between what we want to say to the computer ("solve my problem") and what we are required to say to get a job done.
The narrower this gap, the better.
Good notation makes it easier to say what we want and harder to say the wrong thing by mistake
Little languages are specialized notations for narrow domains. The
printf control sequences are a good example.
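For example, one printf call whose control string packs field width, justification, precision, and a literal percent sign:

    #include <stdio.h>

    int main(void)
    {
        /* %-10s left-justifies in ten columns, %6.2f prints a         */
        /* fixed-width number with two decimals, %% is a literal %     */
        printf("%-10s %6.2f%%\n", "coverage", 97.5);
        return 0;
    }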
9.3 Programmable Tools
Programmable tools often originate in little languages designed for natural expression of solutions to problems within a narrow domain
9.4 Interpreters, Compilers, and Virtual Machines
A virtual machine combines many of the advantages of conventional interpretation and compilation.
Parsers are often written with the aid of an automatic parser generator, also called a compiler-compiler, such as yacc or bison.
Virtual machines are a lovely old idea.
9.5 Programs that Write Programs
The most common program-writing program is a compiler that translates high-level language into machine code.
In spite of the power of program generators, and in spite of the existence of many good examples, the notion is not appreciated as much as it should be and is infrequently used by individual programmers.
The large-scale version of self-documenting code is
literate programming, which integrates a program and its documentation so one process prints it in a natural order for reading, and another arranges it in the right order for compilation.
As tasks become so focused and well understood that programming them feels almost mechanical, it may be time to create a notation that naturally expresses the tasks and a language that implements it.
Regular expressions are a good example.
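A small example using the POSIX regex routines (assuming regex.h is available): the pattern is the notation, and regcomp and regexec are the little language's implementation:

    #include <regex.h>
    #include <stdio.h>

    int main(void)
    {
        regex_t re;
        int status;

        if (regcomp(&re, "^[0-9]+$", REG_EXTENDED | REG_NOSUB) != 0)
            return 1;
        status = regexec(&re, "12345", 0, NULL, 0);
        printf("%s\n", status == 0 ? "match" : "no match");
        regfree(&re);
        return 0;
    }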
Epilogue
Simplicity and clarity are first and most important