This Ape Is Hard to Control - Two star programming

Two star programming

2013-01-08 • C, Torvalds, Algorithms • 46 Comments

A few weeks ago Linus Torvalds answered some questions on slashdot. All his responses make good reading but one in particular caught my eye. Asked to describe his favourite kernel hack, Torvalds grumbles he rarely looks at code these days — unless it’s to sort out someone else’s mess. He then pauses to admit he’s proud of the kernel’s fiendishly cunning filename lookup cache before continuing to moan about incompetence.

At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I’ve seen too many people who delete a singly-linked list entry by keeping track of the prev entry, and then to delete the entry, doing something like

if (prev)
    prev->next = entry->next;
else
    list_head = entry->next;

and whenever I see code like that, I just go “This person doesn’t understand pointers”. And it’s sadly quite common.

People who understand pointers just use a “pointer to the entry pointer”, and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a *pp = entry->next.
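To make the quoted idea concrete, here is a minimal sketch of removing one known entry through a pointer to the entry pointer. The struct entry type and the remove_entry name are assumptions made for illustration; the quote shows neither. The loop and the not-found check are ordinary traversal logic. The point is that the unlink itself is one statement, with no special case for the head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal node type; the quote does not show one. */
struct entry {
    int value;
    struct entry *next;
};

/* Unlink `target` from the list whose head pointer is stored at *list_head.
 * Returns 1 if the entry was found and unlinked, 0 otherwise.
 * The entry is only unlinked, not freed. */
int remove_entry(struct entry **list_head, struct entry *target)
{
    assert(list_head != NULL);

    /* pp always points at the pointer that refers to the current entry:
     * first the head pointer itself, then each node's `next` field. */
    struct entry **pp = list_head;

    while (*pp != NULL && *pp != target)
        pp = &(*pp)->next;

    if (*pp == NULL)
        return 0;           /* target is not in the list */

    *pp = target->next;     /* same statement for head and non-head entries */
    return 1;
}
```

A caller holding struct entry *head would write remove_entry(&head, some_entry); if some_entry happens to be the first node, head itself is updated through pp.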

Well I thought I understood pointers but, sad to say, if asked to implement a list removal function I too would have kept track of the previous list node. Here’s a sketch of the code:

This person doesn’t understand pointers
#include <stdbool.h>   // bool
#include <stdlib.h>    // free, NULL

typedef struct node
{
    struct node * next;
    /* .... payload fields elided */
} node;

typedef bool (* remove_fn)(node const * v);

// Remove all nodes from the supplied list for which the 
// supplied remove function returns true.
// Returns the new head of the list.
node * remove_if(node * head, remove_fn rm)
{
    for (node * prev = NULL, * curr = head; curr != NULL; )
    {
        node * const next = curr->next;
        if (rm(curr))
        {
            if (prev)
                prev->next = next;
            else
                head = next;
            free(curr);
        }
        else
            prev = curr;
        curr = next;
    }
    return head;
}

The linked list is a simple but perfectly-formed structure built from nothing more than a pointer-per-node and a sentinel value, but the code to modify such lists can be subtle. No wonder linked lists feature in so many interview questions!

The subtlety in the implementation shown above is the conditional required to handle any nodes removed from the head of the list.

Now let’s look at the implementation Linus Torvalds had in mind. In this case we pass in a pointer to the list head, and the list traversal and modification is done using a pointer to the next pointers.

Two star programming
void remove_if(node ** head, remove_fn rm)
{
    for (node** curr = head; *curr; )
    {
        node * entry = *curr;
        if (rm(entry))
        {
            *curr = entry->next;
            free(entry);
        }
        else
            curr = &entry->next;
    }
}

Much better! The key insight is that the links in a linked list are pointers and so pointers to pointers are the prime candidates for modifying such a list.
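The same insight applies to other list modifications, not only removal. As a hedged illustration, the sketch below inserts into an ascending list by walking a pointer to the links. The item type, its key field, and the insert_sorted name are hypothetical, since the article elides the node payload and shows no insertion code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node with an int key; the article's node elides its payload. */
typedef struct item {
    struct item *next;
    int key;
} item;

/* Insert `new_item` into an ascending list, keeping it sorted. */
void insert_sorted(item **head, item *new_item)
{
    assert(head != NULL && new_item != NULL);

    item **link = head;

    /* Skip every link whose target sorts before the new item. */
    while (*link != NULL && (*link)->key < new_item->key)
        link = &(*link)->next;

    new_item->next = *link;  /* NULL when appending at the tail */
    *link = new_item;        /* works for head, middle, and tail alike */
}
```

Because link starts at the head pointer and then steps through each next field, inserting into an empty list, before the first node, or after the last node needs no separate branch.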

§

The improved version of remove_if() is an example of two star programming: the doubled-up asterisks indicate two levels of indirection. A third star would be one too many.


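Reflection: code that Linus tosses off without effort takes me half a day to work out why it works.

I think there are two reasons.

First: Linus understands pointers more deeply. (I tried to understand ** by looking at the assembly, and it turned out not to be hard.)

Second: Linus has reviewed, or written, far more code than I have. (I am nowhere near as clever as Linus, yet I am lazier than he is.)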


*, ** and variable names

node tmp;

node *p = &tmp;

node **pp = &p;

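Whether it is declared with * or **, a pointer variable on a 32-bit system is 4 bytes; here each one lives on the stack.

After compilation, at run time, tmp becomes a region of the stack (assume its start address is 0xffffa0b0) (esp + 0x18)

p is a 4-byte slot on the stack whose contents are the address of tmp (so p starts at &p == 0xffffa0ac, with contents: 0xb0 0xa0 0xff 0xff) (esp + 0x14)

pp is a 4-byte slot on the stack whose contents are the address of p (so pp starts at &pp == 0xffffa0a8, with contents: 0xac 0xa0 0xff 0xff) (esp + 0x10)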

mov 0x10(%esp), %eax    # eax = pp = 0xffffa0ac (the address of p)

mov (%eax), %eax        # eax = *pp = p = 0xffffa0b0 (the address of tmp)


Assigning to pp modifies the contents at address 0xffffa0a8

Assigning to *pp modifies the contents at address 0xffffa0ac
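The difference between the two assignments can also be checked in C, without reading assembly. The sketch below is an assumption-laden illustration: the check_star_assignments name and the extra variables other and q are hypothetical, and real stack addresses will differ from the example addresses used above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int i;
    struct node *next;
} node;

/* Returns 1 if both assignments behave as described, 0 otherwise. */
int check_star_assignments(void)
{
    node tmp = {1, NULL};
    node other = {2, NULL};
    node *p = &tmp;
    node *q = NULL;
    node **pp = &p;

    assert(*pp == &tmp);    /* reading *pp yields the contents of p */

    *pp = &other;           /* writes into p's storage: p now points at other */
    int wrote_through = (p == &other);

    pp = &q;                /* writes into pp's own storage: p is left alone */
    int p_untouched = (p == &other) && (*pp == NULL);

    return wrote_through && p_untouched;
}
```

This is the property remove_if() relies on: *curr = entry->next rewrites whichever pointer curr currently refers to (the head or some node's next field), while curr = &entry->next only moves curr itself.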


#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {  
	int i;
	struct node *next;
} node;  
 
typedef bool (*remove_fn)(node const *v);
 
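static int glo_cnt = 0;

// Test predicate: ignores the node and returns true on every second call,
// so the 2nd, 4th, ... nodes visited are removed.
bool cbfun(node const *v)
{
	(void)v;	// unused
	glo_cnt++;
	return glo_cnt % 2 == 0;
}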

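// Unlink every node for which rm() returns true. Unlike the article's
// version, nothing is freed: main() keeps the nodes in a stack array.
void remove_if(node **head, remove_fn rm)
{
	node **curr;
	for (curr = head; *curr; ) {
		node * entry = *curr;
		if (rm(entry)) {
			*curr = entry->next;	// unlink entry; curr stays on this link
		} else {
			curr = &entry->next;	// advance to the next link
		}
	}
}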

int main(void)
{
	struct node tmp[3] = {{1, &tmp[1]}, {2, &tmp[2]}, {3, NULL}};

	struct node *pnode = tmp;
	remove_if(&pnode, cbfun);

	// Walk from pnode, not tmp: remove_if() may have changed the head.
	for (; pnode; pnode = pnode->next) {
		printf("%d ", pnode->i);
	}
	printf("\n");
	return 0;
}


Original article: http://wordaligned.org/articles/two-star-programming
