MIT 6.S081 Lab 1: Unix utilities

Lab objective

This lab will familiarize you with xv6 and its system calls. The task is to implement several Unix utilities and, in doing so, get comfortable with the system call interface.

find[moderate]

Write a simple version of the UNIX find program: find all the files in a directory tree with a specific name. Your solution should be in the file user/find.c.

Some hints:
Look at user/ls.c to see how to read directories.
Use recursion to allow find to descend into sub-directories.
Don’t recurse into “.” and “..”.
Changes to the file system persist across runs of qemu; to get a clean file system run make clean and then make qemu.
You’ll need to use C strings. Have a look at K&R (the C book), for example Section 5.5.
Note that == does not compare strings like in Python. Use strcmp() instead.
Add the program to UPROGS in Makefile.
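
To make the strcmp() hint concrete, here is a minimal sketch in plain C. The helper name same_name is made up for this illustration and is not part of xv6.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper (not part of xv6): returns 1 when the two C strings
// hold the same characters, 0 otherwise.
int same_name(const char *a, const char *b) {
	assert(a != 0 && b != 0);
	// a == b would only test whether both pointers hold the same address;
	// strcmp() walks both strings and returns 0 when every character matches.
	return strcmp(a, b) == 0;
}
```

Two separate buffers that both contain "find" have different addresses, so == reports them as different, while same_name() reports them as equal.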

Background

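  • File types: T_DIR, T_FILE, T_DEVICE
  • Disk layout: [boot block | super block | log | inode blocks | free bit map | data blocks]
    The super block describes the disk layout, the inode blocks store the inodes, and the data blocks store file contents.
  • Each inode stores the metadata of one file and has a number. From the inode number the system can quickly compute the inode's offset within the inode area and read the inode from there; it then uses the block addresses recorded in the inode to read the file's contents from the data blocks.
  • struct stat holds information about a file. A full stat structure, as found on other Unix systems, includes the fields we usually care about most, such as file size and creation, access, and modification times; the xv6 version shown below is smaller. A commonly used function is stat(char *file, struct stat *st), which writes the information about file into st.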
struct stat {
  int dev;     // File system's disk device (ID of the device holding the file)
  uint ino;    // Inode number
  short type;  // Type of file
  short nlink; // Number of hard links to file
  uint64 size; // Size of file in bytes
};
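
The inode lookup described above (inode number to position in the inode area) comes down to one division and one remainder. The following is a sketch under stated assumptions: the block size of 1024 bytes and the on-disk inode size of 64 bytes are values I believe match xv6-riscv but should be checked against kernel/fs.h, and the names inode_loc and locate_inode are hypothetical.

```c
#include <assert.h>

// Assumed values for illustration; check kernel/fs.h for the real ones.
#define BSIZE 1024                // bytes per disk block
#define INODE_SIZE 64             // bytes per on-disk inode (assumption)
#define IPB (BSIZE / INODE_SIZE)  // inodes per block

// Hypothetical result type: where an inode lives on disk.
struct inode_loc {
	unsigned block;   // disk block number holding the inode
	unsigned offset;  // byte offset of the inode inside that block
};

// inodestart is the first block of the inode area, as recorded in the super block.
struct inode_loc locate_inode(unsigned inum, unsigned inodestart) {
	struct inode_loc loc;
	// Blocks 0 and 1 are the boot block and the super block, so the inode
	// area cannot start before block 2.
	assert(inodestart >= 2);
	loc.block = inodestart + inum / IPB;
	loc.offset = (inum % IPB) * INODE_SIZE;
	return loc;
}
```

With these assumed sizes a block holds 16 inodes, so inode 17 is the second inode of the second inode block.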

Code

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/fs.h"

void find(char *path, char *target) {
	char buf[512], *p;
	int fd;
	struct dirent de;
	struct stat st;
	
	if((fd = open(path, 0)) < 0) {
		fprintf(2, "find: cannot open file %s\n", path);
		return;
	}
	if(fstat(fd, &st) < 0) {
		fprintf(2, "find: cannot stat file %s\n", path);
		close(fd);
		return;
	}
	switch(st.type) {
		case T_FILE:
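			// Compare the tail of path with target ("/name"). The length check keeps the
			// pointer inside path when path is shorter than target.
			if(strlen(path) >= strlen(target) && strcmp(path + strlen(path) - strlen(target), target) == 0) {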
				printf("%s\n", path);
			}
			break;
		case T_DIR:
			if(strlen(path) + 1 + DIRSIZ + 1 > sizeof(buf)){
				fprintf(2, "find: path too long\n");
				break;
			}
			strcpy(buf, path);
			p = buf + strlen(buf);
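			*p++ = '/'; // append '/' so each entry name can be placed after it
			while(read(fd, &de, sizeof(de)) == sizeof(de)) {
				if(de.inum == 0) continue; // unused directory slot
				memmove(p, de.name, DIRSIZ); // copy the entry name to position p, right after the '/'
				p[DIRSIZ] = 0;   // terminate the string (de.name has no NUL when it fills all DIRSIZ bytes)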
				if(stat(buf, &st) < 0) {
					printf("find: cannot stat %s\n", buf);
					continue;
				}
				// skip "." and ".." so the recursion never revisits the current or parent directory
				if(strcmp(buf + strlen(buf) - 2,"/.") != 0 && strcmp(buf + strlen(buf) - 3,"/..") != 0) {   
					find(buf, target);
				}
			}
			break;
	}
	close(fd);
	
}

int main(int argc, char *argv[]) {
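	if(argc < 3){
		fprintf(2, "usage: find <path> <name>\n");
		exit(1);
	}
	// Prepend '/' to the name so that only a complete last path component matches
	// (searching for "b" must not match "./ab").
	char target[512];
	target[0] = '/';
	strcpy(target + 1, argv[2]);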
	find(argv[1], target);
	exit(0);
}
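
The file-name test in find() compares the tail of the path with "/" followed by the name. As a minimal standalone sketch of that idea in plain C (the helper name path_ends_with is hypothetical and does not appear in the code above):

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper: returns 1 when path ends with suffix, 0 otherwise.
// find() performs the equivalent test with suffix = "/name".
int path_ends_with(const char *path, const char *suffix) {
	assert(path != 0 && suffix != 0);
	size_t plen = strlen(path);
	size_t slen = strlen(suffix);
	if (slen > plen)
		return 0; // a shorter path cannot end with the suffix
	// Compare only the last slen characters of path.
	return strcmp(path + (plen - slen), suffix) == 0;
}
```

Including the leading '/' in the suffix is what keeps a search for b from matching ./a/ab, and the length check keeps the pointer arithmetic inside the string when the path is shorter than the suffix.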

Results

...
xv6 kernel is booting

hart 2 starting
hart 1 starting
init: starting sh
$ echo > b
$ mkdir a
$ echo > a/b
$ find . b
./b
./a/b

xargs[moderate]

Write a simple version of the UNIX xargs program: read lines from the standard input and run a command for each line, supplying the line as arguments to the command. Your solution should be in the file user/xargs.c.

Some hints:
Use fork and exec to invoke the command on each line of input. Use wait in the parent to wait for the child to complete the command.
To read individual lines of input, read a character at a time until a newline (‘\n’) appears.
kernel/param.h declares MAXARG, which may be useful if you need to declare an argv array.
Add the program to UPROGS in Makefile.
Changes to the file system persist across runs of qemu; to get a clean file system run make clean and then make qemu.
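
The fork/exec/wait pattern from the first hint can be tried outside xv6 as well. The sketch below is an assumption-laden host-side version: it uses the POSIX calls execvp() and waitpid() rather than xv6's exec() and wait(), and the helper name run_and_wait is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] with the arguments in argv in a child
// process, wait for it, and return its exit status. Returns -1 if fork or
// waitpid fails or the child did not exit normally.
int run_and_wait(char *const argv[]) {
	assert(argv != 0 && argv[0] != 0);
	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv); // replaces the child on success
		_exit(127);            // only reached when exec fails
	}
	int status;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Waiting right after each fork runs the commands one at a time, which keeps the output of different input lines from interleaving.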

Background

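  • Commands such as cat, sort, uniq, and grep work with the pipe symbol | because they can read the text to process from standard input. Other commands, such as rm and kill, do not read their arguments from standard input and only accept them on the command line (rm must be followed by the files or directories to delete, kill by the ID of the process to kill, and so on). This is where xargs comes in: it turns standard input into command-line arguments for such commands, and it is usually used together with the pipe symbol |.
  • exec(char *file, char *argv[]) // load the program in file and run it with the arguments argv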

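The main question is how to take the standard input produced by the command before the pipe | (possibly several lines) and place it into the argument array of the program named by argv[1], that is, the command that follows xargs.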
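
One way to picture that argument array: start with the fixed arguments given after xargs, then append the words of one input line, and finish with a null pointer. The following is a minimal sketch in plain C, not the solution itself; build_argv and MAX_ARGS are hypothetical names (xv6 declares its own limit, MAXARG, in kernel/param.h).

```c
#include <assert.h>

#define MAX_ARGS 32 // assumed capacity for this sketch, including the final null

// Hypothetical helper: copy the fixed arguments into out, then split line in
// place on spaces and append each word. Returns the number of arguments, or
// -1 if out would overflow.
int build_argv(char *fixed[], int nfixed, char *line, char *out[]) {
	assert(line != 0 && out != 0);
	int n = 0;
	for (int i = 0; i < nfixed; i++) {
		if (n >= MAX_ARGS - 1) return -1;
		out[n++] = fixed[i];
	}
	char *p = line;
	while (*p != '\0') {
		while (*p == ' ') p++;               // skip runs of spaces
		if (*p == '\0') break;
		if (n >= MAX_ARGS - 1) return -1;
		out[n++] = p;                        // a word starts here
		while (*p != '\0' && *p != ' ') p++; // find the end of the word
		if (*p == ' ') *p++ = '\0';          // terminate the word in place
	}
	out[n] = 0; // exec expects a null-terminated pointer array
	return n;
}
```

Splitting in place means the words reuse the line buffer, so no extra memory is needed for the argument strings.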

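#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

// Run a program with the given argument list.
void run(char *program, char **args) {
	if(fork() == 0) { // child: replace itself with the program
		exec(program, args);
		// exec only returns on failure
		fprintf(2, "xargs: exec %s failed\n", program);
		exit(1);
	}
	return; // parent returns immediately; main() waits for the children at the end
}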

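int main(int argc, char *argv[]){
	char buf[2048]; // memory pool for the characters read from stdin
	char *p = buf, *last_p = buf; // end and start of the argument currently being read
	char *argsbuf[128]; // full argument list: pointers to the arguments from argv followed by those read from stdin
	char **args = argsbuf; // after the loop below, points to the slot for the first argument read from stdin
	if(argc < 2){
		fprintf(2, "usage: xargs command [args...]\n");
		exit(1);
	}
	for(int i=1;i<argc;i++) {
		// append the arguments supplied via argv to the final argument list
		*args = argv[i];
		args++;
	}
	char **pa = args; // next free slot for an argument read from stdin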
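	while(read(0, p, 1) > 0) {
		if(*p == ' ' || *p == '\n') {
			// One argument is complete (arguments are separated by spaces, so for
			// `echo hello world`, hello and world are separate arguments).
			char sep = *p; // remember the separator, because it is overwritten just below
			*p = '\0';	// replace the separator with \0 so the strings in the memory pool
						// can be used directly as arguments without extra allocation
			*(pa++) = last_p; // store the start of this argument
			last_p = p+1;

			if(sep == '\n') {
				// One line is complete.
				*pa = 0; // a null pointer marks the end of the argument list
				run(argv[1], argsbuf); // run the command for this line
				pa = args; // reset the pointer, ready to read the next line
			}
		}
		p++;
	}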
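	if(p != last_p) { // the input ended without a trailing newline
		// finish the last argument
		*p = '\0';
		*(pa++) = last_p;
	}
	if(pa != args) { // arguments are still pending from the last line
		*pa = 0; // a null pointer marks the end of the argument list
		// run the command for the last line
		run(argv[1], argsbuf);
	}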
	while(wait(0) != -1) {} // wait for all children to finish; each wait(0) reaps one child
	exit(0);
}
