多线程实现单词统计c语言程序,多线程统计多个文件的单词数目---C++0x多线程使用示例...

对多线程经验不多,仅提供一些个人看法,如有错误请指正。

1.C++0X 多线程简介

C++0x STL提供了对多线程的支持就不用再去选择跨平台的多线程库了,用标准的吧:)

看了一下BOOST和当前STL的接口几乎完全一致:)也就是说用boost thread写的程序应该把例如boost::thread, boost::unique_lock ...等等的地方换成std::thread, std::unique_lock...就OK了,个人觉得,不过我还没用过boost thread.所以说熟悉pthread的应该能很快上手,而熟悉boost thread应就可以直接上手了~

但是现在GCC还不支持thread local变量。不支持原子操作。基本的mutex,condional 变量等等都支持了。

关于C++0X thread的入门介绍请参考:

该文有中文的翻译

C++0x 概览: 多线程(1) - VC++ - 逆风者

Multithreading in C++0x

[译]放弃原来的线程API,采用新的C++线程

另外使用C++0X的话大家最好了解一下所谓的右值引用和move语义,个人觉得这两个概念很有用处的,算是突破性的概念了,当然不清楚也可以用thread.参考刘未鹏写的,《C++0x漫谈》系列之:右值引用(或“move语意与完美转发”)(上),既可

http://blog.csdn.net/pongba/archive/2007/07/10/1684519.aspx

或者看A Brief Introduction to Rvalue References

2. 如何编译运行

我只在linux下用GCC4.4.2试过。能够调试成功。

首先你要确保你的GCC编译时是thread enable的一般默认安装就是enable的了。可以用gcc -v查看一下:

allen:~/study/unix_system/CH14$ gcc -v

使用内建 specs。

目标:i686-pc-linux-gnu

配置为:../configure

线程模型:posix

gcc 版本 4.4.2 (GCC)

如上显示线程模型posix,表示就是可以用多线程了否则显示为空。注意本质上还是posix多线程所以你要加编译参数-pthread

同时标明使用 c++0x特性, -std=c++0x 或者-std=gnu0x

例如:

g++ -g -o wordcount wordcount.cc -std=c++0x -pthread

3.一个完整的小程序示例,多线程文本单词数目统计

首先说一下程序目的,多个线程每个统计一个文本的单词数目,作为实验程序,这里演示3个线程统计的例子。这个例子会涉及到线程的同步,因为要每个线程统计完之后向主线程报告自己已经完成统计并且将信息写到互斥区的mail_box中。而主线程会将信息从mail_box中读出,然后通知下一个线程可以写mail_box.就是说当3个统计单词的线程看为写者,主线程看为读者,一个时间只能有一个线程访问mail_box读或者写。

这个问题来源是《unix 编程实践教程》p14.5.2使用条件变量编写程序 一节的例子 twocount4.c的例子改写的,原文中使用pthread, 但是它是有问题的,因为书中的例子是2个线程负责统计,如果扩充到3个会死锁,主要由于书中的例子只用了一个条件变量,并没有区分读者通知和写者通知。我以前写过一篇文章分析这个情况多线程同步问题 条件变量和信号量  http://www.cnblogs.com/rocketfan/archive/2009/07/24/1530477.html

当时我对条件变量理解的还不是很清楚,所以用信号量写了一个正确的版本。其实mutex+条件变量可以解决一切用信号量能解决的问题。不过我还是不是很清楚为什么c++0x STL不提供信号量,boost thread也不提供而是存在在boost interprocess库中,好像大家用多线程也基本不用信号量,为什么呢,谁比较清楚给我详细解释下,谢谢。确实条件变量更强大,但我始终觉得用信号量还是更清晰一些的,操作系统课本也基本都用信号量举例。比如操作系统内核设与设计原理一书。例如这个问题用信号量可以表示如下,当然用条件变量类似:

write = 0

read = 0

mail = null

//server  main thread

v(read) //to let one client sub thread can write at first since mail == null

for (i =0 ; i < sub threads num; i++)

p(write)

read mail

mail = null

v(read)

//clinent  sub threads

p(read)

write mail

v(write)

用C++0X,mutex + conditional variable 的实现,如下, 有很详细的注释的,如有问题还请指正:

代码

/**

*  ==============================================================================

*

*          \file   wordcount.cc

*

*        \author

*

*          \date   2009-12-02 09:55:25.197674

*

*   Description:   演示c++0x,线程建立结束,mutex,条件变量的使用,注意为了简单名字未加

*                  std::

*  ==============================================================================

*/

#include #include #include #include using namespace std;

mutex m;

condition_variable reader_cond;   //cond notify reader

condition_variable writer_cond;   //cond2 notify writer

struct MailBox {

string file_name;  //which file

int count;         //how many words

int tid;           //which thread write

};

MailBox mail_box;

class WordCounter

{

private:

int tid_;

string infile_name_;

public:

WordCounter(int tid, string infile_name):

tid_(tid),infile_name_(infile_name){}

void operator()() {

int c, prevc = '\0';

int count = 0;

//统计文本的单词数目

ifstream input_file(infile_name_.c_str(),ios::binary);

input_file.unsetf(ios::skipws); // 要接受空格符

istreambuf_iteratoreos;               // end-of-range iterator

istreambuf_iteratoriit (input_file);

for (; iit!=eos; ++iit) {

c = *iit;

if ( !isalnum(c) && isalnum(prevc) )

count++;

prevc = c;

}

input_file.close();

cout << "COUNT " << tid_ << " waiting to get lock" << endl;

unique_locklk(m);

cout << "COUNT " << tid_ << " have lock, store data" << endl;

//如果mail_box不是空的,注意mail_box是互斥区所保护的

while (!mail_box.file_name.empty()) {

cout << "COUNT " << tid_ << " oops.. mail box not empty, wait for signal" << endl;

//等待读者读完并通知,写着释放锁控制

writer_cond.wait(lk);  //注意wait只接受unique_lock不接受lock_guard

//重新锁住互斥区

}

cout << "COUNT " << tid_ << " OK, I can write mail" << endl;

mail_box.file_name = infile_name_;

mail_box.count = count;

mail_box.tid = tid_;

cout << "COUNT " << tid_ << " rasing flag" << endl;

//通知读者我已经写完你可以读了

reader_cond.notify_one();

//注意在这里我还是拥有锁的,所以尽管读者已经借到可读通知了,但是它还是会被卡到互斥区外的

cout << "COUNT " << tid_ << " Finished writting.Words are " << count << " for file "

<< infile_name_ << endl;

cout << "COUNT " << tid_ << " will unlock" << endl;

//在结束的时候lk的析构函数会自动调用解锁,注意这之后不一定是读者能够获得互斥区,有可能

//被其它写者抢先抢到但是没关系,它们发现mail_box非空,会wait,并且释放互斥区的,所以可能

//出现下面的情况

//COUNT 0 will unlock

//COUNT 1 have lock, store data

//COUNT 1 oops.. mail box not empty, wait for signal

//COUNT 2 have lock, store data

//COUNT 2 oops.. mail box not empty, wait for signal

//MAIN: Wow! flag was raised, I have the lock

//11 1.log 0

//Main has finished reading

}

};

void read_mail(char *argv[])

{

//读者要先锁住互斥区

unique_locklk(m);

cout << "Main locking the box" << endl;

//建立3个线程,注意这里采用的是建立空线程,

//然后利用临时变量的move,线程是不可拷贝但是可move的

thread t[3];

for (int i = 0; i < 3; i++)

t[i] = thread(WordCounter(i, argv[i + 1]));

int total_words = 0;

//读者要等待3个写者统计完文本信息

for(int reports_in = 0; reports_in < 3; reports_in++) {

cout << "MAIN: waiting for flag to go up" << endl;

//等待一个写者写完mail,同时释放指定的锁m

reader_cond.wait(lk);

//收到写者写完的信号,到达这里,读者可以读了,同时锁住互斥区

cout << "MAIN: Wow! flag was raised, I have the lock" << endl;

cout << mail_box.count << " " << mail_box.file_name << " " << mail_box.tid << endl;

total_words += mail_box.count;

//读者读完后将mail_box的文件名置空,标明已经读完

mail_box.file_name.clear();

cout << "Main has finished reading" << endl;

//通知写者我已经读完mail box

writer_cond.notify_one();

}

for (int i =0; i < 3; i++) {

t[i].join();

}

cout << total_words << ": total words" << endl;

}

int main(int argc, char *argv[])

{

if ( argc != 4 ){

cout << "usage: " << argv[0] << " file1 file2 file3" << endl;

exit(1);

}

read_mail(argv);

}

运行结果

Main locking the box

COUNT 0 waiting to get lock

COUNT 1 waiting to get lock

COUNT 2 waiting to get lock

MAIN: waiting for flag to go up

COUNT 0 have lock, store data

COUNT 0 OK, I can write mail

COUNT 0 rasing flag

COUNT 0 Finished writting.Words are 11 for file 1.log

COUNT 0 will unlock

COUNT 1 have lock, store data

COUNT 1 oops.. mail box not empty, wait for signal

COUNT 2 have lock, store data

COUNT 2 oops.. mail box not empty, wait for signal

MAIN: Wow! flag was raised, I have the lock

11 1.log 0

Main has finished reading

MAIN: waiting for flag to go up

COUNT 1 OK, I can write mail

COUNT 1 rasing flag

COUNT 1 Finished writting.Words are 128 for file twordcount1.c

COUNT 1 will unlock

MAIN: Wow! flag was raised, I have the lock

128 twordcount1.c 1

Main has finished reading

MAIN: waiting for flag to go up

COUNT 2 OK, I can write mail

COUNT 2 rasing flag

COUNT 2 Finished writting.Words are 382 for file twordcount4_semaphore.c

COUNT 2 will unlock

MAIN: Wow! flag was raised, I have the lock

382 twordcount4_semaphore.c 2

Main has finished reading

521: total words

你可能感兴趣的:(多线程实现单词统计c语言程序)