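Following the previous simple demo, I spent a few more days building a simple multithreaded, resumable ("breakpoint continue") downloader. It relies mainly on the same knowledge as the previous demo: how to request a specific range of data over HTTP. Because the download is multithreaded, the different parts of the file logically have to be written at the same time, so this version adds fetching and writing the data in blocks, and recording the state of each block.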
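Implementation approach:
1. Following Thunder (Xunlei), use two temporary files: one is the file being downloaded, with a temporary suffix added, and the other records the download state.
2. At startup, check whether the target file exists. If it does, treat the download as already finished. Otherwise open the temporary download file, creating it if it does not exist. A newly created file must be resized to the size of the target file, and that size is obtained with the HTTP HEAD verb. Then open the file that records the download state in the same way.
3. If the download-state file was newly created, treat this as the first download and split the file into blocks by size, recording the start position and size of each block. If the file already existed, read that information from it instead.
4. Create the corresponding number of download threads in the suspended state, plus one monitoring thread (also suspended) that watches the download progress. Then start the threads and wait for all download threads to finish.
5. Once all download threads have finished, check whether the download is complete. If it is not, write the current download state to the state file. If it is, delete the temporary state file and rename the temporary download file, which marks the download as done.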
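The block splitting in step 3 and the per-block request can be sketched in portable C++ as follows. This is only a minimal illustration, not the code from the demo: `Block`, `SplitIntoBlocks`, and `MakeRangeHeader` are hypothetical names introduced here, and the sketch assumes that the last block simply absorbs whatever remainder the integer division leaves.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One contiguous piece of the file (hypothetical helper type for this sketch).
struct Block {
    long long start;  // offset of the first byte still to download
    long long size;   // number of bytes still to download
};

// Split fileSize bytes into blockCount blocks.
// Every block gets fileSize / blockCount bytes; the last one also takes the remainder.
std::vector<Block> SplitIntoBlocks(long long fileSize, int blockCount) {
    assert(fileSize >= 0 && blockCount > 0);
    std::vector<Block> blocks;
    const long long base = fileSize / blockCount;
    for (int i = 0; i < blockCount; ++i) {
        const long long start = base * i;
        const long long size = (i == blockCount - 1) ? fileSize - start : base;
        blocks.push_back(Block{start, size});
    }
    return blocks;
}

// Build the HTTP Range header for one block; both ends of the range are inclusive.
std::string MakeRangeHeader(const Block& block) {
    assert(block.size > 0);
    return "Range: bytes=" + std::to_string(block.start) + "-" +
           std::to_string(block.start + block.size - 1) + "\r\n";
}
```

For example, splitting a 105-byte file into 10 blocks gives nine blocks of 10 bytes and a final block of 15 bytes, and the second block is requested with `Range: bytes=10-19`.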
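There are quite a few things to watch out for in this process, including the order in which the threads are started, the fact that every thread writes to the same file, and how the monitoring thread is stopped. The code is commented in some detail, although, as before, in my rather clumsy English.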
/*
Project:  Multi Thread Breakpoint Continue Download Demo (MltBtkptCtnDldDemo)
Filename: main.cpp
Function: Experiment with multithreaded, resumable downloading.
          Given the URL and file name, first get the file size, then open the temporary
          file for that name (<name>.td; if it does not exist, an empty file of the full
          size is created) and the matching config file (<name>.td.cfg; if it does not
          exist the file is split into blocks afresh, otherwise the block data is loaded
          from it).
          N threads are created; each one receives the data of one block and writes it
          to the corresponding position in the file.
          When the download is interrupted, the receive state of every block (position,
          remaining size and so on) is written to the config file.
          Possible optimizations:
          1. Choose the number of threads from the size of the object being downloaded,
             to cut thread-creation overhead while still speeding up the download.
          2. Hand all file writes to one dedicated thread.
          3. When a thread finishes all of its work early, let it take over part of the
             work of the other threads.
          4. Find out whether P2P downloading can be supported.
Knowledge:
          HTTP basics are required.
          The HEAD verb can be used to get the size of a resource.
          The Range request header specifies the range of data requested. Its format is
          "Range: bytes=StartPos-EndPos\r\n"; if EndPos is omitted, everything from
          StartPos to the end is returned.
          If StartPos is out of range, the server answers
          "416 Requested Range Not Satisfiable". A server that honors the range answers
          206; a server that does not support ranges answers 200 with the whole body.
          When EndPos == StartPos, one byte is returned. When EndPos < StartPos the range
          is invalid and the whole resource is returned.
          Critical sections.
          Multithreaded design.
History:
          time: 2012/11/29  remarks: start   auth: monotone
          time: 2012/12/02  remarks: finish  auth: monotone
*/
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <windows.h>
#include <WinInet.h>
#pragma comment(lib, "wininet.lib")
#include <shlwapi.h>
using namespace std;

const char* STR_TEST_URL = "http://dl_dir.qq.com/qqfile/qq/QQ2013/QQ2013Beta1.exe";
const DWORD DWORD_MAX_CCH_OF_TEST_URL = 256;
const DWORD DWORD_MAX_CCH_OF_HOST_NAME = 128;
const DWORD DWORD_MAX_CCH_OF_URL_PATH = 256;
const LONG COUNT_OF_DOWNLOAD_THREADS = 10;

// Data shared by all threads; access it only inside the critical section.
typedef struct _MCDD_CRITICALSECTION_OBJECTS
{
    HANDLE lhFile;
    LONGLONG llFileValidDataSize;
}MCDD_CRITICALSECTION_OBJECTS,*PMCDD_CRITICALSECTION_OBJECTS;

// download thread parameter information
typedef struct _DOWNLOAD_THREAD_PARAMETER
{
    LONGLONG llStartPos;    // position of the next byte to download
    LONGLONG llBlockSize;   // bytes of this block still to download
    const char* lscUrlPath;
    HINTERNET lhConnent;
    MCDD_CRITICALSECTION_OBJECTS* lpdCriticalSectionData;
}DOWNLOAD_THREAD_PARAMETER,*PDOWNLOAD_THREAD_PARAMETER;

// Monitoring thread parameter information
typedef struct _MONITORING_THREAD_PARAMETER
{
    volatile bool* lpbIsAllDownloadThreadFinish;
    LONGLONG llFileSize;
    MCDD_CRITICALSECTION_OBJECTS* lpdCriticalSectionData;
}MONITORING_THREAD_PARAMETER,*PMONITORING_THREAD_PARAMETER;

// Get the size of specified url path
// return -1 means error.
LONGLONG GetFileSizeOfSession(IN HINTERNET nhInetConnect, IN const char* nscUrlPath);

DWORD WINAPI DownloadThreadProc(__in LPVOID lpParameter);
DWORD WINAPI MonitoringThreadProc(__in LPVOID lpParameter);

CRITICAL_SECTION gdCriticalSection;

int main()
{
    HINTERNET hInetOpen = NULL;
    HINTERNET hInetConnect = NULL;
    MCDD_CRITICALSECTION_OBJECTS ldCriticalSectionData;
    ZeroMemory(&ldCriticalSectionData, sizeof(MCDD_CRITICALSECTION_OBJECTS));
    HANDLE lhConfigFile = NULL;
    HANDLE lhMonitoringThread = NULL;
    HANDLE lhThreadArry[COUNT_OF_DOWNLOAD_THREADS];
    for(int i = 0; i < COUNT_OF_DOWNLOAD_THREADS; ++i)
    {
        lhThreadArry[i] = NULL;
    }
    // Declared outside the do-block so that it outlives the monitoring thread.
    volatile bool lbIsAllDownloadThreadFinish = false;

    // Initialize the critical section one time only.
    InitializeCriticalSection(&gdCriticalSection);

    do
    {
        // struct to contains the constituent parts of a URL
        URL_COMPONENTSA ldCrackedURL;
        ZeroMemory(&ldCrackedURL, sizeof(ldCrackedURL));
        ldCrackedURL.dwStructSize = sizeof(ldCrackedURL);    // must be set

        // buffer to store host name
        char szHostName[DWORD_MAX_CCH_OF_HOST_NAME] = {0};
        ldCrackedURL.lpszHostName = szHostName;
        ldCrackedURL.dwHostNameLength = DWORD_MAX_CCH_OF_HOST_NAME;    // count of characters

        // buffer to store url path
        char szUrlPath[DWORD_MAX_CCH_OF_URL_PATH] = {0};
        ldCrackedURL.lpszUrlPath = szUrlPath;
        ldCrackedURL.dwUrlPathLength = DWORD_MAX_CCH_OF_URL_PATH;    // count of characters

        // InternetCrackUrl splits the given URL into its parts. If a pointer member of
        // URL_COMPONENTS points to a caller buffer, the matching length member must hold
        // the buffer size. On success the length member receives the number of characters
        // copied, not counting the terminator.
        // If a pointer member is NULL and its length member is not 0, then after the call
        // the pointer refers to the first character of that part inside the URL and the
        // length member holds the real length of that part.
        // Do not put spaces in "file://" style URLs.
        if(FALSE == InternetCrackUrlA(STR_TEST_URL, (DWORD)strlen(STR_TEST_URL), 0, &ldCrackedURL))
        {
            // GetLastError();
            break;
        }

        // First, open file and init download config -----------------------------------
        // open internet
        hInetOpen = InternetOpenA("MltBrkptCtnDldDemo", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
        if(NULL == hInetOpen)
        {
            // GetLastError();
            break;
        }

        // connect server
        hInetConnect = InternetConnectA(hInetOpen, ldCrackedURL.lpszHostName, ldCrackedURL.nPort, NULL, NULL, INTERNET_SERVICE_HTTP, 0, 0);
        if(NULL == hInetConnect)
        {
            // GetLastError();
            break;
        }

        // Get file size; -1 means error, and an empty file has nothing to download.
        LONGLONG llFileSize = GetFileSizeOfSession(hInetConnect, ldCrackedURL.lpszUrlPath);
        if(llFileSize <= 0)
            break;

        // Get file name
        string loStrUrl = STR_TEST_URL;
        string loStrFileName;
        string::size_type liFileNamePos = loStrUrl.rfind("/");
        if(string::npos != liFileNamePos)
        {
            loStrFileName = loStrUrl.substr(liFileNamePos + 1);
        }
        if(loStrFileName.empty())
        {
            cout << "Get file name failed." << endl;
            break;
        }

        // if the file already exists, treat the download as finished.
        if(INVALID_FILE_ATTRIBUTES != GetFileAttributesA(loStrFileName.c_str()))
        {
            cout << "The file already exists." << endl;
            break;
        }

        // open temp file with share read access
        string loStrTempFileName = loStrFileName;
        loStrTempFileName += ".td";
        ldCriticalSectionData.lhFile = CreateFileA(loStrTempFileName.c_str(), GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        // With OPEN_ALWAYS the last error is ERROR_ALREADY_EXISTS when the file was there before.
        bool lbTempFileExisted = (ERROR_ALREADY_EXISTS == GetLastError());
        if(ldCriticalSectionData.lhFile == INVALID_HANDLE_VALUE)
        {
            ldCriticalSectionData.lhFile = NULL;
            break;
        }

        // newly created file: set the file size
        if(false == lbTempFileExisted)
        {
            LARGE_INTEGER ldFileSize;
            ldFileSize.QuadPart = llFileSize;
            if(FALSE == SetFilePointerEx(ldCriticalSectionData.lhFile, ldFileSize, NULL, FILE_BEGIN))
                break;
            if(FALSE == SetEndOfFile(ldCriticalSectionData.lhFile))
                break;
        }

        // open temp config file with share read access
        string loStrConfigFileName = loStrTempFileName;
        loStrConfigFileName += ".cfg";
        lhConfigFile = CreateFileA(loStrConfigFileName.c_str(), GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        bool lbConfigFileExisted = (ERROR_ALREADY_EXISTS == GetLastError());
        if(INVALID_HANDLE_VALUE == lhConfigFile)
        {
            lhConfigFile = NULL;
            break;
        }

        DOWNLOAD_THREAD_PARAMETER ldDownloadThreadParameterArry[COUNT_OF_DOWNLOAD_THREADS];
        ZeroMemory(ldDownloadThreadParameterArry, sizeof(DOWNLOAD_THREAD_PARAMETER) * COUNT_OF_DOWNLOAD_THREADS);

        // Set every block start position and size.
        // Config format: "validSize;start,size;start,size;..."
        bool lbHasSavedBlocks = false;
        if(lbConfigFileExisted)
        {
            // read from file
            string loStrConfigBuffer;
            char lscBuffer[1024] = {0};
            DWORD ldwBytesReaded = 0;
            while(FALSE != ReadFile(lhConfigFile, lscBuffer, sizeof(lscBuffer), &ldwBytesReaded, NULL) && 0 != ldwBytesReaded)
            {
                // append exactly the bytes read; the buffer is not null-terminated when full
                loStrConfigBuffer.append(lscBuffer, ldwBytesReaded);
            }

            // valid data size
            string::size_type liStartPos = 0;
            string::size_type liEndPos = loStrConfigBuffer.find(";", liStartPos);
            string loStrWord;
            if(string::npos != liEndPos)
            {
                loStrWord = loStrConfigBuffer.substr(liStartPos, liEndPos - liStartPos);
                ldCriticalSectionData.llFileValidDataSize = _atoi64(loStrWord.c_str());

                // block info
                liStartPos = liEndPos;
                liEndPos = loStrConfigBuffer.find(",", liStartPos);
                int liBlockIndex = 0;
                while(liEndPos != string::npos && liBlockIndex < COUNT_OF_DOWNLOAD_THREADS)
                {
                    // block start position
                    liStartPos += 1;
                    loStrWord = loStrConfigBuffer.substr(liStartPos, liEndPos - liStartPos);

                    // block size
                    liStartPos = liEndPos;
                    liEndPos = loStrConfigBuffer.find(";", liStartPos);
                    if(liEndPos != string::npos)
                    {
                        // get the start position
                        ldDownloadThreadParameterArry[liBlockIndex].llStartPos = _atoi64(loStrWord.c_str());

                        // get the size
                        liStartPos += 1;
                        loStrWord = loStrConfigBuffer.substr(liStartPos, liEndPos - liStartPos);
                        ldDownloadThreadParameterArry[liBlockIndex].llBlockSize = _atoi64(loStrWord.c_str());
                        lbHasSavedBlocks = true;
                    }
                    else
                        break;

                    ++liBlockIndex;

                    // next block info
                    liStartPos = liEndPos;
                    liEndPos = loStrConfigBuffer.find(",", liStartPos);
                }
            }
        }

        if(false == lbHasSavedBlocks)    // first download, or nothing usable was saved: init data
        {
            ldCriticalSectionData.llFileValidDataSize = 0;
            LONGLONG llBlockSize = llFileSize / COUNT_OF_DOWNLOAD_THREADS;
            LONGLONG llLastBlockSize = llFileSize - llBlockSize * (COUNT_OF_DOWNLOAD_THREADS - 1);
            for(int i = 0; i < COUNT_OF_DOWNLOAD_THREADS; ++i)
            {
                ldDownloadThreadParameterArry[i].llStartPos = llBlockSize * i;
                // note the last block size
                if(i != COUNT_OF_DOWNLOAD_THREADS - 1)
                    ldDownloadThreadParameterArry[i].llBlockSize = llBlockSize;
                else
                    ldDownloadThreadParameterArry[i].llBlockSize = llLastBlockSize;
            }
        }

        // set the common info and create thread
        int liThreadIndex;
        for(liThreadIndex = 0; liThreadIndex < COUNT_OF_DOWNLOAD_THREADS; ++liThreadIndex)
        {
            ldDownloadThreadParameterArry[liThreadIndex].lpdCriticalSectionData = &ldCriticalSectionData;
            ldDownloadThreadParameterArry[liThreadIndex].lhConnent = hInetConnect;
            ldDownloadThreadParameterArry[liThreadIndex].lscUrlPath = ldCrackedURL.lpszUrlPath;

            // create thread and suspend it
            lhThreadArry[liThreadIndex] = CreateThread(NULL, 0, DownloadThreadProc, &(ldDownloadThreadParameterArry[liThreadIndex]), CREATE_SUSPENDED, NULL);
            if(NULL == lhThreadArry[liThreadIndex])
            {
                cout << "create thread failed." << endl;
                break;
            }
        }
        if(liThreadIndex != COUNT_OF_DOWNLOAD_THREADS)
            break;

        // create monitoring thread to print schedule
        MONITORING_THREAD_PARAMETER ldMonitoringThreadParameter;
        ldMonitoringThreadParameter.lpdCriticalSectionData = &ldCriticalSectionData;
        ldMonitoringThreadParameter.llFileSize = llFileSize;
        ldMonitoringThreadParameter.lpbIsAllDownloadThreadFinish = &lbIsAllDownloadThreadFinish;
        lhMonitoringThread = CreateThread(NULL, 0, MonitoringThreadProc, &ldMonitoringThreadParameter, CREATE_SUSPENDED, NULL);
        if(NULL == lhMonitoringThread)    // CreateThread returns NULL on failure
        {
            cout << "create monitoring thread failed" << endl;
            break;
        }

        // resume all the thread
        // first resume the monitoring thread
        ResumeThread(lhMonitoringThread);

        // then resume the download thread
        int i;
        for(i = 0; i < COUNT_OF_DOWNLOAD_THREADS; ++i)
        {
            if((DWORD)-1 == ResumeThread(lhThreadArry[i]))
            {
                cout << "start thread failed." << endl;
                break;
            }
        }
        if(i != COUNT_OF_DOWNLOAD_THREADS)
            break;

        // stop the monitoring thread when all download thread finish.
        WaitForMultipleObjects(COUNT_OF_DOWNLOAD_THREADS, lhThreadArry, TRUE, INFINITE);
        lbIsAllDownloadThreadFinish = true;
        WaitForSingleObject(lhMonitoringThread, INFINITE);

        // download finish
        if(ldCriticalSectionData.llFileValidDataSize == llFileSize)
        {
            // delete the config file
            CloseHandle(lhConfigFile);
            lhConfigFile = NULL;
            DeleteFileA(loStrConfigFileName.c_str());

            // rename the des file
            CloseHandle(ldCriticalSectionData.lhFile);
            ldCriticalSectionData.lhFile = NULL;
            rename(loStrTempFileName.c_str(), loStrFileName.c_str());
            cout << " file : " << loStrFileName.c_str() << " finish." << endl;
        }
        else // write current download state to config file
        {
            // valid data size
            char lscTempBuffer[30] = {0};
            if(0 != _i64toa_s((__int64)(ldCriticalSectionData.llFileValidDataSize), lscTempBuffer, sizeof(lscTempBuffer), 10))
            {
                cout << "convert longlong to string failed." << endl;
                break;
            }
            string loStrBufferToWrite;
            loStrBufferToWrite += lscTempBuffer;
            loStrBufferToWrite += ";";

            // write block info; finished blocks (size 0) need no record
            int liBlockIndex = 0;
            char lscStartPos[30] = {0};
            char lscBlockSize[30] = {0};
            while(liBlockIndex < COUNT_OF_DOWNLOAD_THREADS)
            {
                if(ldDownloadThreadParameterArry[liBlockIndex].llBlockSize > 0)
                {
                    ZeroMemory(lscStartPos, sizeof(lscStartPos));
                    ZeroMemory(lscBlockSize, sizeof(lscBlockSize));
                    // block start pos and block size
                    if(0 == _i64toa_s((__int64)(ldDownloadThreadParameterArry[liBlockIndex].llStartPos), lscStartPos, sizeof(lscStartPos), 10)
                        && 0 == _i64toa_s((__int64)(ldDownloadThreadParameterArry[liBlockIndex].llBlockSize), lscBlockSize, sizeof(lscBlockSize), 10))
                    {
                        loStrBufferToWrite += lscStartPos;
                        loStrBufferToWrite += ",";
                        loStrBufferToWrite += lscBlockSize;
                        loStrBufferToWrite += ";";
                    }
                }
                ++liBlockIndex;
            }

            // rewind and truncate so that the new state replaces the old one
            if(INVALID_SET_FILE_POINTER == SetFilePointer(lhConfigFile, 0, NULL, FILE_BEGIN)
                || FALSE == SetEndOfFile(lhConfigFile))
            {
                cout << "reset config file failed." << endl;
                break;
            }

            // write buffer to file
            DWORD ldwCbWritten = 0;
            if(FALSE == WriteFile(lhConfigFile, (LPCVOID)loStrBufferToWrite.c_str(), (DWORD)loStrBufferToWrite.size(), &ldwCbWritten, NULL)
                || ldwCbWritten != loStrBufferToWrite.size())
            {
                cout << "write download information to config file failed." << endl;
                break;
            }
        }

    } while (false);

    // Make sure the monitoring thread has ended before the critical section is deleted.
    // Resuming is harmless if it is already running, and lets it exit if it never started.
    lbIsAllDownloadThreadFinish = true;
    if(NULL != lhMonitoringThread)
    {
        ResumeThread(lhMonitoringThread);
        WaitForSingleObject(lhMonitoringThread, INFINITE);
        CloseHandle(lhMonitoringThread);
    }

    // Release resources used by the critical section object.
    DeleteCriticalSection(&gdCriticalSection);

    // close all thread handle
    for(int i = 0; i < COUNT_OF_DOWNLOAD_THREADS; ++i)
    {
        if(NULL != lhThreadArry[i])
            CloseHandle(lhThreadArry[i]);
    }

    if(NULL != ldCriticalSectionData.lhFile)
    {
        CloseHandle(ldCriticalSectionData.lhFile);
    }
    if(NULL != lhConfigFile)
    {
        CloseHandle(lhConfigFile);
    }
    if(NULL != hInetConnect)
    {
        InternetCloseHandle(hInetConnect);
    }
    if(NULL != hInetOpen)
    {
        InternetCloseHandle(hInetOpen);
    }

    getchar();
    return 0;
}

DWORD WINAPI DownloadThreadProc( __in LPVOID lpParameter )
{
    PDOWNLOAD_THREAD_PARAMETER lpoParamerInfo = (PDOWNLOAD_THREAD_PARAMETER)lpParameter;
    HINTERNET lhInetRequest = NULL;

    do
    {
        if(NULL == lpoParamerInfo)
            break;

        // nothing left to download for this block
        if(lpoParamerInfo->llBlockSize <= 0)
            break;

        // convert the range start position to character
        char lscRangeStartPosition[30] = {0};
        if(0 != _i64toa_s((__int64)(lpoParamerInfo->llStartPos), lscRangeStartPosition, sizeof(lscRangeStartPosition), 10))
        {
            break;
        }

        // convert the range end position to character
        char lscRangeEndPosition[30] = {0};
        if(0 != _i64toa_s((__int64)(lpoParamerInfo->llStartPos + lpoParamerInfo->llBlockSize - 1), lscRangeEndPosition, sizeof(lscRangeEndPosition), 10))
        {
            break;
        }

        // additional header: set the file data range.
        string loAdditionalHeader = "Range: bytes=";
        loAdditionalHeader += lscRangeStartPosition;    // start position of remaining
        loAdditionalHeader += "-";
        loAdditionalHeader += lscRangeEndPosition;
        loAdditionalHeader += "\r\n";

        // open request with "GET" verb to get the remaining file data
        const char* lplpszAcceptTypes[] = {"*/*", NULL};
        lhInetRequest = HttpOpenRequestA(lpoParamerInfo->lhConnent, "GET", lpoParamerInfo->lscUrlPath, "HTTP/1.1", NULL, lplpszAcceptTypes, 0, 0);
        if(NULL == lhInetRequest)
        {
            // GetLastError();
            break;
        }

        // send request with additional header
        if(FALSE == HttpSendRequestA(lhInetRequest, loAdditionalHeader.c_str(), (DWORD)loAdditionalHeader.size(), NULL, 0))
        {
            // GetLastError();
            break;
        }

        // loop to read the data from HTTP and store to file
        BYTE lpbBufferToReceiveData[2048];    // buffer for the data read
        DWORD ldwCbBuffer = sizeof(lpbBufferToReceiveData);
        DWORD ldwCrtCbReaded = 0;    // bytes actually read this round
        DWORD ldwCbWritten = 0;      // bytes actually written to the file this round
        bool lbIsOk = false;
        LONGLONG llDumpStartPos = lpoParamerInfo->llStartPos;
        LARGE_INTEGER ldTempLargeInteger;
        do
        {
            // read data
            if (FALSE == InternetReadFile(lhInetRequest, lpbBufferToReceiveData, ldwCbBuffer, &ldwCrtCbReaded))
            {
                cout << "read data failed." << endl;
                break;
            }

            if(ldwCrtCbReaded == 0)    // all data have been read.
            {
                cout << "start position: " << llDumpStartPos << " block size: " << lpoParamerInfo->llBlockSize << " finish. " << endl;
                break;
            }

            // The file pointer is shared, so seek and write as one step inside the
            // critical section, and always leave the section before breaking out.
            lbIsOk = false;
            EnterCriticalSection(&gdCriticalSection);
            ldTempLargeInteger.QuadPart = lpoParamerInfo->llStartPos;
            if(FALSE != SetFilePointerEx(lpoParamerInfo->lpdCriticalSectionData->lhFile, ldTempLargeInteger, NULL, FILE_BEGIN)
                && FALSE != WriteFile(lpoParamerInfo->lpdCriticalSectionData->lhFile, lpbBufferToReceiveData, ldwCrtCbReaded, &ldwCbWritten, NULL)
                && ldwCbWritten == ldwCrtCbReaded)
            {
                lpoParamerInfo->llStartPos += ldwCrtCbReaded;
                lpoParamerInfo->llBlockSize -= ldwCrtCbReaded;
                lpoParamerInfo->lpdCriticalSectionData->llFileValidDataSize += ldwCrtCbReaded;
                lbIsOk = true;
            }
            LeaveCriticalSection(&gdCriticalSection);

            if(false == lbIsOk)
            {
                cout << "An exception happens when write data to file" << endl;
                break;
            }

            // clear data in buffer
            ZeroMemory(lpbBufferToReceiveData, ldwCrtCbReaded);

        } while (true);

    } while (false);

    if(NULL != lhInetRequest)
        InternetCloseHandle(lhInetRequest);

    return 0;
}

DWORD WINAPI MonitoringThreadProc( __in LPVOID lpParameter )
{
    PMONITORING_THREAD_PARAMETER lpoParameterInfo = (PMONITORING_THREAD_PARAMETER)lpParameter;
    if(NULL != lpoParameterInfo)
    {
        while(false == *lpoParameterInfo->lpbIsAllDownloadThreadFinish)
        {
            EnterCriticalSection(&gdCriticalSection);
            // show current schedule
            cout << "current schedule: " << lpoParameterInfo->lpdCriticalSectionData->llFileValidDataSize * 100 / lpoParameterInfo->llFileSize << "% ..." << endl;
            LeaveCriticalSection(&gdCriticalSection);
            Sleep(500);
        }
    }
    return 0;
}

// Get the size of specified url path
// return -1 means error.
LONGLONG GetFileSizeOfSession(IN HINTERNET nhInetConnect, IN const char* nscUrlPath)
{
    LONGLONG llReturn = -1;
    HINTERNET lhRequest = NULL;
    do
    {
        if(NULL == nhInetConnect)
            break;
        if(NULL == nscUrlPath)
            break;

        // open request with "HEAD" verb: only the response headers are needed
        lhRequest = HttpOpenRequestA(nhInetConnect, "HEAD", nscUrlPath, "HTTP/1.1", NULL, NULL, 0, 0);
        if(NULL == lhRequest)
        {
            // GetLastError();
            break;
        }

        // send request without additional headers
        if(FALSE == HttpSendRequestA(lhRequest, NULL, 0, NULL, 0))
        {
            // GetLastError();
            break;
        }

        // query the length of file which will be download
        char lscFileSizeBuffer[30] = {0};
        DWORD ldwCbBuffer = sizeof(lscFileSizeBuffer);
        if(FALSE == HttpQueryInfoA(lhRequest, HTTP_QUERY_CONTENT_LENGTH, lscFileSizeBuffer, &ldwCbBuffer, 0))
        {
            // GetLastError();
            break;
        }
        llReturn = _atoi64(lscFileSizeBuffer);

    } while (false);

    if(NULL != lhRequest)
    {
        InternetCloseHandle(lhRequest);
    }
    return llReturn;
}
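The listing above stores the download state as one line of text: the number of valid bytes, followed by one `start,size` pair per unfinished block, each field terminated by `;`. The round trip through that format is easier to follow in isolation, so here is a small portable sketch of it. `DownloadState`, `BlockState`, `SerializeState`, and `ParseState` are hypothetical names used only for this illustration, and the sketch assumes that finished blocks (size 0) are simply left out, as the demo does.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Remaining work for one block (hypothetical type for this sketch).
struct BlockState {
    long long start;  // next byte to download
    long long size;   // bytes still missing
};

// Everything the config file has to remember between runs.
struct DownloadState {
    long long validBytes;            // bytes already written to the temp file
    std::vector<BlockState> blocks;  // unfinished blocks
};

// Produce "validBytes;start,size;start,size;...".
std::string SerializeState(const DownloadState& state) {
    std::string out = std::to_string(state.validBytes) + ";";
    for (const BlockState& block : state.blocks) {
        if (block.size <= 0) continue;  // finished blocks need no record
        out += std::to_string(block.start) + "," + std::to_string(block.size) + ";";
    }
    return out;
}

// Parse the text back; returns false (and leaves out untouched) on malformed input.
bool ParseState(const std::string& text, DownloadState& out) {
    std::istringstream in(text);
    std::string field;
    if (!std::getline(in, field, ';') || field.empty()) return false;
    DownloadState result{};
    try {
        result.validBytes = std::stoll(field);
        while (std::getline(in, field, ';')) {
            if (field.empty()) continue;
            const std::string::size_type comma = field.find(',');
            if (comma == std::string::npos) return false;
            result.blocks.push_back(BlockState{std::stoll(field.substr(0, comma)),
                                               std::stoll(field.substr(comma + 1))});
        }
    } catch (const std::exception&) {
        return false;  // a field was not a number
    }
    out = result;
    return true;
}
```

A state with 300 valid bytes and two unfinished blocks serializes to `300;100,50;400,25;`. Rejecting malformed input outright, rather than using whatever was parsed so far, is a design choice of this sketch: a half-read state file would otherwise silently skip parts of the download.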
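Shortcomings:
1. Every download thread writes to the file itself, so the threads end up waiting on each other. It would probably be better to have one dedicated thread that takes the queued data and writes it.
2. For many reasons some threads finish receiving their data very quickly. If those threads could then be given part of a block that another thread is still working on, all threads would be fully used.
3. The program still does not check whether the server supports resumable (range) downloads, and it does not handle network problems.
4. The total number of threads is hard-coded rather than computed by any sensible method.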
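For shortcoming 2, one possible approach is to let an idle thread take over the second half of whichever block has the most bytes left. The sketch below shows only that bookkeeping, in portable C++. `Block`, `StealHalfOfLargest`, and the `minSplitSize` threshold are assumptions made for this illustration and are not part of the demo. In a real downloader this would have to run under the same lock as the block updates, and the thread that loses half of its block would have to stop at the new end of its range, since its HTTP request still covers the original range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remaining work of one download thread (hypothetical type for this sketch).
struct Block {
    long long start;  // next byte to download
    long long size;   // bytes still missing
};

// Give the idle worker the second half of the block with the most bytes left.
// Returns false when no block has at least minSplitSize bytes remaining.
bool StealHalfOfLargest(std::vector<Block>& blocks, std::size_t idleIndex,
                        long long minSplitSize) {
    assert(idleIndex < blocks.size() && blocks[idleIndex].size == 0);

    // Find the block with the most remaining bytes.
    std::size_t victim = idleIndex;
    long long largest = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        if (blocks[i].size > largest) {
            largest = blocks[i].size;
            victim = i;
        }
    }
    if (largest < minSplitSize) return false;  // not worth another request

    // The victim keeps the first half; the idle worker takes the rest.
    const long long keep = largest / 2;
    blocks[idleIndex] = Block{blocks[victim].start + keep, largest - keep};
    blocks[victim].size = keep;
    return true;
}
```

With blocks `{0, 0}`, `{100, 81}`, and `{200, 10}` and a threshold of 16, the idle first worker receives `{140, 41}` and the second block shrinks to `{100, 40}`. The threshold exists because a very small remainder costs more in request overhead than it saves.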
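Summary:
Network downloading involves a lot of factors. Is a multithreaded download really needed? It uses a lot of bandwidth, and if some program's update eats most of the connection, will users accept that? You also have to consider whether the hardware really benefits from multiple threads (single core or multi-core?), how many threads to create, and how large the download is. All of these have to be weighed together, so building a professional download manager is actually quite complex.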