libcurl programming

Compiling

On Windows, go to the folder where you unpacked the source, such as d:/libcurl/curl, and find the winbuild directory. Open the Visual Studio command prompt there and use “nmake /f Makefile.vc” to compile the code. Here is a sample that builds libcurl as an x86 debug static library.

nmake /f Makefile.vc VC=10 mode=static MACHINE=x86 DEBUG=yes

Linking

If you want to link the static libcurl library into your project, define the macro “CURL_STATICLIB” before including the related headers, such as “<curl/curl.h>” or “<curl/easy.h>”.
#define CURL_STATICLIB
#include <curl/curl.h>
 

Macro

LIBCURL_VERSION_NUM: the libcurl version as a hexadecimal number of the form 0xXXYYZZ; the value “0x070c03” means version 7.12.3
CURL_STATICLIB: define it when linking against the static library
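
To make the 0xXXYYZZ layout concrete, here is a minimal sketch that splits such a number into its three parts. The names CurlVersion and decode_version are my own, not part of libcurl, and the code needs no libcurl headers.

```cpp
#include <cassert>

// Hypothetical helper type for illustration; libcurl does not define it.
struct CurlVersion {
    unsigned major_no;
    unsigned minor_no;
    unsigned patch_no;
};

// Split a 0xXXYYZZ value into its three one-byte fields.
constexpr CurlVersion decode_version(unsigned long num)
{
    return CurlVersion{
        static_cast<unsigned>((num >> 16) & 0xFF),  // XX: major
        static_cast<unsigned>((num >> 8) & 0xFF),   // YY: minor
        static_cast<unsigned>(num & 0xFF)};         // ZZ: patch
}
```

In real code you usually do not decode the number at all; you compare it directly, for example `#if LIBCURL_VERSION_NUM >= 0x070c03`.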

With C++

There's basically only one thing to keep in mind when using C++ instead of C to interface with libcurl:
The callbacks CANNOT be non-static class member functions.
 

API

libcurl has three categories of API: easy, multi and share.
#include <curl/easy.h>
#include <curl/multi.h>

Easy interface

curl_easy_init

This function must be the first one you call. It returns a CURL easy handle that you must pass to the other easy functions. If you have not yet called curl_global_init, curl_easy_init calls it automatically. This can be lethal in multi-threaded programs, since curl_global_init is not guaranteed to be thread-safe.

You are strongly advised to not allow this automatic behaviour, by calling curl_global_init yourself properly.

If it returns NULL, something went wrong and you cannot use the other functions.
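
One way to follow the advice about calling the global initializer yourself is to funnel it through a function that runs it only once. The sketch below is an assumption of mine, not something from the libcurl docs: ensure_global_init and fake_global_init are hypothetical names, and the initializer is passed as a plain function pointer so the example compiles without libcurl. In a real program the function you pass would wrap curl_global_init(CURL_GLOBAL_ALL).

```cpp
#include <cassert>

// Runs init() on the first call and returns the cached result afterwards.
// C++11 guarantees that a function-local static is initialized exactly once,
// even when several threads reach this line at the same time.
inline int ensure_global_init(int (*init)())
{
    static const int result = init();
    return result;
}

// Hypothetical stand-in for the real global initializer; counts its calls.
static int g_init_calls = 0;

static int fake_global_init()
{
    ++g_init_calls;
    return 0;  // 0 mirrors CURLE_OK
}
```

Every thread can then call ensure_global_init before curl_easy_init without triggering the automatic initialization.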

curl_easy_cleanup

It is the opposite of the curl_easy_init function and should be the last function called for an easy session. Any use of the handle after this function has been called and has returned is illegal. This function returns nothing (void).

curl_easy_escape
URL encoding. Every character in the input string that is not a-z, A-Z, 0-9, '-', '.', '_' or '~' is converted to its "URL escaped" version (%NN, where NN is a two-digit hexadecimal number). Here is sample code for it.
CURL * curl = nullptr;
char * esc = nullptr;
CURLcode ret;
ret = curl_global_init(CURL_GLOBAL_ALL);
curl = curl_easy_init();
esc = curl_easy_escape(curl, "http://localhost", 0); // 0: length is taken from strlen()
if (esc != nullptr) // NULL means the conversion failed
{
cout<<esc<<endl;
curl_free(esc);
}
curl_easy_cleanup(curl);
curl_global_cleanup();
Output is: http%3A%2F%2Flocalhost
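
To show the escaping rule itself, here is a small stand-alone sketch that applies it by hand. It is not libcurl's implementation and percent_encode is a hypothetical name; it only follows the character set listed above and uses uppercase hex digits.

```cpp
#include <cassert>
#include <string>

// Percent-encode every byte that is not in the unreserved set
// a-z, A-Z, 0-9, '-', '.', '_', '~'. Hypothetical helper, not libcurl code.
std::string percent_encode(const std::string& in)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size() * 3);  // worst case: every byte becomes %NN
    for (unsigned char c : in) {
        const bool unreserved =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];     // high nibble
            out += hex[c & 0x0F];   // low nibble
        }
    }
    return out;
}
```

In a program that already links libcurl, curl_easy_escape is the better choice; the sketch is only meant to make the rule visible.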

curl_easy_getinfo

curl_easy_setopt
CURLOPT_URL
Pass in a pointer to the actual URL to deal with. The parameter should be a char * to a zero terminated string which must be URL-encoded in the following format:
scheme://host:port/path
 

Practice

Using C++ non-static functions for callbacks?

libcurl is a C library, so it doesn't know anything about C++ member functions.
You can overcome this "limitation" with relative ease by using a static member function that is passed a pointer to the class instance:

// f is the pointer to your object.
// Inside the class, declare it as: static size_t func(void *buffer, size_t sz, size_t n, void *f);
size_t YourClass::func(void *buffer, size_t sz, size_t n, void *f)
{
// Call non-static member function.
static_cast<YourClass*>(f)->nonStaticFunction();
// Tell libcurl that all sz * n bytes were handled.
return sz * n;
}

// This is how you pass pointer to the static function:
curl_easy_setopt(hcurl, CURLOPT_WRITEFUNCTION, YourClass::func);
curl_easy_setopt(hcurl, CURLOPT_WRITEDATA, this);
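
For a version you can compile and step through without a network, here is a self-contained sketch of the same pattern. Collector, WriteFn and fake_deliver are names I made up: fake_deliver plays the role of the C library, which only knows a plain function pointer and an opaque void*. The callback uses the same parameter layout as a libcurl write callback (data, size, nmemb, userdata).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical receiver class; the names are for illustration only.
class Collector {
public:
    // Static trampoline with a C-compatible signature. It recovers the
    // object from userdata and forwards to the non-static member.
    static size_t write_cb(char* data, size_t size, size_t nmemb, void* userdata)
    {
        return static_cast<Collector*>(userdata)->append(data, size * nmemb);
    }

    const std::string& body() const { return body_; }

private:
    // Ordinary non-static member that does the real work.
    size_t append(const char* data, size_t len)
    {
        body_.append(data, len);
        return len;  // report every byte as consumed
    }

    std::string body_;
};

// Stand-in for the library side: a plain function pointer plus a void*.
using WriteFn = size_t (*)(char*, size_t, size_t, void*);

size_t fake_deliver(WriteFn fn, void* userdata, const char* text, size_t len)
{
    std::string chunk(text, len);  // private copy, like a receive buffer
    return fn(&chunk[0], 1, chunk.size(), userdata);
}
```

With libcurl, the two arguments given to fake_deliver would instead go to CURLOPT_WRITEFUNCTION and CURLOPT_WRITEDATA.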

Get the remote file’s size

Option setting:
curl_easy_setopt(handle, CURLOPT_URL, url);
curl_easy_setopt(handle, CURLOPT_PROXY, szProxy); // for proxy
curl_easy_setopt(handle, CURLOPT_NOBODY, 1L); // request the headers only, skip the body
curl_easy_setopt(handle, CURLOPT_ERRORBUFFER, szErr);
ret = curl_easy_perform(handle);
Get information:
double dval; // set to -1 when the size is unknown
if (CURLE_OK != curl_easy_getinfo(handle, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &dval))
{
cerr<<"Failed to get CURLINFO_CONTENT_LENGTH_DOWNLOAD"<<endl;
curl_easy_cleanup(handle);
return;
}
cout<<"CURLINFO_CONTENT_LENGTH_DOWNLOAD :"<<dval<<endl;

Get the remote file’s modification time

Option setting:
curl_easy_setopt(handle, CURLOPT_FILETIME, 1L); // required before curl_easy_getinfo() can return CURLINFO_FILETIME
Get information:
long lval; // -1 means the server did not report a time
if (CURLE_OK != curl_easy_getinfo(handle, CURLINFO_FILETIME, &lval) || lval == -1)
{
cerr<<"Failed to get CURLINFO_FILETIME"<<endl;
curl_easy_cleanup(handle);
return;
}
time_t t(lval);
tm * ptm = localtime(&t);
cout<<"CURLINFO_FILETIME :"<<asctime(ptm)<<endl;
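
If you prefer a fixed, locale-independent text form over asctime, the conversion can be isolated in a small helper. This is a sketch under my own assumptions: format_filetime is a hypothetical name, it formats in UTC, and it treats any negative value as "unknown". It does not need libcurl.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Format a CURLINFO_FILETIME-style value (seconds since the Unix epoch)
// as "YYYY-MM-DD HH:MM:SS" in UTC. Negative input, including the -1 that
// means "not reported", yields an empty string in this sketch.
std::string format_filetime(long filetime)
{
    if (filetime < 0) return std::string();
    const std::time_t t = static_cast<std::time_t>(filetime);
    const std::tm* utc = std::gmtime(&t);  // note: shared static buffer
    if (utc == nullptr) return std::string();
    char buf[32];
    const std::size_t n =
        std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", utc);
    return std::string(buf, n);
}
```

Because std::gmtime returns a pointer to shared storage, a multi-threaded program would need a reentrant variant instead.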

Show progress for downloading

Option setting: to use the default progress display built into cURL, set CURLOPT_NOPROGRESS to 0L; 1L turns the display off.
curl_easy_setopt(handle, CURLOPT_NOPROGRESS, 0L);
change to use
 
Check/Confirm the remote file exists
Two values show whether the remote file exists at the URL: CURLINFO_RESPONSE_CODE and CURLINFO_CONTENT_LENGTH_DOWNLOAD.
If the file doesn’t exist at the URL, CURLINFO_RESPONSE_CODE is 404 and CURLINFO_CONTENT_LENGTH_DOWNLOAD is -1. Otherwise, CURLINFO_RESPONSE_CODE is 200.
 
Resume uploading a file
Use the option CURLOPT_RESUME_FROM with curl_easy_setopt, or its large-file variant:
CURLOPT_RESUME_FROM_LARGE
Pass a curl_off_t as parameter. It contains the offset in number of bytes that you want the transfer to start from. (Added in 7.11.0)
 

AGet is similar to FlashGet for win32. It can download a large file with multiple threads. Here is the theory behind aget:
Aget first sends a HEAD request to retrieve the length of the file, and divides it into equal segments according to the number the user has requested. Then, for each segment, it connects to the server and gets only the part it is to download. Therefore, if you're downloading a small file (less than 512K), it is suggested that you use a small number of segments, i.e. 4-5; likewise, if it's a large file, you're urged to increase the number of threads.
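
The segmentation step can be sketched in a few lines. This is my own illustration, not aget's source: split_ranges is a hypothetical name, and the choice to let the last segment absorb the remainder is an assumption. Each result is an inclusive "first-last" byte range, which is the string form that CURLOPT_RANGE accepts, so every download thread could set one of them on its own easy handle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split [0, length) into `parts` contiguous byte ranges of near-equal size
// and format each as "first-last" (inclusive). The last range absorbs the
// remainder when length is not divisible by parts.
std::vector<std::string> split_ranges(long long length, int parts)
{
    std::vector<std::string> ranges;
    if (length <= 0 || parts <= 0) return ranges;
    if (parts > length) parts = static_cast<int>(length);  // >= 1 byte each
    const long long chunk = length / parts;
    for (int i = 0; i < parts; ++i) {
        const long long first = chunk * i;
        const long long last =
            (i == parts - 1) ? length - 1 : first + chunk - 1;
        ranges.push_back(std::to_string(first) + "-" + std::to_string(last));
    }
    return ranges;
}
```

The length passed in would come from the HEAD request described above, for example the value read through CURLINFO_CONTENT_LENGTH_DOWNLOAD.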
 

Todo
Using libcurl as a web spider (crawler)
 


 

