Source: 杨他她
In the spirit of sharing, I'll skip the small talk and get straight to the point:
1. First, get a Taobao affiliate (淘宝客) link:
$clickurl="http://s.click.taobao.com/t?e=m%3D2%26s%3D1xJRigdN6vgcQipKwQzePOeEDrYVVa64REOHN%2B0iJT23bLqV5UHdqdSm9rmNrfhQMlIj6E1wLr6Z1upWVE%2FY63jUGTUkifoD6Iu7YpSAkMmDDvvObppylOm%2B2Cp2Y4AZdQRGST%2FOE66WnyaKIIfB45Ka7uvYZB3KIXgUnhszXk7H%2FWo6QkJXpnEKIlIBiOAf%2BiEiH3X0n4yiZ%2BQMlGz6FQ%3D%3D";
2. Use PHP to get the URL after the first redirect:
$headers = get_headers($clickurl, TRUE);
$tu = $headers['Location'];
This yields the following URL:
$tu="http://s.click.taobao.com/t_js?tu=http%3A%2F%2Fs.click.taobao.com%2Ft%3Fe%3Dm%253D2%2526s%253D1xJRigdN6vgcQipKwQzePOeEDrYVVa64REOHN%252B0iJT23bLqV5UHdqdSm9rmNrfhQMlIj6E1wLr6Z1upWVE%252FY63jUGTUkifoD6Iu7YpSAkMmDDvvObppylOm%252B2Cp2Y4AZdQRGST%252FOE66WnyaKIIfB45Ka7uvYZB3KIXgUnhszXk7H%252FWo6QkJXpnEKIlIBiOAf%252BiEiH3X0n4yiZ%252BQMlGz6FQ%253D%253D%26ref%3D%26et%3DU1NBEMyybRSMZqT%252FAdx5AObU6XqsSK9q";
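This URL carries a parameter named tu that is needed later, so we call the URL tu and move on to step 3.
3. Take the tu parameter from the tu URL, that is, everything after the equals sign: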
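A caveat on step 2, offered as a hedged sketch: get_headers() follows redirects by default, so when the chain has more than one hop the associative result can hold an array under Location instead of a string, and the casing of the header name depends on the server. The helper below, first_location() (a hypothetical name, not from the original post), normalizes both cases and returns the first hop.

```php
<?php
// Hypothetical helper (not part of the original post): pick the first
// Location value out of the array returned by get_headers($url, true).
function first_location(array $headers)
{
    foreach ($headers as $name => $value) {
        // Status lines are stored under integer keys; skip them.
        if (!is_string($name) || strcasecmp($name, 'Location') !== 0) {
            continue;
        }
        // A repeated header comes back as a list of values.
        return is_array($value) ? reset($value) : $value;
    }
    return null; // no redirect header present
}
```

It would be used as `$tu = first_location(get_headers($clickurl, true));`, with a null check before continuing.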
http%3A%2F%2Fs.click.taobao.com%2Ft%3Fe%3Dm%253D2%2526s%253D1xJRigdN6vgcQipKwQzePOeEDrYVVa64REOHN%252B0iJT23bLqV5UHdqdSm9rmNrfhQMlIj6E1wLr6Z1upWVE%252FY63jUGTUkifoD6Iu7YpSAkMmDDvvObppylOm%252B2Cp2Y4AZdQRGST%252FOE66WnyaKIIfB45Ka7uvYZB3KIXgUnhszXk7H%252FWo6QkJXpnEKIlIBiOAf%252BiEiH3X0n4yiZ%252BQMlGz6FQ%253D%253D%26ref%3D%26et%3DU1NBEMyybRSMZqT%252FAdx5AObU6XqsSK9q
Anyone familiar with encodings will recognize this as a URL that has been escape-encoded. We handle it with a custom PHP decoding function; here is an unescape function found online:
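function unescape($str) {
    $ret = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        // %uXXXX: a 16-bit code point, re-encoded here as UTF-8
        if ($str[$i] == '%' && $i + 5 < $len && $str[$i + 1] == 'u') {
            $val = hexdec(substr($str, $i + 2, 4));
            if ($val < 0x80) {
                // one byte (ASCII, including 0x7f)
                $ret .= chr($val);
            } elseif ($val < 0x800) {
                // two bytes
                $ret .= chr(0xc0 | ($val >> 6)) .
                        chr(0x80 | ($val & 0x3f));
            } else {
                // three bytes
                $ret .= chr(0xe0 | ($val >> 12)) .
                        chr(0x80 | (($val >> 6) & 0x3f)) .
                        chr(0x80 | ($val & 0x3f));
            }
            $i += 5;
        } elseif ($str[$i] == '%' && $i + 2 < $len) {
            // %XX: an ordinary percent-encoded byte
            $ret .= urldecode(substr($str, $i, 3));
            $i += 2;
        } else {
            // anything else, including a truncated trailing "%", is copied as-is
            $ret .= $str[$i];
        }
    }
    return $ret;
}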
The decoded URL is:
$ref="http://s.click.taobao.com/t?e=m%3D2%26s%3D1xJRigdN6vgcQipKwQzePOeEDrYVVa64REOHN%2B0iJT23bLqV5UHdqdSm9rmNrfhQMlIj6E1wLr6Z1upWVE%2FY63jUGTUkifoD6Iu7YpSAkMmDDvvObppylOm%2B2Cp2Y4AZdQRGST%2FOE66WnyaKIIfB45Ka7uvYZB3KIXgUnhszXk7H%2FWo6QkJXpnEKIlIBiOAf%2BiEiH3X0n4yiZ%2BQMlGz6FQ%3D%3D&ref=&et=Tu9eFLz3gxx7bGejK8KgtemqA%2BR0RX35";
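This URL is almost identical to the affiliate link we started with, except for two extra parameters at the end, ref and et. We call this URL ref.
The affiliate link's redirect is really a wrapped JS program: the JS issues a request carrying specific header parameters, and that request triggers the redirect. The most important of those headers is referer. Below I simulate the request to this URL with PHP: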
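Steps 2 and 3 can also be sketched without the hand-written decoder, assuming the tu value contains only %XX sequences (as in the sample above) and no %uXXXX ones; under that assumption PHP's built-in rawurldecode() should give the same result. The function name extract_ref() and the shortened sample URL are illustrative assumptions, not part of the original post.

```php
<?php
// Illustrative sketch: pull the raw tu value out of the t_js URL and
// decode one level of percent-encoding.
function extract_ref($tu)
{
    $pos = strpos($tu, 'tu=');
    if ($pos === false) {
        return null; // not a t_js redirect URL
    }
    // Everything after "tu=" is the encoded target URL.
    $encoded = substr($tu, $pos + 3);
    // rawurldecode() leaves "+" untouched, unlike urldecode().
    return rawurldecode($encoded);
}

// Shortened, made-up sample in the same shape as the real URL.
$tu  = 'http://s.click.taobao.com/t_js?tu=http%3A%2F%2Fs.click.taobao.com%2Ft%3Fe%3Dm%253D2%26ref%3D%26et%3Dabc';
$ref = extract_ref($tu);
echo $ref, "\n";
```

Only one level of encoding is removed, so the inner e= value stays percent-encoded, which matches the shape of the decoded URL shown above.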
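$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $ref);
curl_setopt($ch, CURLOPT_REFERER, $tu);
curl_setopt($ch, CURLOPT_HEADER, true); // include the response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
$text = curl_exec($ch);
curl_close($ch);
echo $text; // print the header information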
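The output of this request is the 302 response header of the redirect, which already contains the real Taobao item URL we are after.
Here is the complete function for extracting the Taobao URL: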
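Since the request above yields the whole header block, the item URL still has to be cut out of it. A minimal sketch, assuming the response contains a standard Location: line; parse_location() and the sample header text are hypothetical and only illustrate the idea.

```php
<?php
// Hypothetical helper: return the value of the first Location header
// found in a raw HTTP header block, or null if there is none.
function parse_location($rawHeaders)
{
    // Header names are case-insensitive; /m lets ^ and $ match per line.
    if (preg_match('/^Location:[ \t]*(.+?)[ \t]*\r?$/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

// Made-up header text in the shape of a 302 response.
$sample = "HTTP/1.1 302 Found\r\nServer: nginx\r\nLocation: http://item.taobao.com/item.htm?id=123\r\n\r\n";
echo parse_location($sample), "\n";
```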
function geturl($clickurl){
    // Step 2: first redirect, gives the t_js URL
    $headers = get_headers($clickurl, TRUE);
    $tu = $headers['Location'];
    // Step 3: decode it and keep only the value of the tu parameter
    $eturl = unescape($tu);
    $u = parse_url($eturl);
    $param = $u['query'];
    $ref = str_replace('tu=', '', $param);
    // Request $ref with the t_js URL as referer and follow the redirect
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $ref);
    curl_setopt($ch, CURLOPT_REFERER, $tu);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_NOBODY, 1); // headers only, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
    $out = curl_exec($ch);
    $dd = curl_getinfo($ch);
    curl_close($ch);
    // The last effective URL is the real item URL
    $item_url = $dd['url'];
    return $item_url;
}
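One more note: if CURLOPT_FOLLOWLOCATION has no effect in curl, the cause is very likely the PHP runtime mode (older PHP versions, for instance, refuse this option when safe_mode or open_basedir is in effect), and another approach can be used instead. Example:
function geturl($clickurl){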
    $headers = get_headers($clickurl, TRUE);
    $tu = $headers['Location'];
    $eturl = unescape($tu);
    $u = parse_url($eturl);
    $param = $u['query'];
    $ref = str_replace('tu=', '', $param);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $ref);
    curl_setopt($ch, CURLOPT_REFERER, $tu);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
    $out = curl_exec($ch);
    $dd = curl_getinfo($ch);
    curl_close($ch);
    if (!empty($dd['redirect_url'])) {
        // Redirect was not followed: curl reports the next hop here
        $item_url = $dd['redirect_url'];
    } elseif (!empty($dd['url'])) {
        $item_url = $dd['url'];
    } else {
        // Last resort: request again and read Location from the raw headers.
        // The original post passed an undefined $et here; $ref is assumed.
        $chl = curl_init();
        curl_setopt($chl, CURLOPT_URL, $ref);
        curl_setopt($chl, CURLOPT_REFERER, $tu);
        curl_setopt($chl, CURLOPT_HEADER, true);
        curl_setopt($chl, CURLOPT_NOBODY, 1);
        curl_setopt($chl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($chl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($chl, CURLOPT_MAXREDIRS, 2);
        $out = curl_exec($chl);
        curl_close($chl);
        // get_word() is not defined in the post; it presumably returns
        // the text between the two given markers.
        $item_url = get_word($out, 'Location: ', '&ali_trackid');
    }
    return $item_url;
}
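The fallback branch calls get_word(), which the post does not define. Judging only from how it is called, it presumably returns the text between two markers; the version below is a guess at that behavior, not the author's implementation.

```php
<?php
// Guessed behavior of the undefined get_word() helper: return the part of
// $text between the first $start marker and the next $end marker.
function get_word($text, $start, $end)
{
    $from = strpos($text, $start);
    if ($from === false) {
        return ''; // start marker not found
    }
    $from += strlen($start);
    $to = strpos($text, $end, $from);
    // If the end marker is missing, fall back to the rest of the line.
    if ($to === false) {
        $to = $from + strcspn($text, "\r\n", $from);
    }
    return substr($text, $from, $to - $from);
}
```

With 'Location: ' and '&ali_trackid' as markers, this would return the item URL without its trailing tracking parameters.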
http://bbs.yangtata.com/forum.php?mod=viewthread&tid=4182