Grabbing Sina Videos with Java
 
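Sina hosts a lot of videos, but they can only be watched in the page and cannot be downloaded. After some thought, I decided to use Java to dig out the real addresses of the sina video files and then download them from the command line, heh!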
 
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import lavasoft.common.toolkit.HttpTookit;

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
* Extracts the real addresses of sina videos, a small demo :)
*
* @author leizhimin 2009-7-3 21:33:42
*/

public class MyPickerUrl {
        private static final Log log = LogFactory.getLog(MyPickerUrl.class);

        /**
         * Gets the list of real video addresses for a sina video playback URL.
         *
         * @param playrul the sina video playback URL
         * @return the list of real video addresses
         */

        public static List<String> pickupUrl(String playrul) {
                List<String> result = new ArrayList<String>(1);
                if (playrul == null) {
                        log.error("The URL you entered is null; please enter it again before extracting the real video addresses!");
                        return result;
                }
                String _decurl;
                try {
                        _decurl = URLEncoder.encode(playrul, "UTF-8");
                } catch (UnsupportedEncodingException e) {
                        // Without an encoded URL the request below would be sent with kw=null, so stop here.
                        log.error("Failed to URL-encode " + playrul + " as UTF-8; cannot fetch the real video URLs.", e);
                        return result;
                }
                String url = "http://www.flvcd.com/parse.php?kw=" + _decurl + "&flag=&format=";
                String html = HttpTookit.doGet(url, null);
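                // The lazy group needs a terminator, otherwise it captures a single character.
                // This assumes each link on the result page is closed by </a>.
                Pattern p = Pattern.compile("target=\"_blank\" class=\"link\">(.+?)</a>");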
                Matcher m = p.matcher(html);
                while (m.find()) {
                        result.add(m.group(1));
                        System.out.println(m.group(1));
                }
                return result;
        }

        public static void main(String[] args) throws UnsupportedEncodingException {
                pickupUrl("http://movie.video.sina.com.cn/teleplay/ldqksj/001.html");
        }

}
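
To see what the two string-handling steps do without touching the network, here is a small offline sketch. The class name `PickerDemo` and the HTML fragment are made up for illustration; the real flvcd.com response may be structured differently, and the `</a>` terminator in the pattern is an assumption about that markup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PickerDemo {
    // Assumed link markup: <a ... target="_blank" class="link">URL</a>
    private static final Pattern LINK =
            Pattern.compile("target=\"_blank\" class=\"link\">(.+?)</a>");

    /** Builds the parse.php request URL for a given playback page URL. */
    public static String buildParseUrl(String playUrl) {
        try {
            return "http://www.flvcd.com/parse.php?kw="
                    + URLEncoder.encode(playUrl, "UTF-8") + "&flag=&format=";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen in practice.
            throw new IllegalStateException(e);
        }
    }

    /** Collects the link text of every anchor that matches LINK. */
    public static List<String> extractLinks(String html) {
        List<String> result = new ArrayList<String>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(buildParseUrl(
                "http://movie.video.sina.com.cn/teleplay/ldqksj/001.html"));

        // Hypothetical fragment standing in for the page returned by the parse service.
        String sample =
                "<a href=\"http://example.com/a.flv\" target=\"_blank\" class=\"link\">http://example.com/a.flv</a>"
              + "<a href=\"http://example.com/b.flv\" target=\"_blank\" class=\"link\">http://example.com/b.flv</a>";
        System.out.println(extractLinks(sample));
    }
}
```

The playback URL is percent-encoded before it goes into the `kw` parameter, so its `:` and `/` characters do not get mixed up with the request URL itself.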
 
The lavasoft.common.toolkit.HttpTookit class was given in an earlier blog post; look it up there!
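
If you do not have that earlier post at hand, a stand-in is easy to write with the JDK alone. The following is only a rough sketch and not the original HttpTookit: the class name `SimpleHttp` is hypothetical, treating the second argument as an optional query string is a guess, and decoding the body as UTF-8 is an assumption (the real page may use another charset).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SimpleHttp {
    /** Appends an optional query string to a URL (assumed meaning of the second argument). */
    public static String withQuery(String url, String query) {
        if (query == null || query.isEmpty()) {
            return url;
        }
        return url + (url.contains("?") ? "&" : "?") + query;
    }

    /** Sends an HTTP GET and returns the body decoded as UTF-8, or "" if the request fails. */
    public static String doGet(String url, String query) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(withQuery(url, query)).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(10000);
            conn.setReadTimeout(30000);
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toString("UTF-8");
            }
        } catch (IOException e) {
            // Returning an empty string keeps callers that run a regex over the result simple.
            return "";
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

With this in place, `SimpleHttp.doGet(url, null)` could be swapped in for `HttpTookit.doGet(url, null)` in the demo above.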
 
Output:
http://lz1.dhot.v.iask.com/f/1/6f72b9555b1de7989d56eb53f0ce218519100388.hlv
http://lz2.dhot.v.iask.com/f/1/0b60a9f8433b6094b16cc76e9588cc1819092103.hlv

Process finished with exit code 0
 
Heh, the real addresses are all out now, so anyone can download them. Nice, right?!
 
I then downloaded with wget on the command line; the window shows:
C:\>wget -c --tries=5 --timeout=60 http://lz6.dhot.v.iask.com/f/1/7db2921af8899f611150469660fd69f84726043.flv
--00:35:13--    http://lz6.dhot.v.iask.com/f/1/7db2921af8899f611150469660fd69f84726043.flv
                     => `7db2921af8899f611150469660fd69f84726043.flv'
Resolving lz6.dhot.v.iask.com... 202.100.78.116
Connecting to lz6.dhot.v.iask.com|202.100.78.116|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13606365 (13M) [video/x-flv]

45% [=================>                                            ] 6,175,040     88.2K/s    eta 87s
 
 
To save the download into a specific directory, just add the -P option. Note that options are case-sensitive. For example:
C:\> wget -c -P C:\aac --tries=5 --timeout=60 http://lz4.dhot.v.iask.com/f/1/daba5a0cf5749a729fff54d6020af7c67940685.flv
--19:32:29--    http://lz4.dhot.v.iask.com/f/1/daba5a0cf5749a729fff54d6020af7c67940685.flv
                     => `C:/aac/daba5a0cf5749a729fff54d6020af7c67940685.flv'
Resolving lz4.dhot.v.iask.com... 202.100.78.114
Connecting to lz4.dhot.v.iask.com|202.100.78.114|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13176851 (13M) [video/x-flv]

12% [====>                                                                     ] 1,609,344        134K/s    eta 89s
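
If you would rather stay in Java for the download step too, something along these lines could stand in for `wget -c -P`. It is only a sketch: the class name `SimpleDownloader` is made up, resuming relies on the server honoring the `Range` header, and there is no retry loop like wget's `--tries=5`.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

public class SimpleDownloader {
    /** Picks the local file from the last path segment of the URL, similar to wget -P dir. */
    public static File targetFile(String url, String dir) {
        String path = url;
        int q = path.indexOf('?');
        if (q >= 0) {
            path = path.substring(0, q); // drop any query string
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        return new File(dir, name);
    }

    /** Downloads url into dir, continuing a partial file when the server answers 206. */
    public static void download(String url, String dir) throws IOException {
        File target = targetFile(url, dir);
        File parent = target.getParentFile();
        if (parent != null) {
            parent.mkdirs();
        }
        long have = target.exists() ? target.length() : 0L;

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(60000);
        conn.setReadTimeout(60000);
        if (have > 0) {
            conn.setRequestProperty("Range", "bytes=" + have + "-");
        }
        try {
            int code = conn.getResponseCode();
            boolean resume = (code == HttpURLConnection.HTTP_PARTIAL);
            if (!resume && code != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + code);
            }
            try (InputStream in = conn.getInputStream();
                 RandomAccessFile out = new RandomAccessFile(target, "rw")) {
                if (resume) {
                    out.seek(have);   // append after the bytes already on disk
                } else {
                    out.setLength(0); // server sent the whole file, so start over
                }
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SimpleDownloader <url> [directory]
        download(args[0], args.length > 1 ? args[1] : ".");
    }
}
```

Running `java SimpleDownloader <url> C:\aac` would then play roughly the same role as the wget command above.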
 
 
 
 
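This code is just something I played with out of boredom. Please do not use it for any commercial activity! Otherwise, you bear the consequences yourself!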