Spring Boot + Python crawler + scheduled exchange-rate scraping + MySQL

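    Motivation: My internship has been quiet these past few days. My graduation project is a mini program, so I started looking into APIs and wondered why I couldn't just scrape the data on my own server. A lot of searching followed.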

    Stack: Spring Boot + Python crawler + scheduled exchange-rate scraping + MySQL

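    Approach: A Spring Boot scheduled task calls the Python crawler once a minute; Java then reads the crawler's command-line output and saves it to the database. (Most applications do not need rates that fresh, so updating once an hour is enough. That lowers the risk of the site banning your IP and saves the site some server resources. Note that the scheduled-task snippet below is configured with fixedRate = 10000, which fires every 10 seconds.)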
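
The hourly-update idea can also be enforced as a guard in front of the crawl itself. Here is a minimal sketch; `should_refresh` and `MIN_INTERVAL_SECONDS` are hypothetical names of mine, not part of the project, and the one-hour value is just the figure suggested above.

```python
import time

# Assumed policy: hit the site at most once per hour.
MIN_INTERVAL_SECONDS = 60 * 60


def should_refresh(last_fetch, now=None, min_interval=MIN_INTERVAL_SECONDS):
    """Return True when enough time has passed since the last crawl.

    last_fetch is a Unix timestamp, or None if nothing was fetched yet.
    """
    if now is None:
        now = time.time()
    if last_fetch is None:
        return True
    return now - last_fetch >= min_interval
```

A caller would record the timestamp of each successful crawl and skip the request whenever `should_refresh` returns False.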

    Result: The endpoint /api/rate/{id} returns the latest rate for the given currency. For example, GET .../api/rate/1331 returns the rate for the Korean won.


Python code: This part is fairly simple. It scrapes http://srh.bankofchina.com/search/whpj/search.jsp directly, POSTing a different id (the pjname form field) for each currency.

from bs4 import BeautifulSoup
from urllib import parse, request
import sys

url = "http://srh.bankofchina.com/search/whpj/search.jsp"


def func(pjname):
    # Form fields posted to the search page; pjname is the currency id, e.g. '1316'.
    form_data = {'erectDate': '', 'nothing': '', 'pjname': pjname}
    data = parse.urlencode(form_data).encode('utf-8')
    # Passing data makes urlopen send a POST request.
    html = request.urlopen(url, data).read()
    soup = BeautifulSoup(html, 'html.parser')

    try:
        div = soup.find('div', attrs={'class': 'BOC_main publish'})
        # tr[1] is the first row after the table's leading row.
        td = div.find('table').find_all('tr')[1].find_all('td')
        # Output: currency name, time, rate (columns 0, 7 and 3).
        print(td[0].get_text(), td[7].get_text(), td[3].get_text())
    except (AttributeError, IndexError):
        # Missing div/table or too few rows/columns: report "no data".
        print(-1)


if __name__ == '__main__':
    func(sys.argv[1])
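
To see exactly what gets posted for a given id without touching the network, the request can be built and inspected first. This is only a sketch: `build_request` is a hypothetical helper of mine, while the URL and form field names are the ones used in the crawler above.

```python
from urllib import parse, request

URL = "http://srh.bankofchina.com/search/whpj/search.jsp"


def build_request(pjname):
    """Build (but do not send) the POST request for one currency id."""
    form = {"erectDate": "", "nothing": "", "pjname": str(pjname)}
    body = parse.urlencode(form).encode("utf-8")
    # A Request that carries a body is sent as POST rather than GET.
    return request.Request(URL, data=body)


req = build_request(1331)
print(req.get_method())  # POST
print(req.data)          # b'erectDate=&nothing=&pjname=1331'
```

Passing `req` to `request.urlopen` would then perform the same call as the crawler.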

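Java approach: The crawler is invoked from the command line as python a.py id (the path in the snippet below points at b.py). The core call looks like this: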

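import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.springframework.stereotype.Component;

/**
 * @author chengxumin
 * @date 2018/7/18
 */
@Component
public class Utils {
    /** Runs the crawler for one currency id; returns its output, or "" if there is no data. */
    public String getTimeById(int id) throws Exception {
        String command = "python " + "E:\\sbootsecurity\\SpringBoot-Learning\\timedemo\\src\\main\\java\\com\\nickc\\timedemo\\controller\\b.py " + id;
        Process process = Runtime.getRuntime().exec(command);
        StringBuilder result = new StringBuilder();
        // GBK is the default console encoding on Chinese-locale Windows; adjust if yours differs.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(process.getInputStream(), "GBK"))) {
            String line;
            while ((line = in.readLine()) != null) {
                result.append(line);
            }
        }
        process.waitFor();
        System.out.println(result);
        // No output at all (the script failed) or "-1" (no data) both map to "".
        if (result.length() == 0 || result.charAt(0) == '-') {
            return "";
        }
        return result.toString();
    }
}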
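
The command-line contract (one line of output on success, -1 on failure) can be tried out without Spring Boot. The sketch below mirrors the Java helper in Python under stated assumptions: `FAKE_CRAWLER` is a made-up stand-in for the real script, and the sample line it prints is invented for illustration.

```python
import subprocess
import sys

# Stand-in for the real crawler: prints a fixed line for id 1331, else -1.
# The sample values are made up for illustration.
FAKE_CRAWLER = (
    "import sys\n"
    "print('KRW 2018.07.18 10:30:00 0.5968' if sys.argv[1] == '1331' else -1)\n"
)


def run_crawler(currency_id):
    """Run the child process and return its stdout, or '' when there is no data."""
    completed = subprocess.run(
        [sys.executable, "-c", FAKE_CRAWLER, str(currency_id)],
        capture_output=True,
        text=True,
        timeout=30,
    )
    output = completed.stdout.strip()
    # Same convention as the Java helper: empty output or "-1" means "no data".
    if completed.returncode != 0 or not output or output.startswith("-"):
        return ""
    return output
```

Passing the arguments as a list avoids building one command string by hand, and the timeout keeps a hung crawler from blocking the caller forever.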

Spring Boot scheduled task snippet:

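    // fixedRate is in milliseconds: 10000 fires every 10 seconds; 3600000 would be hourly.
    @Scheduled(fixedRate = 10000)
    public void SpiderAndSave() throws Exception {
        for (int i = 1314; i < 1340; i++) {
            String a = utils.getTimeById(i);
            // Output layout: "name date time rate"; the timestamp spans two tokens.
            String[] args1 = a.split(" ");
            // Skip empty, failed ("-1") or malformed output instead of throwing.
            if (args1.length >= 4 && a.charAt(0) != '-') {
                Rate rate = new Rate();
                rate.setId((long) i);
                rate.setName(args1[0]);
                rate.setTime(args1[1] + " " + args1[2]);
                rate.setRate(Double.parseDouble(args1[3]));
                rateService.save(rate);
            }
        }
    }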
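
The splitting step is where malformed output is most likely to cause trouble, so here is a small sketch of the same parsing with explicit validation. `parse_rate_line` is a hypothetical helper, and the sample line in the usage note is invented; the four-token layout is the one the task above relies on.

```python
def parse_rate_line(line):
    """Split 'name date time rate' into (name, timestamp, rate), or return None."""
    parts = line.split()
    if len(parts) != 4:
        return None  # covers "", "-1" and any unexpected layout
    name, date, clock, rate_text = parts
    try:
        rate = float(rate_text)
    except ValueError:
        return None  # the last token was not a number
    # Rejoin the two timestamp tokens, as the Java code does with args1[1] and args1[2].
    return name, f"{date} {clock}", rate
```

For example, `parse_rate_line("KRW 2018.07.18 10:30:00 0.5968")` yields the name, the rejoined timestamp and the rate as a float, while `parse_rate_line("-1")` yields None.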

API controller snippet:

    @RequestMapping(value = "/{tid}", method = RequestMethod.GET)
    public Rate getTimeById(@PathVariable long tid) throws Exception {
        // Passes the service result through unchanged, including null when nothing is found.
        return rateService.getLatestById(tid);
    }
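
The save-then-read-latest flow can be tried out locally without MySQL. The sketch below uses an in-memory SQLite table as a stand-in and assumes one row per currency id, so each save replaces the previous one (which is what calling setId with the currency id suggests, though I have not confirmed the actual schema). The table layout, function names and sample values are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute(
    "CREATE TABLE rate (id INTEGER PRIMARY KEY, name TEXT, time TEXT, rate REAL)"
)


def save(rate_id, name, time, rate):
    """Insert the row, replacing any earlier row with the same id."""
    conn.execute(
        "INSERT OR REPLACE INTO rate (id, name, time, rate) VALUES (?, ?, ?, ?)",
        (rate_id, name, time, rate),
    )


def get_latest_by_id(rate_id):
    """Return (id, name, time, rate) for the currency, or None if absent."""
    cursor = conn.execute(
        "SELECT id, name, time, rate FROM rate WHERE id = ?", (rate_id,)
    )
    return cursor.fetchone()


# Two consecutive crawls of the same currency: only the newer row survives.
save(1331, "KRW", "2018.07.18 10:30:00", 0.5968)
save(1331, "KRW", "2018.07.18 10:31:00", 0.5970)
```

If the real table keeps a history of quotes instead, the lookup would need to order by time and take the newest row.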

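Source code: https://github.com/nickcxm/timedemo My code is not very tidy, and there are surely better ways to meet this requirement. If you spot anything poorly written or know a better approach, please tell me in the comments. Thanks, everyone.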
