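I installed a weather app on my phone and liked it so much that I wanted to build one myself!
The plan for now is to fetch the data from a website that provides weather information and extract what I need: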
http://www.weather.com.cn/
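The weather data this site provides is very good, and its format is well structured!
Capture tool: HttpWatcher
After installing HttpWatcher, start it from inside the browser, open http://www.weather.com.cn/ in IE, and pick a city; the tool then records the requests the page makes.
I won't go into the details here...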
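Analyzing the captured traffic shows that the weather page for a city has a URL like http://www.weather.com.cn/weather/101190101.shtml
Mine is Jiangsu, Nanjing, Nanjing (don't be surprised that Nanjing appears twice: Nanjing the city also contains Jiangning, Pukou, ...).
So what is 101190101? It must be the unique ID of the selected city.
In practice, we should also let users choose the city they care about.
So we first need to obtain the codes of the cities we are interested in.
According to the analysis, the ID of a selected area has three levels:
Level 1: the province
Level 2: the city
Level 3: the district
The structure is organized like the tree in a file explorer: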
China
|____Shaanxi
|    |____Xi'an
|         |____Xi'an
|         |____Huxian
|____Jiangsu
     |____Nanjing
          |____Nanjing
          |____Pukou
          |____Jiangning
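To make the three levels concrete, here is a small sketch that splits the example ID into its parts. The 5 + 2 + 2 digit layout is an assumption inferred from the example ID and from the province codes the site returns (Jiangsu is 10119); it may not hold for every region. The class name AreaId and its method names are made up for illustration.

```java
final class AreaId {
    // Assumed layout: 5 digits for the province, 2 for the city, 2 for the district.
    static String provinceId(String areaId) {
        return areaId.substring(0, 5);
    }

    // The city ID is the province ID followed by the two-digit city code.
    static String cityId(String areaId) {
        return areaId.substring(0, 7);
    }

    // URL pattern taken from the captured request for the city weather page.
    static String weatherPageUrl(String areaId) {
        return "http://www.weather.com.cn/weather/" + areaId + ".shtml";
    }

    public static void main(String[] args) {
        String nanjing = "101190101";
        System.out.println(provinceId(nanjing));     // 10119
        System.out.println(cityId(nanjing));         // 1011901
        System.out.println(weatherPageUrl(nanjing)); // http://www.weather.com.cn/weather/101190101.shtml
    }
}
```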
The province list is available at http://www.weather.com.cn/data/citydata/china.html
It can be fetched with URLConnection; HttpClient works just as well!
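The full listing calls a helper named WebHelper.getHtmlDoc whose source is not shown in this post. As a rough idea of what a URLConnection-based fetch could look like, here is a minimal sketch. The class name SimpleFetcher, the 5 second timeouts, and the UTF-8 decoding are all assumptions, not the original helper.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

final class SimpleFetcher {
    // Read the stream to the end and decode it as UTF-8 (assumed encoding).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Open the URL with URLConnection and return the response body as text.
    static String fetch(String address) {
        try {
            URLConnection conn = new URL(address).openConnection();
            conn.setConnectTimeout(5000); // arbitrary timeout, in milliseconds
            conn.setReadTimeout(5000);
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as SimpleFetcher.fetch("http://www.weather.com.cn/data/citydata/china.html") would then return the province data as one string, provided the site still serves that address.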
The response we get:
{"10101":"北京","10102":"上海","10103":"天津","10104":"重庆","10105":"黑龙江","10106":"吉林","10107":"辽宁","10108":"内蒙古","10109":"河北","10110":"山西","10111":"陕西","10112":"山东","10113":"新疆","10114":"西藏","10115":"青海","10116":"甘肃","10117":"宁夏","10118":"河南","10119":"江苏","10120":"湖北","10121":"浙江","10122":"安徽","10123":"福建","10124":"江西","10125":"湖南","10126":"贵州","10127":"四川","10128":"广东","10129":"云南","10130":"广西","10131":"海南","10132":"香港","10133":"澳门","10134":"台湾"}
Just extract the ID and name pairs with a regular expression.
Getting the cities and districts works the same way!
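The actual patterns live in MyRegex, which is not shown in this post. A minimal sketch of the extraction step might look like the following; the pattern and the class name PairExtractor are assumptions. Unlike the full listing, it runs the matcher over the whole response instead of splitting on commas first.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PairExtractor {
    // Hypothetical pattern: a quoted run of digits, a colon, then a quoted name.
    private static final Pattern PAIR = Pattern.compile("\"(\\d+)\":\"([^\"]+)\"");

    // Returns ID -> name in the order the pairs appear in the response.
    static Map<String, String> extract(String response) {
        Map<String, String> idToName = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(response);
        while (m.find()) {
            idToName.put(m.group(1), m.group(2));
        }
        return idToName;
    }

    public static void main(String[] args) {
        // First two entries of the province response (Beijing, Shanghai);
        // the names are written as escapes to keep this file ASCII-only.
        String sample = "{\"10101\":\"\u5317\u4eac\",\"10102\":\"\u4e0a\u6d77\"}";
        System.out.println(extract(sample).keySet()); // [10101, 10102]
    }
}
```

A LinkedHashMap is used here only so the entries come back in the same order as in the response; a plain HashMap works if order does not matter.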
That's all for today!
(*^__^*) Hehe...
Full code:
URLs holds the URLs we need to use.
MyRegex holds the regular expressions used to extract the information.
package WeatherReport.util;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ProvinceAndCity {
    // Fetch the province list and return a map of province name -> province ID
    public static Map<String, String> GetProvince() {
        Map<String, String> provinceMap = new HashMap<String, String>();
        // Fetch the raw province data
        String content = WebHelper.getHtmlDoc(URLs.ProvinceURL);
        String[] provinces = content.split(",");
        // Build the regular expression that extracts provinces
        Pattern proPattern = Pattern.compile(MyRegex.proRegex);
        for (String province : provinces) {
            Matcher proMatcher = proPattern.matcher(province);
            while (proMatcher.find()) {
                String id = proMatcher.group(1);
                String pro = proMatcher.group(2);
                provinceMap.put(pro, id);
            }
        }
        CityMapUtil.savePropertiesInHashMap("config/province.txt", provinceMap);
        return provinceMap;
    }
    // Fetch the cities of every province; the province list is needed first.
    // Returns a map of city ID -> city name.
    public static Map<String, String> GetCity() {
        // GetProvince() maps province name -> province ID, so iterate over the
        // values (the IDs), not the keys (the names), to build the city URLs
        Map<String, String> proMap = GetProvince();
        Iterator<String> iter = proMap.values().iterator();
        Map<String, String> cityMap = new HashMap<String, String>();
        // Build the regular expression that extracts cities
        Pattern cityPattern = Pattern.compile(MyRegex.cityRegex);
        while (iter.hasNext()) {
            // Per-province map of city name -> city ID, saved to its own file
            Map<String, String> tmpMap = new HashMap<String, String>();
            String proID = iter.next();
            String cityURL = URLs.CityURL + proID + ".html";
            String content = WebHelper.getHtmlDoc(cityURL);
            String[] cities = content.split(",");
            for (String city : cities) {
                Matcher cityMatcher = cityPattern.matcher(city);
                while (cityMatcher.find()) {
                    String id = cityMatcher.group(1);
                    String cityName = cityMatcher.group(2);
                    cityMap.put(proID + id, cityName);
                    tmpMap.put(cityName, proID + id);
                }
            }
            CityMapUtil.savePropertiesInHashMap("config/" + proID + ".txt",
                    tmpMap);
        }
        return cityMap;
    }
    // Fetch the cities of the given province.
    // Returns a map of city name -> city ID.
    public static Map<String, String> GetCity(String provinceID) {
        Map<String, String> cityMap = new HashMap<String, String>();
        String cityURL = URLs.CityURL + provinceID + ".html";
        String content = WebHelper.getHtmlDoc(cityURL);
        String[] cities = content.split(",");
        // Build the regular expression that extracts cities
        Pattern cityPattern = Pattern.compile(MyRegex.cityRegex);
        for (String city : cities) {
            Matcher cityMatcher = cityPattern.matcher(city);
            while (cityMatcher.find()) {
                String id = cityMatcher.group(1);
                String cityName = cityMatcher.group(2);
                cityMap.put(cityName, provinceID + id);
            }
        }
        return cityMap;
    }
    // Fetch the districts of every city; the city list is needed first.
    // Returns a map of district ID -> district name.
    public static Map<String, String> GetState() {
        // GetCity() maps city ID -> city name, so its keys are the city IDs
        Map<String, String> cityMap = GetCity();
        Set<String> citySet = cityMap.keySet();
        Iterator<String> iter = citySet.iterator();
        Map<String, String> stateMap = new HashMap<String, String>();
        // Build the regular expression that extracts districts
        Pattern statePattern = Pattern.compile(MyRegex.stateRegex);
        while (iter.hasNext()) {
            // Per-city map of district name -> district ID, saved to its own file
            Map<String, String> tmpMap = new HashMap<String, String>();
            String cityID = iter.next();
            String stateURL = URLs.StateURL + cityID + ".html";
            String content = WebHelper.getHtmlDoc(stateURL);
            String[] states = content.split(",");
            for (String state : states) {
                Matcher stateMatcher = statePattern.matcher(state);
                while (stateMatcher.find()) {
                    String id = stateMatcher.group(1);
                    String name = stateMatcher.group(2);
                    stateMap.put(cityID + id, name);
                    tmpMap.put(name, cityID + id);
                }
            }
            CityMapUtil.savePropertiesInHashMap("config/" + cityID + ".txt",
                    tmpMap);
        }
        return stateMap;
    }
    // Fetch the districts of the given city ID.
    // Returns a map of district name -> district ID.
    public static Map<String, String> GetState(String cityID) {
        Map<String, String> stateMap = new HashMap<String, String>();
        String stateURL = URLs.StateURL + cityID + ".html";
        String content = WebHelper.getHtmlDoc(stateURL);
        String[] states = content.split(",");
        // Build the regular expression that extracts districts
        Pattern statePattern = Pattern.compile(MyRegex.stateRegex);
        for (String state : states) {
            Matcher stateMatcher = statePattern.matcher(state);
            while (stateMatcher.find()) {
                String id = stateMatcher.group(1);
                String name = stateMatcher.group(2);
                stateMap.put(name, cityID + id);
            }
        }
        return stateMap;
    }
    public static void main(String[] args) {
        // 10130 is the ID of Guangxi in the province list above
        System.out.println(GetCity("10130"));
        // 1013014 is province ID 10130 followed by a two-digit city code
        System.out.println(GetState("1013014"));
    }
}
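The listing also calls CityMapUtil.savePropertiesInHashMap, which is not included in this post. One plausible way to write such a helper is with java.util.Properties, as in the sketch below. The class name MapFileSketch, the load method, and the choice of the Properties file format are assumptions, not the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;

final class MapFileSketch {
    // Write the map to a .properties-style file, creating parent folders if needed.
    static void save(String file, Map<String, String> map) {
        Properties props = new Properties();
        props.putAll(map);
        Path path = Paths.get(file);
        try {
            Path parent = path.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            try (OutputStream out = Files.newOutputStream(path)) {
                // Non-Latin characters are stored as Unicode escapes by store()
                props.store(out, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the file back so a later run can skip the network requests.
    static Properties load(String file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(Paths.get(file))) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

With a helper like this, the saved config/province.txt could be loaded on startup instead of fetching the province list again each time.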