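For a recent project I had to scrape price information from hotel booking sites such as eLong and Ctrip. Neither site makes it easy: there were repeated AJAX form submissions, encoded parameters, site-specific city and hotel codes to look up, and so on. After clearing all those hurdles I finally got the data back, only to hit another problem: the data comes back as JSON, so it has to be converted into Java objects.
First, get familiar with the format of the JSON data: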
[{"CityType":"hotel","TabList":[{"TabId":"1","Name":"\u70ED\u95E8","NameEn":"Hot","CityList":[{"ProvinceId":null,"CityId":"0101","CityCode":"0101","CityNameCn":"\u5317\u4EAC","CityNameEn":"Beijing","CityThreeSign":"","CityType":"hotel","OldEnglishName":"peking"},{"ProvinceId":null,"CityId":"0201","CityCode":"0201","CityNameCn":"\u4E0A\u6D77","CityNameEn":"Shanghai","CityThreeSign":"","CityType":"hotel","OldEnglishName":""},{"ProvinceId":null,"CityId":"2001","CityCode":"2001","CityNameCn":"\u5E7F\u5DDE","CityNameEn":"Guangzhou","CityThreeSign":"","CityType":"hotel","OldEnglishName":"canton"}]}]}];
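jsonViewer is a handy tool for inspecting this kind of data. HTTPAnalysis has a JSON viewer built in as well, but the standalone jsonViewer is the more lightweight of the two.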
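json-lib can convert between JSON and Java objects, and its documentation is enough for basic usage. There are also some ready-made wrapper classes online, but I recommend reading the json-lib source code directly to understand how it uses reflection and temporary objects to parse JSON data and wrap it into Java objects.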
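To make the reflection idea more concrete, here is a minimal sketch of how a bean can be filled from name/value pairs by looking up setters at runtime. It is not json-lib's actual code; `ReflectionBeanSketch`, `DemoCity` and `populate` are hypothetical names invented for this illustration, and the map stands in for an already parsed JSON object.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; json-lib's real implementation is more involved.
final class ReflectionBeanSketch {

    /** Tiny bean used only for this illustration. */
    static class DemoCity {
        private String cityCode;
        private String cityNameEn;

        public String getCityCode() { return cityCode; }
        public void setCityCode(String cityCode) { this.cityCode = cityCode; }
        public String getCityNameEn() { return cityNameEn; }
        public void setCityNameEn(String cityNameEn) { this.cityNameEn = cityNameEn; }
    }

    /** Creates a bean through its no-argument constructor and calls every matching setter. */
    static <T> T populate(Class<T> type, Map<String, ?> values) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Method method : type.getMethods()) {
                String name = method.getName();
                if (name.length() > 3 && name.startsWith("set") && method.getParameterCount() == 1) {
                    // setCityCode -> cityCode: the property name starts with a lowercase letter
                    String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    if (values.containsKey(property)) {
                        method.invoke(bean, values.get(property));
                    }
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot populate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<>();
        values.put("cityCode", "0101");
        values.put("cityNameEn", "Beijing");
        DemoCity city = populate(DemoCity.class, values);
        System.out.println(city.getCityCode() + " " + city.getCityNameEn());
    }
}
```

A lookup of this kind only works when the class can be instantiated without arguments and when the keys line up with the setter names, which is why a mapper built on reflection tends to be strict about both.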
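This JSON data is a nesting of lists. To parse the whole document you need the method below, together with Tab.class and City.class built to mirror the JSON data. The field names must match the JSON keys (camelCase with a lowercase first letter), and each class must provide a default constructor. With that in place the JSON data becomes one complete object and you can do whatever you like with it.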
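    // Element types of the collection properties, keyed by property name
    HashMap<String, Class> map = new HashMap<String, Class>();
    map.put("tabList", Tab.class);
    map.put("cityList", City.class);

    /**
     * Builds a list of Java objects from a JSON array whose elements
     * themselves contain collection properties.
     *
     * @param jsonString the JSON array as text
     * @param clazz      the class of each top-level element
     * @param map        types of the collection properties
     *                   (key: property name, value: element class),
     *                   e.g. ("beansList" : Bean.class)
     * @return the list of converted objects
     */
    public static List getDTOList(String jsonString, Class clazz, Map map) {
        setDataFormat2JAVA(); // helper that is not shown in this post
        JSONArray array = JSONArray.fromObject(jsonString);
        List list = new ArrayList();
        for (Iterator iter = array.iterator(); iter.hasNext();) {
            JSONObject jsonObject = (JSONObject) iter.next();
            list.add(JSONObject.toBean(jsonObject, clazz, map));
        }
        return list;
    }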
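The post does not show the Tab and City classes, so the following is only a hypothetical sketch of what they might look like, inferred from the keys of the sample data: every field is the JSON key with its first letter lowercased, and each class keeps a public no-argument constructor. Treating every value as a String is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical beans mirroring the sample JSON. In a real project each would
// normally be a public class in its own file.
class City {
    private String provinceId;
    private String cityId;
    private String cityCode;
    private String cityNameCn;
    private String cityNameEn;
    private String cityThreeSign;
    private String cityType;
    private String oldEnglishName;

    public City() { }

    public String getProvinceId() { return provinceId; }
    public void setProvinceId(String provinceId) { this.provinceId = provinceId; }
    public String getCityId() { return cityId; }
    public void setCityId(String cityId) { this.cityId = cityId; }
    public String getCityCode() { return cityCode; }
    public void setCityCode(String cityCode) { this.cityCode = cityCode; }
    public String getCityNameCn() { return cityNameCn; }
    public void setCityNameCn(String cityNameCn) { this.cityNameCn = cityNameCn; }
    public String getCityNameEn() { return cityNameEn; }
    public void setCityNameEn(String cityNameEn) { this.cityNameEn = cityNameEn; }
    public String getCityThreeSign() { return cityThreeSign; }
    public void setCityThreeSign(String cityThreeSign) { this.cityThreeSign = cityThreeSign; }
    public String getCityType() { return cityType; }
    public void setCityType(String cityType) { this.cityType = cityType; }
    public String getOldEnglishName() { return oldEnglishName; }
    public void setOldEnglishName(String oldEnglishName) { this.oldEnglishName = oldEnglishName; }
}

class Tab {
    private String tabId;
    private String name;
    private String nameEn;
    // The element type of this list is what the map entry "cityList" -> City.class describes
    private List<City> cityList = new ArrayList<City>();

    public Tab() { }

    public String getTabId() { return tabId; }
    public void setTabId(String tabId) { this.tabId = tabId; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getNameEn() { return nameEn; }
    public void setNameEn(String nameEn) { this.nameEn = nameEn; }
    public List<City> getCityList() { return cityList; }
    public void setCityList(List<City> cityList) { this.cityList = cityList; }
}
```

Because the sample keys start with an uppercase letter (TabList, CityList), they would have to be lowercased before they can line up with fields named like these.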
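My requirement, however, was only to pull specific objects out of the JSON data, such as the cityList above. I have no need to construct and fetch objects like tabList, and in actual use I found ...
So I wrote a class that reads and parses just a fragment of the JSON data directly. With this class plus object mapping, there is very little that cannot be parsed: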
    package com.gxy.weixin.util;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import net.sf.json.JSONArray;
    import net.sf.json.JSONObject;

    public class JSonStrUtils {

        /**
         * Auto-completes missing closing brackets in JSON text.
         * Brackets inside string values are not treated specially.
         */
        public static String autoComplete(String targetJson) {
            LinkedList<Character> stack = new LinkedList<Character>();
            StringBuilder result = new StringBuilder();
            for (char c : targetJson.toCharArray()) {
                if (c == '[' || c == '{') {
                    stack.addFirst(c); // push the opener
                } else if (c == ']' || c == '}') {
                    char expected = (c == ']') ? '[' : '{';
                    // Close every inner opener that was left unclosed before this closer
                    while (!stack.isEmpty() && stack.peekFirst() != expected) {
                        result.append(stack.pollFirst() == '[' ? ']' : '}');
                    }
                    stack.pollFirst(); // pop the matching opener (no-op on an empty stack)
                }
                result.append(c);
            }
            for (char c : stack) {
                System.out.println("left in stack:" + c);
            }
            return result.toString();
        }

        /**
         * Handles JSON keys that do not start with a lowercase letter:
         * matches each key with a regex and lowercases its first character.
         */
        public static String dealWithFirstChar(String jsonInput) {
            Matcher m = Pattern.compile("\"(\\w+)\":").matcher(jsonInput);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                String name = m.group(1);
                String fixed = Character.toLowerCase(name.charAt(0)) + name.substring(1);
                m.appendReplacement(out, Matcher.quoteReplacement("\"" + fixed + "\":"));
            }
            m.appendTail(out);
            return out.toString();
        }

        /**
         * Converts \\uXXXX escape sequences into the characters they stand for.
         */
        public static String UnicodeToString(String str) {
            Matcher matcher = Pattern.compile("\\\\u(\\p{XDigit}{4})").matcher(str);
            StringBuffer out = new StringBuffer();
            while (matcher.find()) {
                char ch = (char) Integer.parseInt(matcher.group(1), 16);
                matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(ch)));
            }
            matcher.appendTail(out);
            return out.toString();
        }

        /**
         * Converts every character into a \\uXXXX escape sequence.
         */
        public static String toUNICODE(String s) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < s.length(); i++) {
                // Always pad to four hex digits so that each escape is well formed
                sb.append(String.format("\\u%04X", (int) s.charAt(i)));
            }
            return sb.toString();
        }

        /**
         * Reads a JSON file from the classpath and returns it as a string.
         * Assumes the file is UTF-8 encoded (the original code relied on
         * the platform default charset).
         */
        public static String readJSonFile(String fileName) {
            StringBuilder jsonStr = new StringBuilder();
            InputStream in = JSonStrUtils.class.getResourceAsStream(fileName);
            if (in == null) {
                System.out.println("json file not found :" + fileName);
                return "";
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    jsonStr.append(line);
                }
            } catch (IOException e) {
                e.printStackTrace();
                System.out.println("read json file failed :" + fileName);
            }
            return jsonStr.toString();
        }

        /**
         * Collects the values of the named property in the given JSONObject,
         * optionally descending into nested objects and arrays.
         */
        public static List<String> findTargetProperty(JSONObject object,
                String propertyName, boolean isRecursive) {
            List<String> values = new ArrayList<String>();
            for (Iterator entries = object.names().iterator(); entries.hasNext();) {
                String name = (String) entries.next();
                Object value = object.get(name);
                if (name.equals(propertyName)) { // found the target property
                    values.add(value.toString());
                } else if (isRecursive) {
                    if (value instanceof JSONObject) {
                        values.addAll(findTargetProperty((JSONObject) value,
                                propertyName, isRecursive));
                    } else if (value instanceof JSONArray) {
                        for (Iterator iter = ((JSONArray) value).iterator(); iter.hasNext();) {
                            Object element = iter.next();
                            if (element instanceof JSONObject) { // skip strings, numbers, nulls
                                values.addAll(findTargetProperty((JSONObject) element,
                                        propertyName, isRecursive));
                            }
                        }
                    }
                }
            }
            return values;
        }

        /**
         * Finds the child JSON arrays with the given name inside a JSONArray.
         */
        public static List<JSONArray> findTargetJSonArray(JSONArray array,
                String targetName) {
            List<JSONArray> arrays = new ArrayList<JSONArray>();
            for (Iterator iter = array.iterator(); iter.hasNext();) {
                Object element = iter.next();
                if (!(element instanceof JSONObject)) {
                    continue; // only JSON objects can carry named arrays
                }
                JSONObject jsonObject = (JSONObject) element;
                if (jsonObject.isNullObject()) {
                    continue;
                }
                for (Iterator entries = jsonObject.names().iterator(); entries.hasNext();) {
                    String name = (String) entries.next();
                    Object value = jsonObject.get(name);
                    if (value instanceof JSONArray) { // the property is an array
                        if (name.equals(targetName)) { // is it the target array?
                            arrays.add((JSONArray) value);
                        }
                        // search recursively
                        arrays.addAll(findTargetJSonArray((JSONArray) value, targetName));
                    }
                }
            }
            return arrays;
        }
    }
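The class above relies on json-lib to walk the parsed tree. The same idea of pulling out just one named array can also be sketched without any dependency, by scanning the raw text for the key and matching brackets. This is a different technique from the one used above, not a replacement for it. `JsonFragmentSketch` and `firstArrayFor` are hypothetical names, and the sketch assumes the key is written as "key": with no whitespace before the colon, as in the sample data.

```java
// Hypothetical, dependency-free sketch: extract the raw text of one named array.
final class JsonFragmentSketch {

    /**
     * Returns the raw text of the first array stored under the given key,
     * or null when the key is absent or its value is not an array.
     */
    static String firstArrayFor(String json, String key) {
        String marker = "\"" + key + "\":";
        int start = json.indexOf(marker);
        if (start < 0) {
            return null;
        }
        int open = start + marker.length();
        while (open < json.length() && Character.isWhitespace(json.charAt(open))) {
            open++; // tolerate whitespace after the colon
        }
        if (open >= json.length() || json.charAt(open) != '[') {
            return null;
        }
        int depth = 0;
        boolean inString = false;
        for (int i = open; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
                if (depth == 0) {
                    return json.substring(open, i + 1);
                }
            }
        }
        return null; // the array was never closed
    }

    public static void main(String[] args) {
        String json = "[{\"TabList\":[{\"Name\":\"Hot\",\"CityList\":"
                + "[{\"CityId\":\"0101\"},{\"CityId\":\"0201\"}]}]}]";
        System.out.println(firstArrayFor(json, "CityList"));
    }
}
```

The extracted text is itself a valid JSON array, so it could then be handed to a JSON library and mapped onto City objects without building Tab or any outer wrapper.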