When converting data, expect every kind of dirty data to show up; do not assume the input will look the way you want.
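1. When writing files, watch out for tab characters and line breaks inside incoming fields
When reading a file, we call readline and split each line on \t to get the fields.
When writing a file, we use \n to end each record.
Problems arise in a case like the one below: if a field itself contains a tab or a line break (here, refer begins with \r\n), the record is split in the wrong place and the columns become misaligned during conversion.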
{"code":"CUXZJS","refer":"\r\nDV8HFI","referPid":null,"people":[],"iosPushToken":""}
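One way to guard against this is to clean each field before it is written. The following is only a minimal sketch: the class and method names (TsvWriterSketch, sanitizeField, toTsvLine) are hypothetical, and it assumes that replacing a tab, carriage return, or line feed with a single space is acceptable for your data. That replacement is lossy; escaping the characters instead would preserve them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the original post.
final class TsvWriterSketch {

    // Replace characters that would break a tab-separated, one-record-per-line layout.
    static String sanitizeField(String field) {
        if (field == null) {
            return ""; // Assumption: null is written as an empty field.
        }
        // Assumption: each run of tab / CR / LF characters becomes one space.
        return field.replaceAll("[\\t\\r\\n]+", " ");
    }

    // Join the sanitized fields with tabs; the caller appends "\n" when writing.
    static String toTsvLine(List<String> fields) {
        return fields.stream()
                .map(TsvWriterSketch::sanitizeField)
                .collect(Collectors.joining("\t"));
    }

    public static void main(String[] args) {
        // Values taken from the sample record above: code, refer, referPid, iosPushToken.
        List<String> record = Arrays.asList("CUXZJS", "\r\nDV8HFI", null, "");
        String line = toTsvLine(record);
        // The line still splits back into exactly four fields.
        System.out.println(line.split("\t", -1).length); // prints 4
    }
}
```

With this in place the \r\n at the start of refer no longer starts a new line in the output file, so the reader sees one record with the expected number of columns.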
2. If you use Java, take care when calling split
Suppose a record looks like this:
A\tB\tC\t\t Note that this is five fields, A, B, C, D and E, but D and E arrived as empty strings.
String a = "A\tB\tC\t\t";
String[] arrStrings = a.split("\t");
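A plain split like this does not keep every field: trailing empty strings are dropped, so the array holds only three elements, [A, B, C].
To keep all the fields, use split(regex, -1):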
String a = "A\tB\tC\t\t";
String[] arrStrings = a.split("\t", -1);
This time the array is [A, B, C, , ], with all five elements.
The Javadoc for the method explains why:
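Since dirty data can still produce the wrong number of columns, it may help to check the field count right after splitting. The sketch below is one possible approach, not a definitive implementation: TsvLineParserSketch and parseLine are hypothetical names, and the expected column count is assumed to be known in advance.

```java
// Hypothetical helper for illustration; not part of the original post.
final class TsvLineParserSketch {

    // Split on tabs, keeping trailing empty fields, and fail fast on a bad column count.
    static String[] parseLine(String line, int expectedFields) {
        String[] fields = line.split("\t", -1); // -1 keeps trailing empty strings
        if (fields.length != expectedFields) {
            throw new IllegalArgumentException(
                    "Expected " + expectedFields + " fields but found " + fields.length);
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = parseLine("A\tB\tC\t\t", 5);
        System.out.println(fields.length);       // prints 5
        System.out.println(fields[3].isEmpty()); // prints true
    }
}
```

Throwing on a mismatch surfaces a misaligned record immediately instead of letting shifted columns flow into later steps; logging and skipping the line is an equally reasonable choice.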
public String[] split(String regex, int limit)
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
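The three cases described in the documentation can be compared side by side on the same five-field record. This is a small illustrative sketch; the class name SplitLimitDemo is hypothetical.

```java
// Hypothetical demo class for illustration; not part of the original post.
final class SplitLimitDemo {

    public static void main(String[] args) {
        String record = "A\tB\tC\t\t";

        // Positive limit: the pattern is applied at most limit - 1 times,
        // and the last element holds the rest of the input, tabs included.
        String[] positive = record.split("\t", 2);
        System.out.println(positive.length); // prints 2

        // Zero limit (same as split("\t")): trailing empty strings are discarded.
        String[] zero = record.split("\t", 0);
        System.out.println(zero.length); // prints 3

        // Negative limit: every field is kept, including trailing empty strings.
        String[] negative = record.split("\t", -1);
        System.out.println(negative.length); // prints 5
    }
}
```

Calling split(regex) with one argument behaves like a limit of zero, which is why the trailing empty fields disappear unless a negative limit is passed.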
<To be continued......>