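Recently I had a Spark job that manipulates JSON in Scala. The flow is roughly: read records from HBase, parse each record as JSON, remove one redundant key, serialize the result back to a JSON string, and write it to HDFS.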

The JSON looks roughly like this: {"@type":{"version":"1.0.2","name":"application-content","data":[]},"key-to-remove":[{"blah":"more blah"}],"@value":[]}

The logic is simple. The HBase reading part is omitted here; the JSON-related code, which uses fastjson to parse the JSON, is as follows:

package dev.json

import com.alibaba.fastjson.JSON

object Course1 {
  def main(args: Array[String]): Unit = {
    val key = "key-to-remove"
    val s =
      """
        |{"@type":{"version":"1.0.2","name":"application-content","data":[]},"key-to-remove":[{"blah":"more blah"}],"@value":[]}
        |""".stripMargin
    val obj = JSON.parseObject(s)
    obj.remove(key)
    val out = obj.toJSONString
    println(out)
  }
}

Running it fails right away with an exception:

Exception in thread "main" com.alibaba.fastjson.JSONException: expect ':' at 2, actual "
    at com.alibaba.fastjson.parser.DefaultJSONParser.parseObject(DefaultJSONParser.java:296)
    at com.alibaba.fastjson.parser.DefaultJSONParser.parse(DefaultJSONParser.java:1401)
    at com.alibaba.fastjson.parser.DefaultJSONParser.parse(DefaultJSONParser.java:1367)
    at com.alibaba.fastjson.JSON.parse(JSON.java:183)
    at com.alibaba.fastjson.JSON.parse(JSON.java:193)
    at com.alibaba.fastjson.JSON.parse(JSON.java:149)
    at com.alibaba.fastjson.JSON.parseObject(JSON.java:254)
    at dev.json.Course1$.main(Course1.scala:12)
    at dev.json.Course1.main(Course1.scala)

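I had never hit this kind of problem with fastjson before. After a round of searching on Baidu, I learned that the cause is the @type key in the payload. fastjson treats @type as a special key for its autoType feature, which deserializes into the class named by the key. Because @type can be abused to inject unsafe operations, Alibaba restricts it for security reasons, and the related failures are often reported as "autotype is not supported". After checking some references I finally got it fixed: add the package to the autoType whitelist and pass a parser feature that turns off the special handling of @type, so that it is read as an ordinary key. The code is as follows: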

package dev.json

import com.alibaba.fastjson.JSON
import com.alibaba.fastjson.parser.{Feature, ParserConfig}

object Course1 {
  // Add the package to the autoType whitelist
  ParserConfig.getGlobalInstance.addAccept("dev.json")

  def main(args: Array[String]): Unit = {
    val key = "key-to-remove"
    val s =
      """
        |{"@type":{"version":"1.0.2","name":"application-content","data":[]},"key-to-remove":[{"blah":"more blah"}],"@value":[]}
        |""".stripMargin
    // Disable special-key detection so that @type is parsed as a plain key
    val obj = JSON.parseObject(s, Feature.DisableSpecialKeyDetect)
    obj.remove(key)
    val out = obj.toJSONString
    println(out)
  }
}

With that change the input parses correctly, and the output is:

{"@value":[],"@type":{"data":[],"name":"application-content","version":"1.0.2"}}
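Since the Spark job applies the same three steps to every record, they can be pulled into a small function. The following is only a sketch: the names RemoveKeyFastjson and removeKey are made up for illustration, and it assumes that Feature.DisableSpecialKeyDetect by itself is enough for this payload, so the whitelist call from the listing above is left out. If your fastjson version behaves differently, keep the whitelist call.

```scala
import com.alibaba.fastjson.JSON
import com.alibaba.fastjson.parser.Feature

// Hypothetical helper, not part of the original job.
object RemoveKeyFastjson {

  // Parse one JSON record, drop `key` from the top-level object,
  // and serialize the result back to a string.
  def removeKey(json: String, key: String): String = {
    // Assumption: disabling special-key detection makes fastjson 1.x
    // read "@type" as an ordinary key instead of an autoType marker.
    val obj = JSON.parseObject(json, Feature.DisableSpecialKeyDetect)
    obj.remove(key)
    obj.toJSONString
  }
}
```

In the job itself, a function like this would typically be called once per record, for example inside a map over the dataset read from HBase, before the strings are written to HDFS.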

 

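This error, together with the fastjson vulnerability incident a while ago, after which the company required an urgent fastjson upgrade, quickly soured my impression of fastjson. It may well be banned from our projects altogether some day. So I also tried json4s, a Scala JSON library that can use Jackson as its backend. json4s is not the most convenient to use either, but it is good enough. Here is the same functionality rewritten with json4s.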

 
  
package dev.json

import org.json4s.DefaultFormats
import org.json4s.jackson.JsonMethods._

object Course2 {

  implicit val formats = DefaultFormats

  def main(args: Array[String]): Unit = {
    val key = "key-to-remove"
    val s =
      """
        |{"@type":{"version":"1.0.2","name":"application-content","data":[]},"key-to-remove":[{"blah":"more blah"}],"@value":[]}
        |""".stripMargin
    // parse signals malformed input by throwing, so no null check is needed
    val obj = parse(s)
    // Drop every field whose name equals the key
    val obj2 = obj.removeField { case (name, _) => name == key }
    val out = compact(render(obj2))
    println(out)
  }
}
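One thing to keep in mind: as far as I understand it, removeField in json4s walks the whole tree, so a nested field with the same name is removed as well. If only the top-level key should go, a pattern match on JObject is one way to do it. This is a sketch under that assumption; RemoveKeyJson4s and removeTopLevelKey are illustrative names, not part of the original job.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

// Hypothetical helper, not part of the original job.
object RemoveKeyJson4s {

  // Remove `key` from the top-level object only; nested objects are left untouched.
  def removeTopLevelKey(json: String, key: String): String = {
    val cleaned = parse(json) match {
      case JObject(fields) =>
        JObject(fields.filterNot { case (name, _) => name == key })
      case other =>
        other // not an object at the top level: return it unchanged
    }
    compact(render(cleaned))
  }
}
```

For the sample payload in this post the two approaches give the same result, because key-to-remove only appears at the top level.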
Reposted from: http://www.shaoqun.com/a/464614.html