spark sql 怎样处理日期类型

spark sql 怎样处理日期类型、时间类型

json 每个对象 不能换行

##问题描述

json File 日期类型 怎样处理?怎样从字符型,转换为Date或DateTime类型?

json文件如下,有字符格式的日期类型

```

{ "name" : "Andy", "age" : 30, "time" :"2015-03-03T08:25:55.769Z"}

{ "name" : "Justin", "age" : 19, "time" : "2015-04-04T08:25:55.769Z" }

{ "name" : "pan", "age" : 49, "time" : "2015-05-05T08:25:55.769Z" }

{ "name" : "penny", "age" : 29, "time" : "2015-05-05T08:25:55.769Z" }

```

默认推测的Schema:

```

root

 |-- _corrupt_record: string (nullable = true)

 |-- age: long (nullable = true)

 |-- name: string (nullable = true)

 |-- time200: string (nullable = true)

```


测试代码

```

    val fileName = "person.json"

    val sc = SparkUtils.getScLocal("json file 测试")

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    val jsonFile = sqlContext.read.json(fileName)

    jsonFile.printSchema()

```

##解决方案

### 方案一、json数据 时间为 long 秒或毫秒


### 方案二、自定义schema

```

    val fileName = "person.json" 

    val sc = SparkUtils.getScLocal("json file 测试")

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    val schema: StructType = StructType(mutable.ArraySeq(

      StructField("name", StringType, true),

      StructField("age", StringType, true),

      StructField("time", TimestampType, true)));

    val jsonFile = sqlContext.read.schema(schema).json(fileName)

    jsonFile.printSchema()

    jsonFile.registerTempTable("person")

    val now: Timestamp = new Timestamp(System.currentTimeMillis())

 

    val teenagers = sqlContext.sql("SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <='" +now+"'")

    teenagers.foreach(println)

    val dataFrame = sqlContext.sql("SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <='2015-03-03 16:25:55.769'")

    dataFrame.foreach(println)

```

###方案三、sql建表 

创建表sql

```

CREATE TEMPORARY TABLE person IF NOT EXISTS 

[(age: long ,name:string ,time:Timestamp)] 

USING org.apache.spark.sql.json

OPTIONS (  path 'person.json')

语法

CREATE [TEMPORARY] TABLE [IF NOT EXISTS] 

   [(col-name data-type [, …])] 

  USING [OPTIONS ...] 

  [AS ]

```

### 方案四、用textfile convert


来自为知笔记(Wiz)


你可能感兴趣的:(sql,spark,怎样处理日期类型)