Introduction: Reads an ESRI Shapefile and builds a DataFrame from it.
Since: v1.0.0
Spark SQL example:
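import org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader
import org.apache.sedona.core.spatialRDD.SpatialRDD
import org.apache.sedona.sql.utils.Adapter
import org.locationtech.jts.geom.Geometry

var spatialRDD = new SpatialRDD[Geometry]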
spatialRDD.rawSpatialRDD = ShapefileReader.readToGeometryRDD(sparkSession.sparkContext, shapefileInputLocation)
var rawSpatialDf = Adapter.toDf(spatialRDD,sparkSession)
rawSpatialDf.createOrReplaceTempView("rawSpatialDf")
var spatialDf = sparkSession.sql("""
| SELECT ST_GeomFromWKT(rddshape), _c1, _c2
| FROM rawSpatialDf
""".stripMargin)
spatialDf.show()
spatialDf.printSchema()
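!!!note
The file extensions .shp, .shx, and .dbf must be lowercase. For example, for a shapefile folder named myShapefile, the directory structure should look like this:
- shapefile1
- shapefile2
- myshapefile
    - myshapefile.shp
    - myshapefile.shx
    - myshapefile.dbf
    - myshapefile...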
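The lowercase requirement can be checked before handing a folder to the reader. The sketch below is plain Python and not part of Sedona; the helper name `find_non_lowercase_parts` and the demo file names are hypothetical, chosen only to mirror the layout above.

```python
import tempfile
from pathlib import Path

# Extensions that the note above requires to be lowercase.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def find_non_lowercase_parts(folder):
    """Return names of .shp/.shx/.dbf files whose extension is not lowercase."""
    offenders = []
    for path in sorted(Path(folder).iterdir()):
        suffix = path.suffix
        if suffix.lower() in REQUIRED_EXTENSIONS and suffix != suffix.lower():
            offenders.append(path.name)
    return offenders


# Demo on a throwaway folder (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "myshapefile"
    folder.mkdir()
    for name in ("myshapefile.shp", "myshapefile.SHX", "myshapefile.dbf"):
        (folder / name).touch()
    print(find_non_lowercase_parts(folder))  # lists the file with the .SHX extension
```

Renaming the reported files so that their extensions are lowercase satisfies the requirement described in the note.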
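!!!note
Make sure you use ST_GeomFromWKT to create the Geometry column; otherwise the column cannot be used as a geometry in SedonaSQL.
If the file you are reading contains non-ASCII characters, set the character encoding through sedona.global.charset before calling ShapefileReader.readToGeometryRDD.
Example: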
System.setProperty("sedona.global.charset", "utf8")
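Why the charset matters can be seen without Sedona at all. The sketch below is plain Python, not Sedona code. It assumes, for illustration only, that an attribute value was written as UTF-8 bytes, and shows what a reader gets when it decodes those bytes with a different charset.

```python
# Illustration only: the attribute value and both charsets are assumptions.
attribute_value = "北京"
raw_bytes = attribute_value.encode("utf-8")  # bytes as they might be stored on disk

decoded_ok = raw_bytes.decode("utf-8")       # reader uses the matching charset
decoded_bad = raw_bytes.decode("latin-1")    # reader uses a mismatched charset

print(decoded_ok == attribute_value)   # the original text is recovered
print(decoded_bad == attribute_value)  # the text comes back garbled
```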
Introduction: Creates a Geometry from a geohash string. The precision argument is the geohash length to use.
Format: ST_GeomFromGeoHash(geohash: string, precision: int)
Since: v1.1.1
Spark SQL example:
SELECT ST_GeomFromGeoHash('s00twy01mt', 4) AS geom
Result:
+--------------------------------------------------------------------------------------------------------------------+
|geom |
+--------------------------------------------------------------------------------------------------------------------+
|POLYGON ((0.703125 0.87890625, 0.703125 1.0546875, 1.0546875 1.0546875, 1.0546875 0.87890625, 0.703125 0.87890625)) |
+--------------------------------------------------------------------------------------------------------------------+
Introduction: Creates a Geometry from a GeoJSON string.
Format: ST_GeomFromGeoJSON (GeoJson:string)
Since: v1.0.0
Spark SQL example:
var polygonJsonDf = sparkSession.read.format("csv").option("delimiter","\t").option("header","false").load(geoJsonGeomInputLocation)
polygonJsonDf.createOrReplaceTempView("polygontable")
polygonJsonDf.show()
var polygonDf = sparkSession.sql(
"""
| SELECT ST_GeomFromGeoJSON(polygontable._c0) AS countyshape
| FROM polygontable
""".stripMargin)
polygonDf.show()
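!!!note
SedonaSQL reads GeoJSON differently from SparkSQL. In the example above, each GeoJSON string is loaded as a plain text column and then parsed by ST_GeomFromGeoJSON.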
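As an illustration of what parsing one GeoJSON string per row involves, the following plain-Python sketch converts a GeoJSON Polygon into WKT. It is not Sedona code; the function name `polygon_geojson_to_wkt` and the sample input are hypothetical, and only Polygon geometries (bare or wrapped in a Feature) are handled.

```python
import json


def polygon_geojson_to_wkt(geojson_text):
    """Convert a GeoJSON Polygon (bare geometry or Feature) to a WKT string."""
    obj = json.loads(geojson_text)
    geometry = obj["geometry"] if obj.get("type") == "Feature" else obj
    if geometry["type"] != "Polygon":
        raise ValueError("this sketch only handles Polygon geometries")
    rings = []
    for ring in geometry["coordinates"]:
        # Each position is [x, y] with an optional third value that is ignored here.
        rings.append("(" + ", ".join(f"{x} {y}" for x, y, *_ in ring) + ")")
    return "POLYGON (" + ", ".join(rings) + ")"


# One input row, as it might appear in a column such as _c0 (sample data).
row = '{"type": "Polygon", "coordinates": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]}'
print(polygon_geojson_to_wkt(row))
```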
Introduction: Creates a Geometry from a GML string.
Format: ST_GeomFromGML (gml:string)
Since: v1.3.0
Spark SQL example:
SELECT ST_GeomFromGML('<gml:LineString><gml:coordinates>-71.16028,42.258729 -71.160837,42.259112 -71.161143,42.25932</gml:coordinates></gml:LineString>') AS geometry
Introduction: Creates a Geometry from a KML string.
Format: ST_GeomFromKML (kml:string)
Since: v1.3.0
Spark SQL example:
SELECT ST_GeomFromKML('<LineString><coordinates>-71.1663,42.2614 -71.1667,42.2616</coordinates></LineString>') AS geometry
Introduction: Creates a Geometry from WKT. If srid is not specified, it defaults to 0. ST_GeomFromWKT can be used for the same purpose.
Format:
ST_GeomFromText (Wkt:string)
ST_GeomFromText (Wkt:string, srid:integer)
Since: v1.0.0
The optional srid parameter was added in v1.3.1.
Spark SQL example:
SELECT ST_GeomFromText('POINT(40.7128 -74.0060)') AS geometry
Introduction: Creates a Geometry from a WKB string or from WKB binary.
Format:
ST_GeomFromWKB (Wkb:string)
ST_GeomFromWKB (Wkb:binary)
Since: v1.0.0
Spark SQL example:
SELECT ST_GeomFromWKB(polygontable._c0) AS polygonshape
FROM polygontable
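The two input forms can be illustrated with the WKB encoding of a 2D point. This is a plain-Python sketch of the standard WKB layout, not Sedona code; the helper names are hypothetical, and it assumes that the string form is the hex encoding of the same bytes.

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB: byte order, geometry type, x, y."""
    return struct.pack("<BIdd", 1, 1, x, y)  # 1 = little endian, 1 = Point


def point_from_wkb(wkb):
    """Decode a 2D point from WKB given as bytes or as a hex string (assumed form)."""
    data = bytes.fromhex(wkb) if isinstance(wkb, str) else bytes(wkb)
    byte_order = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(byte_order + "Idd", data[1:21])
    if geom_type != 1:
        raise ValueError("this sketch only decodes 2D points")
    return (x, y)


wkb_binary = point_to_wkb(1.0, 2.0)
wkb_string = wkb_binary.hex()
print(wkb_string)
print(point_from_wkb(wkb_binary), point_from_wkb(wkb_string))
```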
Introduction: Creates a Geometry from WKT. If srid is not specified, it defaults to 0.
Format:
ST_GeomFromWKT (Wkt:string)
ST_GeomFromWKT (Wkt:string, srid:integer)
Since: v1.0.0
The optional srid parameter was added in v1.3.1.
Spark SQL example:
SELECT ST_GeomFromWKT(polygontable._c0) AS polygonshape
FROM polygontable
SELECT ST_GeomFromWKT('POINT(40.7128 -74.0060)') AS geometry
Introduction: Creates a LineString from WKT.
Format:
ST_LineFromText (Wkt:string)
Since: v1.2.1
Spark SQL example:
SELECT ST_LineFromText(linetable._c0) AS lineshape
FROM linetable
SELECT ST_LineFromText('Linestring(1 2, 3 4)') AS line
Introduction: Creates a LineString from a text string whose coordinate values are separated by the given delimiter.
Format: ST_LineStringFromText (Text:string, Delimiter:char)
Since: v1.0.0
Spark SQL example:
SELECT ST_LineStringFromText(linestringtable._c0,',') AS linestringshape
FROM linestringtable
SELECT ST_LineStringFromText('-74.0428197,40.6867969,-74.0421975,40.6921336,-74.0508020,40.6912794', ',') AS linestringshape
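The delimited input is a flat list of alternating X and Y values. The plain-Python sketch below shows one way to pair them up. It is not Sedona's parser; the function name `linestring_from_text` and its validation rules are assumptions made for illustration.

```python
def linestring_from_text(text, delimiter):
    """Split a flat 'x1,y1,x2,y2,...' string into a list of (x, y) pairs."""
    values = [float(v) for v in text.split(delimiter)]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    if len(values) < 4:
        raise ValueError("a line string needs at least two points")
    # Even positions are X values, odd positions are Y values.
    return list(zip(values[0::2], values[1::2]))


points = linestring_from_text(
    "-74.0428197,40.6867969,-74.0421975,40.6921336,-74.0508020,40.6912794", ",")
print(len(points), points[0])
```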
Introduction: Creates a MultiLineString from WKT. If srid is not specified, it defaults to 0.
Format:
ST_MLineFromText (Wkt:string)
ST_MLineFromText (Wkt:string, srid:integer)
Since: v1.3.1
Spark SQL example:
SELECT ST_MLineFromText('MULTILINESTRING((1 2, 3 4), (4 5, 6 7))') AS multiLine;
SELECT ST_MLineFromText('MULTILINESTRING((1 2, 3 4), (4 5, 6 7))',4269) AS multiLine;
Introduction: Creates a MultiPolygon from WKT. If srid is not specified, it defaults to 0.
Format:
ST_MPolyFromText (Wkt:string)
ST_MPolyFromText (Wkt:string, srid:integer)
Since: v1.3.1
Spark SQL example:
SELECT ST_MPolyFromText('MULTIPOLYGON(((-70.916 42.1002,-70.9468 42.0946,-70.9765 42.0872,-70.916 42.1002)))') AS multiPolygon
SELECT ST_MPolyFromText('MULTIPOLYGON(((-70.916 42.1002,-70.9468 42.0946,-70.9765 42.0872,-70.916 42.1002)))',4269) AS multiPolygon
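Introduction: Creates a Point from X and Y.
Format: ST_Point (X:decimal, Y:decimal)
Since: v1.0.0
The optional Z parameter was removed in v1.4.0.
If you are upgrading from an older version, use ST_PointZ to create 3D points.
Spark SQL example: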
SELECT ST_Point(CAST(pointtable._c0 AS Decimal(24,20)), CAST(pointtable._c1 AS Decimal(24,20))) AS pointshape
FROM pointtable
Introduction: Creates a Point from X, Y, and Z, with an optional srid. If srid is not specified, it defaults to 0.
Format:
ST_PointZ (X:decimal, Y:decimal, Z:decimal)
ST_PointZ (X:decimal, Y:decimal, Z:decimal, srid:integer)
Since: v1.4.0
Spark SQL example:
SELECT ST_PointZ(1.0, 2.0, 3.0) AS pointshape
Introduction: Creates a Point from a text string whose coordinate values are separated by the given delimiter.
Format: ST_PointFromText (Text:string, Delimiter:char)
Since: v1.0.0
Spark SQL example:
SELECT ST_PointFromText(pointtable._c0,',') AS pointshape
FROM pointtable
SELECT ST_PointFromText('40.7128,-74.0060', ',') AS pointshape
Introduction: Creates a Polygon from MinX, MinY, MaxX, and MaxY.
Format: ST_PolygonFromEnvelope (MinX:decimal, MinY:decimal, MaxX:decimal, MaxY:decimal)
Since: v1.0.0
Spark SQL example:
SELECT *
FROM pointdf
WHERE ST_Contains(ST_PolygonFromEnvelope(1.0,100.0,1000.0,1100.0), pointdf.pointshape)
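The geometry behind the query above can be sketched in plain Python. This is an illustration, not Sedona code: the function names are hypothetical, the vertex order of the ring is an assumption, and the strict comparisons model the usual rule that a point lying exactly on the polygon boundary is not contained.

```python
def polygon_from_envelope(min_x, min_y, max_x, max_y):
    """Return the closed rectangular ring for an envelope (vertex order assumed)."""
    if min_x > max_x or min_y > max_y:
        raise ValueError("min values must not exceed max values")
    return [(min_x, min_y), (min_x, max_y), (max_x, max_y),
            (max_x, min_y), (min_x, min_y)]


def envelope_contains(min_x, min_y, max_x, max_y, x, y):
    """True when (x, y) lies strictly inside the envelope."""
    return min_x < x < max_x and min_y < y < max_y


ring = polygon_from_envelope(1.0, 100.0, 1000.0, 1100.0)
print(ring)
print(envelope_contains(1.0, 100.0, 1000.0, 1100.0, 500.0, 600.0))  # interior point
print(envelope_contains(1.0, 100.0, 1000.0, 1100.0, 1.0, 600.0))    # point on the edge
```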
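Introduction: Creates a Polygon from a text string whose coordinate values are separated by the given delimiter. The path must be closed, that is, the last point must equal the first point.
Format: ST_PolygonFromText (Text:string, Delimiter:char)
Since: v1.0.0
Spark SQL example: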
SELECT ST_PolygonFromText(polygontable._c0,',') AS polygonshape
FROM polygontable
SELECT ST_PolygonFromText('-74.0428197,40.6867969,-74.0421975,40.6921336,-74.0508020,40.6912794,-74.0428197,40.6867969', ',') AS polygonshape
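The closed-path requirement can be checked before the text reaches SQL. The following plain-Python sketch is not Sedona's parser; the function name `polygon_ring_from_text` and the error messages are hypothetical, and rejecting an unclosed ring with an exception is an assumption about how one might validate the input.

```python
def polygon_ring_from_text(text, delimiter):
    """Parse 'x1,y1,x2,y2,...' into a ring and verify that it is closed."""
    values = [float(v) for v in text.split(delimiter)]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    ring = list(zip(values[0::2], values[1::2]))
    if len(ring) < 4:
        raise ValueError("a closed ring needs at least four points")
    if ring[0] != ring[-1]:
        raise ValueError("ring is not closed: last point differs from first point")
    return ring


ring = polygon_ring_from_text(
    "-74.0428197,40.6867969,-74.0421975,40.6921336,"
    "-74.0508020,40.6912794,-74.0428197,40.6867969", ",")
print(len(ring), ring[0] == ring[-1])
```

The example string from the query above passes this check because its last coordinate pair repeats the first one.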