Using "=" in Spark SQL's .where clause

To compare a column with a value, we must use ===, not = or ==.

Let's look at an example.

Suppose we have the following table and want to run a conditional query on it:

+---+-----+---+----+------+
| id| name|age|addr|salary|
+---+-----+---+----+------+
|  1|zhang| 49|  bj| 10000|
|  2| wang| 34|  sh|  1000|
|  3|   li| 28|  sz|  5000|
+---+-----+---+----+------+

(1)===

df2.select($"name",$"addr").where($"name" === "li").show()

Result:

+----+----+
|name|addr|
+----+----+
|  li|  sz|
+----+----+

(2)==

scala> df2.select($"name",$"addr").where($"name" == "li").show()
:29: error: overloaded method value where with alternatives:
  (conditionExpr: String)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] 
  (condition: org.apache.spark.sql.Column)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
 cannot be applied to (Boolean)
       df2.select($"name",$"addr").where($"name" == "li").show()

Here == is Scala's universal equality: $"name" == "li" evaluates to a plain Boolean, which matches neither overload of where, hence the compile error.
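To see where that Boolean in the error message comes from, here is a minimal pure-Scala sketch (no Spark needed; Col is a hypothetical stand-in for Spark's Column, not the real class):

```scala
// Col is a toy stand-in for org.apache.spark.sql.Column (an assumption
// for illustration): it only carries a column name.
case class Col(name: String)

val c = Col("name")

// Scala's built-in == is universal equality: it compares the two
// objects and always yields a Boolean, never a Column.
val r: Boolean = c == "li"
println(r) // false: a Col instance is never equal to a String
```

Since where expects either a Column or a String, a Boolean argument cannot be applied to any overload, which is exactly what the error says.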

(3)=

scala> df2.select($"name",$"addr").where($"name" = "li").show()
:29: error: missing argument list for method $ in class StringToColumn
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `$ _` or `$(_)` instead of `$`.
       df2.select($"name",$"addr").where($"name" = "li").show()

Roughly, a single = is parsed as an assignment rather than a comparison, so the $ interpolator method on StringToColumn is left unapplied and the compiler reports the error above.
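Putting the three cases together: Spark's Column defines its own === method that returns a new Column describing the comparison, which is what the where(condition: Column) overload expects; a SQL string with a plain = also works via the where(conditionExpr: String) overload listed in the error messages above. A minimal sketch with toy stand-ins (assumptions for illustration, not Spark's real classes):

```scala
// Toy stand-ins that mirror the two overloads of Dataset.where
// shown in the error message (not Spark's real implementation).
case class Column(expr: String) {
  // Like Spark's Column.===: builds a new Column describing the
  // comparison instead of evaluating it to a Boolean.
  def ===(other: Any): Column = Column(s"($expr = $other)")
}

def where(condition: Column): String = s"filter by ${condition.expr}"
def where(conditionExpr: String): String = s"filter by $conditionExpr"

val name = Column("name")

println(where(name === "li"))  // === yields a Column: first overload
println(where("name = 'li'"))  // a SQL string: second overload
// where(name == "li")         // would not compile: == yields a Boolean
```

In real Spark this means df2.where($"name" === "li") and df2.where("name = 'li'") express the same filter, while == and = never type-check.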

 
