Table of Contents
• 1 !
• 2 %
• 3 &
• 4 *
• 5 +
• 6 -
• 7 /
• 8 <
• 9 <=
• 10 <=>
• 11 =
• 12 ==
• 13 >
• 14 >=
• 15 ^
• 16 abs
• 17 acos
• 18 add_months
• 19 and
• 20 approx_count_distinct
• 21 approx_percentile
• 22 array
• 23 array_contains
• 24 ascii
• 25 asin
• 26 assert_true
• 27 atan
• 28 atan2
• 29 avg
• 30 base64
• 31 bigint
• 32 bin
• 33 binary
• 34 bit_length
• 35 boolean
• 36 bround
• 37 cast
• 38 cbrt
• 39 ceil
• 40 ceiling
• 41 char
• 42 char_length
• 43 character_length
• 44 chr
• 45 coalesce
• 46 collect_list
• 47 collect_set
• 48 concat
• 49 concat_ws
• 50 conv
• 51 corr
• 52 cos
• 53 cosh
• 54 cot
• 55 count
• 56 count_min_sketch
• 57 covar_pop
• 58 covar_samp
• 59 crc32
• 60 cube
• 61 cume_dist
• 62 current_database
• 63 current_date
• 64 current_timestamp
• 65 date
• 66 date_add
• 67 date_format
• 68 date_sub
• 69 date_trunc
• 70 datediff
• 71 day
• 72 dayofmonth
• 73 dayofweek
• 74 dayofyear
• 75 decimal
• 76 decode
• 77 degrees
• 78 dense_rank
• 79 double
• 80 e
• 81 elt
• 82 encode
• 83 exp
• 84 explode
• 85 explode_outer
• 86 expm1
• 87 factorial
• 88 find_in_set
• 89 first
• 90 first_value
• 91 float
• 92 floor
• 93 format_number
• 94 format_string
• 95 from_json
• 96 from_unixtime
• 97 from_utc_timestamp
• 98 get_json_object
• 99 greatest
• 100 grouping
• 101 grouping_id
• 102 hash
• 103 hex
• 104 hour
• 105 hypot
• 106 if
• 107 ifnull
• 108 in
• 109 initcap
• 110 inline
• 111 inline_outer
• 112 input_file_block_length
• 113 input_file_block_start
• 114 input_file_name
• 115 instr
• 116 int
• 117 isnan
• 118 isnotnull
• 119 isnull
• 120 java_method
• 121 json_tuple
• 122 kurtosis
• 123 lag
• 124 last
• 125 last_day
• 126 last_value
• 127 lcase
• 128 lead
• 129 least
• 130 left
• 131 length
• 132 levenshtein
• 133 like
• 134 ln
• 135 locate
• 136 log
• 137 log10
• 138 log1p
• 139 log2
• 140 lower
• 141 lpad
• 142 ltrim
• 143 map
• 144 map_keys
• 145 map_values
• 146 max
• 147 md5
• 148 mean
• 149 min
• 150 minute
• 151 mod
• 152 monotonically_increasing_id
• 153 month
• 154 months_between
• 155 named_struct
• 156 nanvl
• 157 negative
• 158 next_day
• 159 not
• 160 now
• 161 ntile
• 162 nullif
• 163 nvl
• 164 nvl2
• 165 octet_length
• 166 or
• 167 parse_url
• 168 percent_rank
• 169 percentile
• 170 percentile_approx
• 171 pi
• 172 pmod
• 173 posexplode
• 174 posexplode_outer
• 175 position
• 176 positive
• 177 pow
• 178 power
• 179 printf
• 180 quarter
• 181 radians
• 182 rand
• 183 randn
• 184 rank
• 185 reflect
• 186 regexp_extract
• 187 regexp_replace
• 188 repeat
• 189 replace
• 190 reverse
• 191 right
• 192 rint
• 193 rlike
• 194 rollup
• 195 round
• 196 row_number
• 197 rpad
• 198 rtrim
• 199 second
• 200 sentences
• 201 sha
• 202 sha1
• 203 sha2
• 204 shiftleft
• 205 shiftright
• 206 shiftrightunsigned
• 207 sign
• 208 signum
• 209 sin
• 210 sinh
• 211 size
• 212 skewness
• 213 smallint
• 214 sort_array
• 215 soundex
• 216 space
• 217 spark_partition_id
• 218 split
• 219 sqrt
• 220 stack
• 221 std
• 222 stddev
• 223 stddev_pop
• 224 stddev_samp
• 225 str_to_map
• 226 string
• 227 struct
• 228 substr
• 229 substring
• 230 substring_index
• 231 sum
• 232 tan
• 233 tanh
• 234 timestamp
• 235 tinyint
• 236 to_date
• 237 to_json
• 238 to_timestamp
• 239 to_unix_timestamp
• 240 to_utc_timestamp
• 241 translate
• 242 trim
• 243 trunc
• 244 ucase
• 245 unbase64
• 246 unhex
• 247 unix_timestamp
• 248 upper
• 249 uuid
• 250 var_pop
• 251 var_samp
• 252 variance
• 253 weekofyear
• 254 when
• 255 window
• 256 xpath
• 257 xpath_boolean
• 258 xpath_double
• 259 xpath_float
• 260 xpath_int
• 261 xpath_long
• 262 xpath_number
• 263 xpath_short
• 264 xpath_string
• 265 year
• 266 |
• 267 ~
! expr - Logical not.
expr1 % expr2 - Returns the remainder after expr1/expr2.
expr1 & expr2 - Returns the result of bitwise AND of expr1 and expr2.
expr1 * expr2 - Returns expr1*expr2.
expr1 + expr2 - Returns expr1+expr2.
expr1 - expr2 - Returns expr1-expr2.
expr1 / expr2 - Returns expr1/expr2. It always performs floating point division.
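expr1 < expr2 - Returns true if expr1 is less than expr2.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.
expr1 <= expr2 - Returns true if expr1 is less than or equal to expr2.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.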
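expr1 <=> expr2 - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both operands are null and false if only one of them is null.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.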
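The difference from plain = shows up only with null operands. The following is a minimal pure-Python sketch, not Spark code: None stands in for SQL null, and the helper names sql_equal and null_safe_equal are made up for illustration. It assumes the usual SQL rule that = yields null when either operand is null.

```python
def sql_equal(a, b):
    """Model of `=`. Assumption: a null (None) operand makes the result null."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_equal(a, b):
    """Model of `<=>`: the result is always true or false, never null."""
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b


print(sql_equal(None, None))        # None
print(null_safe_equal(None, None))  # True
print(null_safe_equal(1, None))     # False
print(null_safe_equal(2, 2))        # True
```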
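expr1 = expr2 - Returns true if expr1 equals expr2, or false otherwise.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.
expr1 == expr2 - Returns true if expr1 equals expr2, or false otherwise.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.
expr1 > expr2 - Returns true if expr1 is greater than expr2.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.
expr1 >= expr2 - Returns true if expr1 is greater than or equal to expr2.
Arguments:
expr1, expr2 - the two expressions must be of the same type or castable to a common type, and that type must be orderable. For example, the map type is not orderable, so this operator does not support map arguments.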
expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1 and expr2.
abs(expr) - Returns the absolute value of the numeric value.
acos(expr) - Returns the inverse cosine (arccosine) of expr if -1 <= expr <= 1, or NaN otherwise.
add_months(start_date, num_months) - Returns the date that is num_months after start_date.
Since: 1.5.0
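expr1 and expr2 - Logical AND.
approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed.
approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0.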
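array(expr, ...) - Returns an array with the given elements.
array_contains(array, value) - Returns true if the array contains value.
ascii(str) - Returns the numeric ASCII value of the first character of str.
asin(expr) - Returns the inverse sine (arcsine) of expr if -1 <= expr <= 1, or NaN otherwise.
assert_true(expr) - Throws an exception if expr does not evaluate to true.
atan(expr) - Returns the inverse tangent (arctangent) of expr.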
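atan2(expr1, expr2) - Returns the angle in radians between the positive x-axis of a plane and the point whose y-coordinate is expr1 and whose x-coordinate is expr2.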
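avg(expr) - Returns the average value of expr.
base64(bin) - Converts the binary argument bin to a base64 string.
bigint(expr) - Casts the value expr to the target data type bigint.
bin(expr) - Returns the binary string representation of the long value expr.
binary(expr) - Casts the value expr to the target data type binary.
bit_length(expr) - Returns the bit length of string data or the number of bits of binary data.
boolean(expr) - Casts the value expr to the target data type boolean.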
bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode.
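HALF_EVEN differs from the HALF_UP mode used by round only when the value sits exactly halfway between two candidates. The sketch below uses Python's decimal module to illustrate the two rounding modes; it is not Spark's implementation, and round_decimal is a hypothetical helper name. Values are passed as strings so that no binary floating-point error creeps in.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_decimal(value, d, mode):
    """Round the decimal string `value` to `d` decimal places using `mode`."""
    quantum = Decimal(1).scaleb(-d)  # 10 ** -d, e.g. 0.01 for d = 2
    return Decimal(value).quantize(quantum, rounding=mode)


# A tie goes to the even neighbour under HALF_EVEN
# and away from zero under HALF_UP.
print(round_decimal("2.5", 0, ROUND_HALF_EVEN))    # 2
print(round_decimal("2.5", 0, ROUND_HALF_UP))      # 3
print(round_decimal("3.5", 0, ROUND_HALF_EVEN))    # 4
print(round_decimal("0.125", 2, ROUND_HALF_EVEN))  # 0.12
print(round_decimal("0.125", 2, ROUND_HALF_UP))    # 0.13
```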
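cast(expr AS type) - Casts the value expr to the target data type type.
cbrt(expr) - Returns the cube root of expr.
ceil(expr) - Returns the smallest integer not smaller than expr.
ceiling(expr) - Returns the smallest integer not smaller than expr.
char(expr) - Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).
char_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
character_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
chr(expr) - Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).
coalesce(expr1, expr2, ...) - Returns the first non-null argument if one exists. Otherwise, returns null.
collect_list(expr) - Collects and returns a list of non-unique elements.
collect_set(expr) - Collects and returns a set of unique elements.
concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.
concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings, separated by sep.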
conv(num, from_base, to_base) - Converts num from base from_base to base to_base.
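Base conversion amounts to parsing the digits in one base and re-emitting them in another. Here is a minimal pure-Python sketch of that idea, not Spark's implementation. It assumes a non-negative input, bases between 2 and 36, and uppercase output digits; negative numbers and negative bases are not modelled.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def conv(num, from_base, to_base):
    """Convert the digit string `num` from `from_base` to `to_base`."""
    # Assumption: num is non-negative and both bases lie in 2..36.
    value = int(str(num), from_base)  # parse in the source base
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, to_base)  # peel off the lowest digit
        out.append(DIGITS[rem])
    return "".join(reversed(out))


print(conv("100", 2, 10))   # 4
print(conv("255", 10, 16))  # FF
print(conv("ff", 16, 2))    # 11111111
```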
corr(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs.
cos(expr) - Returns the cosine of expr.
cosh(expr) - Returns the hyperbolic cosine of expr.
cot(expr) - Returns the cotangent of expr.
count(*) - Returns the total number of retrieved rows, including rows containing null.
count(expr) - Returns the number of rows for which the supplied expression is non-null.
count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.
count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.
covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.
covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.
crc32(expr) - Returns a cyclic redundancy check value of expr as a bigint.
cume_dist() - Computes the position of a value relative to all values in the partition.
current_database() - Returns the current database.
current_date() - Returns the current date at the start of query evaluation.
Since: 1.5.0
current_timestamp() - Returns the current timestamp at the start of query evaluation.
Since: 1.5.0
date(expr) - Casts the value expr to the target data type date.
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Since: 1.5.0
date_format(timestamp, fmt) - Converts timestamp to a string value in the format specified by the date format fmt.
Since: 1.5.0
date_sub(start_date, num_days) - Returns the date that is num_days before start_date.
Since: 1.5.0
date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt. fmt should be one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"].
Since: 2.3.0
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
Since: 1.5.0
day(date) - Returns the day of month of the date/timestamp.
Since: 1.5.0
dayofmonth(date) - Returns the day of month of the date/timestamp.
Since: 1.5.0
dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).
Since: 2.3.0
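The numbering above (1 = Sunday through 7 = Saturday) differs from the ISO convention, in which Monday is 1 and Sunday is 7. A small pure-Python sketch of the mapping, with dayofweek as an illustrative helper rather than Spark code:

```python
from datetime import date


def dayofweek(d):
    """Map ISO weekdays (Mon=1..Sun=7) to the 1 = Sunday .. 7 = Saturday scheme."""
    return d.isoweekday() % 7 + 1


print(dayofweek(date(2009, 7, 30)))  # 5, a Thursday
print(dayofweek(date(2009, 8, 2)))   # 1, a Sunday
```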
dayofyear(date) - Returns the day of year of the date/timestamp.
Since: 1.5.0
decimal(expr) - Casts the value expr to the target data type decimal.
decode(bin, charset) - Decodes the first argument using the second argument character set.
degrees(expr) - Converts radians to degrees.
dense_rank() - Computes the rank of a value in a group of values. The result is one plus the
previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps
in the ranking sequence.
double(expr) - Casts the value expr to the target data type double.
e() - Returns Euler's number, e.
elt(n, input1, input2, ...) - Returns the n-th input, e.g., returns input2 when n is 2.
encode(str, charset) - Encodes the first argument using the second argument character set.
exp(expr) - Returns e to the power of expr.
explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
expm1(expr) - Returns exp(expr) - 1.
factorial(expr) - Returns the factorial of expr. expr must be in the range [0..20]; otherwise, the result is null.
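The upper bound of 20 coincides with the largest factorial that fits in a signed 64-bit integer, which is presumably why the range stops there. A short pure-Python check, in which None stands in for null and factorial is an illustrative model rather than Spark's code:

```python
import math

LONG_MAX = 2**63 - 1  # largest value of a signed 64-bit integer


def factorial(n):
    """Return n! for n in [0..20], otherwise None (modelling SQL null)."""
    if n is None or not 0 <= n <= 20:
        return None
    return math.factorial(n)


print(factorial(5))                    # 120
print(factorial(21))                   # None
print(math.factorial(20) <= LONG_MAX)  # True
print(math.factorial(21) <= LONG_MAX)  # False
```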
find_in_set(str, str_array) - Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0 if the string was not found or if the given string (str) contains a comma.
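The two zero-returning cases are easy to overlook, so here is a minimal pure-Python model of the rule as described above. It is a sketch for illustration, not Spark's implementation.

```python
def find_in_set(s, str_array):
    """Return the 1-based position of `s` in a comma-delimited list, or 0."""
    if "," in s:
        return 0  # a search string containing a comma never matches
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


print(find_in_set("ab", "abc,b,ab,c,def"))  # 3
print(find_in_set("x", "abc,b,ab,c,def"))   # 0
print(find_in_set("b,ab", "abc,b,ab,c"))    # 0
```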
first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
float(expr) - Casts the value expr to the target data type float.
floor(expr) - Returns the largest integer not greater than expr.
format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT.
format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.
Since: 2.2.0
from_unixtime(unix_time, format) - Returns unix_time in the specified format.
Since: 1.5.0
from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
Since: 1.5.0
get_json_object(json_txt, path) - Extracts a json object from path.
greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.
hash(expr1, expr2, ...) - Returns a hash value of the arguments.
hex(expr) - Converts expr to hexadecimal.
hour(timestamp) - Returns the hour component of the string/timestamp.
Since: 1.5.0
hypot(expr1, expr2) - Returns sqrt(expr1² + expr2²).
if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2; otherwise returns expr3.
ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.
expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of expr2, expr3, ....
initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.
inline(expr) - Explodes an array of structs into a table.
inline_outer(expr) - Explodes an array of structs into a table.
input_file_block_length() - Returns the length of the block being read, or -1 if not available.
input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.
input_file_name() - Returns the name of the file being read, or empty string if not available.
instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.
int(expr) - Casts the value expr to the target data type int.
isnan(expr) - Returns true if expr is NaN, or false otherwise.
isnotnull(expr) - Returns true if expr is not null, or false otherwise.
isnull(expr) - Returns true if expr is null, or false otherwise.
java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.
kurtosis(expr) - Returns the kurtosis value calculated from values of a group.
lag(input[, offset[, default]]) - Returns the value of input at the offset-th row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.
last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
last_day(date) - Returns the last day of the month which the date belongs to.
Since: 1.5.0
last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
lcase(str) - Returns str with all characters changed to lowercase.
lead(input[, offset[, default]]) - Returns the value of input at the offset-th row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.
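To make the offset and default rules for lag and lead concrete, the sketch below models a single, already ordered window partition as a Python list. It is an illustration only: the helper names are made up, None plays the role of null, and no real window frame handling is attempted.

```python
def _shifted(values, offset, default):
    """Value found `offset` rows away from each row, or `default` if none exists."""
    n = len(values)
    return [values[i + offset] if 0 <= i + offset < n else default
            for i in range(n)]


def lag(values, offset=1, default=None):
    """Look `offset` rows back within one ordered partition."""
    return _shifted(values, -offset, default)


def lead(values, offset=1, default=None):
    """Look `offset` rows ahead within one ordered partition."""
    return _shifted(values, offset, default)


rows = [10, 20, 30]
print(lag(rows))          # [None, 10, 20]
print(lead(rows, 1, 0))   # [20, 30, 0]
print(lag(rows, 2, -1))   # [-1, -1, 10]
```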
least(expr, ...) - Returns the least value of all parameters, skipping null values.
left(str, len) - Returns the leftmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string.
length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.
str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise.
Arguments:
pattern - a string expression. The pattern is a string which is matched literally, with
exception to the following special symbols:
_ matches any one character in the input (similar to . in posix regular expressions)
% matches zero or more characters in the input (similar to .* in posix regular
expressions)
The escape character is '\'. If an escape character precedes a special symbol or another
escape character, the following character is matched literally. It is invalid to escape
any other character.
Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
to match "\abc", the pattern should be "\\abc".
When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back
to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
enabled, the pattern to match "\abc" should be "\abc".
Note:
Use RLIKE to match with standard regular expressions.
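The LIKE wildcard and escape rules can be made concrete by translating a pattern into a regular expression. The following pure-Python sketch does this under the rules stated above; it is not Spark's implementation, the helper names are hypothetical, and None stands in for null. Note that the backslashes in the example patterns are doubled only because of Python string literal syntax.

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into an equivalent Python regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            # Only '_', '%' and the escape character itself may be escaped.
            if i + 1 == len(pattern) or pattern[i + 1] not in ("_", "%", escape):
                raise ValueError("invalid escape sequence in pattern")
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "_":
            parts.append(".")   # exactly one character
            i += 1
        elif ch == "%":
            parts.append(".*")  # zero or more characters
            i += 1
        else:
            parts.append(re.escape(ch))  # everything else matches literally
            i += 1
    return "".join(parts)


def like(s, pattern):
    """Return None if any argument is None, otherwise True or False."""
    if s is None or pattern is None:
        return None
    return re.fullmatch(like_to_regex(pattern), s, flags=re.DOTALL) is not None


print(like("Spark", "_park"))  # True
print(like("Spark", "S%"))     # True
print(like("100%", "100\\%"))  # True: the escaped % is literal
print(like("1000", "100\\%"))  # False
```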
ln(expr) - Returns the natural logarithm (base e) of expr.
locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.
log(base, expr) - Returns the logarithm of expr with base.
log10(expr) - Returns the logarithm of expr with base 10.
log1p(expr) - Returns log(1 + expr).
log2(expr) - Returns the logarithm of expr with base 2.
lower(str) - Returns str with all characters changed to lowercase.
lpad(str, len, pad) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.
ltrim(str) - Removes the leading space characters from str.
ltrim(trimStr, str) - Removes from str the leading characters that appear in the trim string trimStr.
map(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs.
map_keys(map) - Returns an unordered array containing the keys of the map.
map_values(map) - Returns an unordered array containing the values of the map.
max(expr) - Returns the maximum value of expr.
md5(expr) - Returns an MD5 128-bit checksum as a hex string of expr.
mean(expr) - Returns the mean calculated from values of a group.
min(expr) - Returns the minimum value of expr.
minute(timestamp) - Returns the minute component of the string/timestamp.
Since: 1.5.0
expr1 mod expr2 - Returns the remainder after expr1/expr2.
monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed
to be monotonically increasing and unique, but not consecutive. The current implementation
puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number
within each partition. The assumption is that the data frame has less than 1 billion
partitions, and each partition has less than 8 billion records.
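The bit layout described above can be sketched with plain integer arithmetic. The helper names below are made up for illustration; only the 31/33-bit split comes from the description. The limits quoted in the text are rounded: 2^31 is about 2.1 billion partitions and 2^33 is about 8.6 billion records per partition.

```python
RECORD_BITS = 33  # lower 33 bits hold the record number within the partition


def make_id(partition_id, record_number):
    """Pack a partition ID (upper bits) and a record number (lower 33 bits)."""
    return (partition_id << RECORD_BITS) | record_number


def split_id(value):
    """Recover (partition_id, record_number) from a packed ID."""
    return value >> RECORD_BITS, value & ((1 << RECORD_BITS) - 1)


print(make_id(0, 0))         # 0
print(make_id(0, 1))         # 1
print(make_id(1, 0))         # 8589934592: IDs jump between partitions
print(split_id(8589934594))  # (1, 2)
```

This is why the IDs are increasing and unique but not consecutive: each partition starts counting at its own multiple of 2^33.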
month(date) - Returns the month component of the date/timestamp.
Since: 1.5.0
months_between(timestamp1, timestamp2) - Returns the number of months between timestamp1 and timestamp2.
Since: 1.5.0
named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.
nanvl(expr1, expr2) - Returns expr1 if it's not NaN, or expr2 otherwise.
negative(expr) - Returns the negated value of expr.
next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.
Since: 1.5.0
not expr - Logical not.
now() - Returns the current timestamp at the start of query evaluation.
Since: 1.5.0
ntile(n) - Divides the rows for each window partition into n buckets ranging from 1 to at most n.
nullif(expr1, expr2) - Returns null if expr1 equals expr2, or expr1 otherwise.
nvl(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.
nvl2(expr1, expr2, expr3) - Returns expr2 if expr1 is not null, or expr3 otherwise.
octet_length(expr) - Returns the byte length of string data or number of bytes of binary data.
expr1 or expr2 - Logical OR.
parse_url(url, partToExtract[, key]) - Extracts a part from a URL.
percent_rank() - Computes the percentage ranking of a value in a group of values.
percentile(col, percentage [, frequency]) - Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be a positive integer.
percentile(col, array(percentage1 [, percentage2]...) [, frequency]) - Returns the exact percentile value array of numeric column col at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0. The value of frequency should be a positive integer.
percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array.
pi() - Returns pi.
pmod(expr1, expr2) - Returns the positive value of expr1 mod expr2.
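pmod differs from the % operator only when the dividend is negative. The sketch below is a pure-Python model, not Spark code. It assumes integer operands, a positive divisor, and that % follows the JVM convention in which the remainder takes the sign of the dividend; sql_mod is a hypothetical helper name.

```python
def sql_mod(a, n):
    """Remainder whose sign follows the dividend (assumed behaviour of %)."""
    r = abs(a) % abs(n)
    return -r if a < 0 else r


def pmod(a, n):
    """Positive modulus: shift a negative remainder into the range [0, n)."""
    # Assumption: n is a positive integer.
    r = sql_mod(a, n)
    return sql_mod(r + n, n) if r < 0 else r


print(sql_mod(10, 3), pmod(10, 3))    # 1 1
print(sql_mod(-10, 3), pmod(-10, 3))  # -1 2
```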
posexplode(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.
posexplode_outer(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.
position(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.
positive(expr) - Returns the value of expr.
pow(expr1, expr2) - Raises expr1 to the power of expr2.
power(expr1, expr2) - Raises expr1 to the power of expr2.
printf(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
quarter(date) - Returns the quarter of the year for date, in the range 1 to 4.
Since: 1.5.0
radians(expr) - Converts degrees to radians.
rand([seed]) - Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).
randn([seed]) - Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution.
rank() - Computes the rank of a value in a group of values. The result is one plus the number
of rows preceding or equal to the current row in the ordering of the partition. The values
will produce gaps in the sequence.
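The gaps mentioned for rank, and their absence in dense_rank, are easiest to see side by side. Below is a small pure-Python model that assumes standard SQL ranking semantics over one already sorted partition; it is an illustration, not Spark's implementation.

```python
def rank(sorted_values):
    """Ties share a rank; the next distinct value skips ahead (gaps)."""
    ranks = []
    for i, v in enumerate(sorted_values):
        if i > 0 and v == sorted_values[i - 1]:
            ranks.append(ranks[-1])  # tie: reuse the previous rank
        else:
            ranks.append(i + 1)      # one plus the number of preceding rows
    return ranks


def dense_rank(sorted_values):
    """Ties share a rank; the next distinct value gets the next integer."""
    ranks = []
    for i, v in enumerate(sorted_values):
        if i == 0:
            ranks.append(1)
        elif v == sorted_values[i - 1]:
            ranks.append(ranks[-1])
        else:
            ranks.append(ranks[-1] + 1)
    return ranks


scores = [10, 20, 20, 30]
print(rank(scores))        # [1, 2, 2, 4]
print(dense_rank(scores))  # [1, 2, 2, 3]
```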
reflect(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
regexp_extract(str, regexp[, idx]) - Extracts a group that matches regexp.
regexp_replace(str, regexp, rep) - Replaces all substrings of str that match regexp with rep.
repeat(str, n) - Returns the string which repeats the given string value n times.
replace(str, search[, replace]) - Replaces all occurrences of search with replace.
Arguments:
search - If search is not found in str, str is returned unchanged.
replace - If replace is not specified or is an empty string, nothing replaces the text that is removed from str.
reverse(str) - Returns the reversed given string.
right(str, len) - Returns the rightmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string.
rint(expr) - Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
str rlike regexp - Returns true if str matches regexp, or false otherwise.
Arguments:
regexp - a string expression. The pattern string should be a Java regular expression.
Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL
parser. For example, to match "\abc", a regular expression for regexp can be "^\\abc$".
There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to
fall back to the Spark 1.6 behavior regarding string literal parsing. For example,
if the config is enabled, the regexp that can match "\abc" is "^\abc$".
Note:
Use LIKE to match with simple string pattern.
round(expr, d) - Returns expr rounded to d decimal places using HALF_UP rounding mode.
row_number() - Assigns a unique, sequential number to each row, starting with one,
according to the ordering of rows within the window partition.
rpad(str, len, pad) - Returns str, right-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.
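Both padding functions truncate as well as pad, which is easy to miss. The following is a pure-Python sketch of lpad and rpad as described in this document, assuming a non-empty pad string; it is an illustration rather than Spark's implementation.

```python
def lpad(s, length, pad):
    """Left-pad `s` with `pad` to `length`, or cut `s` down to `length`."""
    if length <= len(s):
        return s[:max(length, 0)]  # longer input is shortened
    # Assumption: pad is non-empty. Repeat it and keep only what is needed.
    fill = (pad * length)[: length - len(s)]
    return fill + s


def rpad(s, length, pad):
    """Right-pad `s` with `pad` to `length`, or cut `s` down to `length`."""
    if length <= len(s):
        return s[:max(length, 0)]
    fill = (pad * length)[: length - len(s)]
    return s + fill


print(lpad("hi", 5, "??"))  # ???hi
print(lpad("hi", 1, "??"))  # h
print(rpad("hi", 5, "ab"))  # hiaba
```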
rtrim(str) - Removes the trailing space characters from str.
rtrim(trimStr, str) - Removes from str the trailing characters that appear in the trim string trimStr.
second(timestamp) - Returns the second component of the string/timestamp.
Since: 1.5.0
sentences(str[, lang, country]) - Splits str into an array of arrays of words.
sha(expr) - Returns a sha1 hash value as a hex string of expr.
sha1(expr) - Returns a sha1 hash value as a hex string of expr.
sha2(expr, bitLength) - Returns a checksum of the SHA-2 family as a hex string of expr. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. A bit length of 0 is equivalent to 256.
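The mapping from bit length to hash algorithm can be sketched with Python's hashlib. This is a model for illustration, not Spark code; returning None for an unsupported bit length is an assumption, since the text above does not say what happens in that case.

```python
import hashlib

# Supported bit lengths; 0 is treated as an alias for 256.
_ALGORITHMS = {
    0: hashlib.sha256,
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def sha2(data, bit_length):
    """Return the SHA-2 digest of the bytes `data` as a hex string."""
    algorithm = _ALGORITHMS.get(bit_length)
    if algorithm is None:
        return None  # Assumption: unsupported lengths yield null
    return algorithm(data).hexdigest()


print(sha2(b"Spark", 0) == sha2(b"Spark", 256))  # True
print(len(sha2(b"Spark", 512)))                  # 128 hex characters
```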
shiftleft(base, expr) - Bitwise left shift.
shiftright(base, expr) - Bitwise (signed) right shift.
shiftrightunsigned(base, expr) - Bitwise unsigned right shift.
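Signed and unsigned right shifts give different results only for negative inputs. Here is a pure-Python sketch of the three shift functions that assumes a 32-bit signed integer operand and JVM-style masking of the shift distance to 5 bits; it models the behaviour and is not Spark code. to_signed32 is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shiftleft(base, n):
    return to_signed32(base << (n & 31))


def shiftright(base, n):
    # The sign bit is copied into the vacated positions.
    return to_signed32(base) >> (n & 31)


def shiftrightunsigned(base, n):
    # Zeros are shifted in from the left, so a negative input turns positive.
    return to_signed32((base & MASK32) >> (n & 31))


print(shiftleft(2, 1))            # 4
print(shiftright(-8, 1))          # -4
print(shiftrightunsigned(-8, 1))  # 2147483644
```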
sign(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.
signum(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.
sin(expr) - Returns the sine of expr, as if computed by java.lang.Math.sin.
sinh(expr) - Returns the hyperbolic sine of expr, as if computed by java.lang.Math.sinh.
size(expr) - Returns the size of an array or a map. Returns -1 if null.
skewness(expr) - Returns the skewness value calculated from values of a group.
smallint(expr) - Casts the value expr to the target data type smallint.
sort_array(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements.
soundex(str) - Returns Soundex code of the string.
space(n) - Returns a string consisting of n spaces.
spark_partition_id() - Returns the current partition id.
split(str, regex) - Splits str around occurrences that match regex.
sqrt(expr) - Returns the square root of expr.
stack(n, expr1, ..., exprk) - Separates expr1, ..., exprk into n rows.
std(expr) - Returns the sample standard deviation calculated from values of a group.
stddev(expr) - Returns the sample standard deviation calculated from values of a group.
stddev_pop(expr) - Returns the population standard deviation calculated from values of a group.
stddev_samp(expr) - Returns the sample standard deviation calculated from values of a group.
str_to_map(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.
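The two-level split can be sketched in a few lines of pure Python. This model treats both delimiters as literal strings and maps a key without a delimiter to None; both choices are simplifying assumptions for illustration and are not taken from Spark's implementation.

```python
def str_to_map(text, pair_delim=",", key_value_delim=":"):
    """Split `text` into pairs, then split each pair into a key and a value."""
    result = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(key_value_delim)
        # Assumption: a pair without the key/value delimiter maps to None.
        result[key] = value if sep else None
    return result


print(str_to_map("a:1,b:2,c:3"))         # {'a': '1', 'b': '2', 'c': '3'}
print(str_to_map("x=1;y=2", ";", "="))   # {'x': '1', 'y': '2'}
```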
string(expr) - Casts the value expr to the target data type string.
struct(col1, col2, col3, ...) - Creates a struct with the given field values.
substr(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
substring(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
substring_index(str, delim, count) - Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim.
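The sign of count decides from which end the delimiters are counted. A minimal pure-Python model of that rule follows; it is a sketch, not Spark's implementation, and returning an empty string when count is 0 or delim is empty is an assumption.

```python
def substring_index(s, delim, count):
    """Return the part of `s` before (or after) `count` occurrences of `delim`."""
    if count == 0 or delim == "":
        return ""  # Assumption: nothing is selected in these edge cases
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # keep the first `count` pieces
    return delim.join(parts[count:])      # keep the last `-count` pieces


print(substring_index("www.apache.org", ".", 2))   # www.apache
print(substring_index("www.apache.org", ".", -1))  # org
print(substring_index("www.apache.org", ".", 5))   # www.apache.org
```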
sum(expr) - Returns the sum calculated from values of a group.
tan(expr) - Returns the tangent of expr, as if computed by java.lang.Math.tan.
tanh(expr) - Returns the hyperbolic tangent of expr, as if computed by java.lang.Math.tanh.
timestamp(expr) - Casts the value expr to the target data type timestamp.
tinyint(expr) - Casts the value expr to the target data type tinyint.
to_date(date_str[, fmt]) - Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.
Since: 1.5.0
to_json(expr[, options]) - Returns a json string with a given struct value
Since: 2.2.0
to_timestamp(timestamp[, fmt]) - Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted.
Since: 2.2.0
to_unix_timestamp(expr[, pattern]) - Returns the UNIX timestamp of the given time.
Since: 1.6.0
to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
Since: 1.5.0
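from_utc_timestamp and to_utc_timestamp are inverses of each other, which the 'GMT+1' examples in their descriptions illustrate. The sketch below reproduces those two examples with Python's datetime module. It is not Spark code: the time zone is reduced to a fixed whole-hour offset passed as an integer, which is a simplification of the zone names Spark accepts.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"


def from_utc_timestamp(ts, offset_hours):
    """Read `ts` as UTC and render it in a zone `offset_hours` ahead of UTC."""
    utc = datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)
    local = utc.astimezone(timezone(timedelta(hours=offset_hours)))
    return local.strftime(FMT)


def to_utc_timestamp(ts, offset_hours):
    """Read `ts` as local time in the given zone and render it in UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    local = datetime.strptime(ts, FMT).replace(tzinfo=zone)
    return local.astimezone(timezone.utc).strftime(FMT)


print(from_utc_timestamp("2017-07-14 02:40:00", 1))  # 2017-07-14 03:40:00
print(to_utc_timestamp("2017-07-14 02:40:00", 1))    # 2017-07-14 01:40:00
```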
translate(input, from, to) - Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string.
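The character-by-character mapping can be modelled in pure Python as follows. This is a sketch rather than Spark's implementation, and it assumes that a character of from without a counterpart in to is dropped from the result, a case the description above does not cover.

```python
def translate(s, from_chars, to_chars):
    """Replace each character of `from_chars` with its counterpart in `to_chars`."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch not in mapping:  # the first occurrence of a character wins
            # Assumption: characters with no counterpart are removed.
            mapping[ch] = to_chars[i] if i < len(to_chars) else ""
    return "".join(mapping.get(ch, ch) for ch in s)


print(translate("AaBbCc", "abc", "123"))  # A1B2C3
print(translate("AaBbCc", "abc", "12"))   # A1B2C
```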
trim(str) - Removes the leading and trailing space characters from str.
trim(BOTH trimStr FROM str) - Removes the leading and trailing trimStr characters from str.
trim(LEADING trimStr FROM str) - Removes the leading trimStr characters from str.
trim(TRAILING trimStr FROM str) - Removes the trailing trimStr characters from str.
trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt. fmt should be one of ["year", "yyyy", "yy", "mon", "month", "mm"].
Since: 1.5.0
ucase(str) - Returns str with all characters changed to uppercase.
unbase64(str) - Converts the argument from a base 64 string str to a binary.
unhex(expr) - Converts hexadecimal expr to binary.
unix_timestamp([expr[, pattern]]) - Returns the UNIX timestamp of current or specified time.
Since: 1.5.0
upper(str) - Returns str with all characters changed to uppercase.
uuid() - Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.
var_pop(expr) - Returns the population variance calculated from values of a group.
var_samp(expr) - Returns the sample variance calculated from values of a group.
variance(expr) - Returns the sample variance calculated from values of a group.
weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
Since: 1.5.0
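The week rule stated for weekofyear (weeks start on Monday, and week 1 is the first week with more than 3 days) matches the ISO 8601 week definition, so Python's isocalendar can serve as a quick model. This is an illustration under that assumption, not Spark code.

```python
from datetime import date


def weekofyear(d):
    """ISO 8601 week number: Monday-based, week 1 has at least 4 days."""
    return d.isocalendar()[1]


print(weekofyear(date(2008, 2, 20)))  # 8
print(weekofyear(date(2016, 1, 1)))   # 53: still the last week of 2015
```

The second call shows a consequence of the rule: the first days of January can belong to the last week of the previous year.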
CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When expr1 = true, returns expr2; else when expr3 = true, returns expr4; else returns expr5.
xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression.
xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found.
xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression.
year(date) - Returns the year component of the date/timestamp.
Since: 1.5.0
expr1 | expr2 - Returns the result of bitwise OR of expr1 and expr2.
~ expr - Returns the result of bitwise NOT of expr.