Hive函数及语法说明_高中生题库网|高考真题|高考试题-「密云二中」

-

2021年2月8日发(作者：再看一眼)

无线增值产品部

Hive

函数及语法说明

版本：

Hive 0.7.0.001

eagooqi

2011-7-19

版本

V1.0

日期

2010-07-20

2012-1-4

修订人

eagooqi

Eagooqi

描述

初稿

、

cube

函数说明

.................................................. .................................................. ......................

内置函数

........................... .................................................. .....................................

增加

oracle

函数

........................ .................................................. ..........................

增加业务函数

................................................ .................................................. ......

扩展函数开发规范

.............................................. .................................................. ........

语法说明

.................................................. .................................................. ....................

内置语法

........................... .................................................. ...................................

增加语法

.................................................. .................................................. ............

扩展语法开发规范（语法转换）

.

........................................ ........................................

ORACLE sql

对应的

hSQL

语法支持

............................................... ...............................

函数说明

参考链接：

/confluence/display/Hive/LanguageManual

CLI

下

使用以下命令显示最新系统函数说明

SHOW FUNCTIONS;

DESCRIBE FUNCTION ;

DESCRIBE FUNCTION EXTENDED ;

内置函数

数值函数

Mathematical Functions

返回类型

BIGINT

函数名称

(

参数

)

round(double a)

描述

返回

double

类型的四舍五入的

BIGINT

值

Returns the rounded BIGINT value of the double

DOUBLE

round(double a,

int d)

返回

a

的

d

进制的四舍五入值

Returns the double rounded to d decimal places

返回小于等于

a

的最大整数值

Returns the maximum BIGINT value that is equal or

less than the double

BIGINT

floor(double a)

BIGINT

ceil(double a),

ceiling(double

a)

返回

a

的最小的大于等于

a

的整数值

Returns the minimum BIGINT value that is equal or

greater than the double

Returns a random number (that changes from row to

row) that is distributed uniformly from 0 to 1.

Specifiying the seed will make sure the generated

random number sequence is deterministic.

double

rand(), rand(int

seed)

double

exp(double a)

自然指数函数

Returns e^a where e is the base of the

natural logarithm

double

ln(double a)

自然对数函数

Returns the natural logarithm of the

argument

double

log10(double a)

以

10

为底对数函数

Returns the base-10 logarithm of

the argument

double

log2(double a)

以

2

为底对数函数

Returns the base-2 logarithm of

the argument

double

log(double

base, double a)

对数函数

Return the base

argument

幂运算函数

Return a^p

double

pow(double a,

double p)

power(double

a, double p)

double

sqrt(double a)

开平方函数

Returns the square root of a

string

bin(BIGINT a)

二进制函数

Returns the number in binary format (see

[

/doc/refman/5.0/en/string- functi

#function_bin

])

string

hex(BIGINT a)

hex(string a)

十六进制函数

If the argument is an int, hex returns

the number as a string in hex format. Otherwise if the

number is a string, it converts each character into its

hex representation and returns the resulting string.

(see

[

/doc/refman/5.0/en/string- functi

#function_hex

])

string

unhex(string a)

反转十六进制函数

Inverse of hex. Interprets each

pair of characters as a hexidecimal number and

converts to the character represented by the number.

string

conv(BIGINT

num, int

from_base, int

to_base)

进制转换函数

Converts a number from a given base

to another (see

[

/doc/refman/5.0/en/mat hematic

#function_conv

])

绝对值函数

Returns the absolute value

正取余函数

Returns the positive value of a mod b

double

int double

abs(double a)

pmod(int a, int

b) pmod(double

a, double b)

double

double

sin(double a)

asin(double a)

正弦函数

Returns the sine of a (a is in radians)

反正弦函数

Returns the arc sin of x if -1<=a<=1 or

null otherwise

double

double

cos(double a)

acos(double a)

余弦函数

Returns the cosine of a (a is in radians)

反余弦函数

Returns the arc cosine of x if -1<=a<=1 or

null otherwise

int double

positive(int a)

positive(double

a)

positive

函数

Returns a

int double

negative(int a)

negative(dou bl

negative

函数

Retur ns -a

e a)

集合函数

Collection Functions

返回类

型

int

size(Map

)

int

size(Array)

返回

Array

类型的元素个数

返回

Map

类型的元素个数

函数名称

(

参数

)

描述

类型转换函数

Type Conversion Functions

返回类型

函数名称

(

参数

)

cast(expr as

)

描述

转换

expr

为

类型

例如

. cast('1' as

BIGINT)

转换

string '1'

为整数

.

如果转换失败

返回

null .

日期函数

Date Functions

string

from_unixtime(bigint

unixtime[, string

format])

Converts the number of seconds from unix epoch

(1970-01-01 00:00:00 UTC) to a string

representing the timestamp of that moment in the

current system time zone in the format of

bigint

unix_timestamp()

Gets current time stamp using the default time

zone.

bigint

unix_timestamp(string

date)

Converts time string in format

yyyy-MM-dd

HH:mm:ss

to Unix time stamp, return 0 if fail:

unix_timestamp('2009-03-20 11:30:01') =

1237573801

bigint

unix_timestamp(string

date, string pattern)

Convert time string with given pattern (see

[

/j2se/1.4.2/docs/api/java/text/

]) to Unix time stamp,

return 0 if fail: unix_timestamp('2009-03-20',

'yyyy-MM- dd') = 1237532400

string

to_date(string

timestamp)

int

year(string date)

Returns the date part of a timestamp string:

to_date(

Returns the year part of a date or a timestamp

string: year(

year(

int

month(string date)

Returns the month part of a date or a timestamp

string: month(

month(

int

day(string date)

dayofmonth(date)

Return the day part of a date or a timestamp

string: day(

day(

int

hour(string date)

Returns the hour of the timestamp:

hour('2009-07-30 12:58:59') = 12, hour('12:58:59')

= 12

int

int

int

minute(string date)

second(string date)

weekofyear(string date)

Returns the minute of the timestamp

Returns the second of the timestamp

Return the week number of a timestamp string:

weekofyea r(

weekofyear(

int

datediff(string enddate,

string startdate)

int

date_add(string

startdate, int days)

int

date_sub(string

startdate, int days)

Return the number of days from startdate to

enddate: datediff('2009-03-01', '2009-02-27') = 2

Add a number of days to startdate:

date_add('2008-12-31', 1) = '2009-01-01'

Subtract a number of days to startdate:

date_sub('2008-12-31', 1) = '2008-12-30'

条件判断函数

Conditional Functions

T

T

T

T

CASE a WHEN b THEN c [WHEN d

THEN e]* [ELSE f] END

CASE WHEN a THEN b [WHEN c

THEN d]* [ELSE e] END

if(boolean testCondition, T

valueTrue, T valueFalseOrNull)

COALESCE(T v1, T v2, ...)

Return valueTrue when testCondition is

true, returns valueFalseOrNull otherwise

Return the first v that is not NULL, or NULL

if all v's are NULL

When a = b, returns c; when a = d, return e;

else return f

When a = true, returns b; when c = true,

return d; else return e

字符串函数

String Functions

返回类型

int

string

string

 函数名称

(

参数

)

length(string A)

reverse(string A)

concat(string A,

string B...)

描述

返回字符串
A

的长度

返回

A

的反序字符串

按照参数顺序返回连接的字符串

.

例如

.

concat('foo', 'bar')

返回

'foobar'.

注

:

该函数

接受任意长度参数

.

string

concat_ws(string

SEP, string A, string

B...)

string

substr(string A, int

start)

substring(string A,

int start)

string

substr(string A, int

start, int len)

substring(string A,

int start, int len)

返回

A

的子串从

start

位置开始直到结尾，

例如

. substr('foobar', 4)

返回

'bar' (

参见

[

/doc/refman/5.0/en/stri

#function_substr

])

返回

A

的子串从

start

位置开始直到

len

长

度

例如

substr('foobar', 4, 1)

返回

'b' (

参见

[

/ doc/refman/5.0/en/stri

#function_substr

])

如同

concat(),

但间隔连接自定义的分隔符

SEP.

string

upper(string A)

ucase(string A)

Returns the string resulting from converting

all characters of A to upper case e.g.

upper('fOoBaR') results in 'FOOBAR'

string

lower(string A)

lcase(string A)

Returns the string resulting from converting

all characters of B to lower case e.g.

lower('fOoBaR') results in 'foobar'

string

trim(string A)

Returns the string resulting from trimming

spaces from both ends of A e.g. trim(' foobar

') results in 'foobar'

string

ltrim(string A)

Returns the string resulting from trimming

spaces from the beginning(left hand side) of

A e.g. ltrim(' foobar ') results in 'foobar '

string

rtrim(string A)

Returns the string resulting from trimming

spaces from the end(right hand side) of A

e.g. rtrim(' foobar ') results in ' foobar'

string

regexp_replace(strin

g A, string B, string

C)

Returns the string resulting from replacing

all substrings in B that match the Java

regular expression syntax(See Java regular

expressions syntax) with C e.g.

regexp_replace(

'fb.' Note that some care is necessary in

using predefined character classes: using

's' as the second argument will match the

letter s; '

s' is necessary to match whitespace, etc.

string

regexp_extract(strin

g subject, string

pattern, int index)

Returns the string extracted using the

pattern. e.g. regexp_extract('foothebar',

'foo(.*?)(bar)', 2) returns 'bar.' Note that

some care is necessary in using predefined

character classes: using 's' as the second

argument will match the letter s; '

s' is necessary to match whitespace, etc.

The 'index' parameter is the Java regex

Matcher group() method index. See

docs/api/java/util/regex/ for

more information on the 'index' or Java

regex group() method.

string

parse_url(string

urlString, string

partToExtract [,

string keyToExtract])

Returns the specified part from the URL.

Valid values for partToExtract include

HOST, PATH, QUERY, REF, PROTOCOL,

AUTHORITY, FILE, and USERINFO. e.g.

parse_url('/pa th1/?

k1=v1&k2=v2#Ref1', 'HOST') returns

''. Also a value of a particular

key in QUERY can be extracted by providing

the key as the third argument, e.g.

parse_url('/path1/?

k1=v1&k2=v2#Ref1', 'QUERY', 'k1') returns

'v1'.

string

get_json_object(strin

g json_string, string

path)

Extract json object from a json string based

on json path specified, and return json string

of the extracted json object. It will return null

if the input json string is invalid.

NOTE: The

json path can only have the characters

[0-9a-z_], i.e., no upper-case or special

characters. Also, the keys *cannot

start

with numbers.* This is due to restrictions on

Hive column names.

string

string

space(int n)

repeat(string str, int

n)

返回

n

个空格字符串

返回

n

次重复

str

连接的字符串

int

string

ascii(string str)

lpad(string str, int

len, string pad)

返回

st r

的首个字符的

ascii

码对应数字

Returns str, left-padded with pad to a length

of len

Returns str, right-padded with pad to a

length of len

按照

pat

正则拆分

str (pat

是正则表达式

)

string

rpad(string str, int

len, string pad)

array

split(string str, string

pat)

int

find_in_set(string str,

string strList)

Returns the first occurance of str in strList

where strList is a comma- delimited string.

Returns null if either argument is null.

Returns 0 if the first argument contains any

commas. e.g. find_in_set('ab',

'abc,b,ab,c,def') returns 3

array

ring>>

sentences(string str,

string lang, string

locale)

Tokenizes a string of natural language text

into words and sentences, where each

sentence is broken at the appropriate

sentence boundary and returned as an array

of words. The 'lang' and 'locale' are optional

arguments. e.g. sentences('Hello there! How

are you?') returns ( (

array

ring,double>>

ngrams(array

>, int N, int

K, int pf)

Returns the top-k N-grams from a set of

tokenized sentences, such as those

returned by the sentences() UDAF. See

[Hive-StatisticsAndDataMining] for more

information.

array

ring,double>>

context_ngrams(arra
 y>,

array, int K,

int pf)

Returns the top-k contextual N-grams from a

set of tokenized sentences, given a string of

[Hive- StatisticsAndDataMining] for more

information.

其他函数

Misc. Functions

xpath

The following functions are described in [Hive- LanguageManual-XPathUDF]:

?

xpath, xpath_short, xpath_int, xpath_long, xpath_float, xpath_double, xpath_number,

xpath_string

get_json_object

A limited version of JSONPath is supported:

?

?

?

?

$$ : Root object

. : Child operator

[] : Subscript operator for array

* : Wildcard for []

Syntax not supported that's worth noticing:

?

?

: Zero length string as key

.. : Recursive descent

?

?

?

?

?

@ : Current object/element

() : Script expression

?() : Filter (script) expression.

[,] : Union operator

[start:] : array slice operator

Example: src_json table is a single column (json), single row table:

json

{

:



{

:[{

:8,

:


},{

:9,


:

}],

:{

:19.95,

:

}

},

:

,

:

}

The fields of the json object can be extracted using these queries:

hive>

SELECT

get_json_object(src_,

'$$.owner')

FROM

src_json;

amy

hive>

SELECT

get_json_object(src_,

'$$.[0]')

FROM

src_json;

{

:8,

:

 }

hive> SELECT get_json_object(src_, '$$.non_exist_key') FROM

src_json;

NULL

内置聚合函数

Built-in Aggregate Functions (UDAF)

返回类型

bigint

 函数名称

(

参数

)

count(*), count(expr),

count(DISTINCT expr[,

expr_.])

描述

count(*)

–

返回总行数，包括空

值

NULL; count(expr) - Returns

the number of rows for which

the supplied expression is

non-NULL; count(DISTINCT

expr[, expr]) - Returns the

number of rows for which the

supplied expression(s) are

unique and non-NULL.

double

sum(col),

sum(DISTINCT col)

Returns the sum of the

elements in the group or the

sum of the distinct values of

the column in the group

double

avg(col), avg(DISTINCT

col)

Returns the average of the

elements in the group or the

average of the distinct values

of the column in the group

double

min(col)

Returns the minimum of the

column in the group

double

max(col)

Returns the maximum value of

the column in the group

double

var_pop(col)

Returns the variance of a

numeric column in the group

double

var_samp(col)

Returns the unbiased sample

variance of a numeric column

in the group

double

stddev_pop(col)

Returns the standard deviation

of a numeric column in the

group

double

stddev_samp(col)

Returns the unbiased sample

standard deviation of a

numeric column in the group

double

covar_pop(col1, col2)

Returns the population

covariance of a pair of numeric

columns in the group

double

covar_samp(col1, col2)

Returns the sample covariance

of a pair of a numeric columns

in the group

double

corr(col1, col2)

Returns the Pearson

coefficient of correlation of a

pair of a numeric columns in

the group

double

percentile(col, p)

Returns the exact p^th^

percentile of an integer column

in the group (does not work

with floating point types). p

must be between 0 and 1.

array

percentile(col,

array(p~1,, [, p,,2,,]...))

Returns the exact percentiles

p,,1,,, p,,2,,, ... of an integer

column in the group (does not

work with floating point types).

p,,i~ must be between 0 and 1.

double

percentile_approx(col, p

[, B])

Returns an approximate p^th^

percentile of a numeric column

(including floating point types)

in the group. The B parameter

controls approximation

accuracy at the cost of

memory. Higher values yield

better approximations, and the

default is 10,000. When the

number of distinct values in col

is smaller than B, this gives an

exact percentile value.

array

percentile_approx(col,

array(p~1,, [, p,,2_]...) [,

B])

Same as above, but accepts

and returns an array of

percentile values instead of a

single one.

array

'x','y'

}>

histogram_numeric(col,

b)

Computes a histogram of a

numeric column in the group

using b non-uniformly spaced

bins. The output is an array of

size b of double-valued (x,y)

coordinates that represent the

bin centers and heights

array

collect_set(col)

Returns a set of objects with

duplicate elements eliminated

?

增加

oracle

函数

返回类型

T

函数名称

(

参数

)

Nvl(T1,T2)

描述

如果

T 1

为空，返回

T2

，否则返回

T1

；

其中

T1

和

T2

的类型必须一样。

参数类型同返回值的类型一致。

（

若

参数类型不

一致，

函数的行为不可预见，

可能执

行正确，

可能执行失败

）

String

s1 || s2

如同

concat(s1,s2)

T

nvl2(T1,T2,T3)

T1

为

NULL

，

则返回

T2

，

否则返回

T3

；

其中

T1

、

T2

、

T3

的类型必须一样。

参

数类型同返回值的类型一致。

T

decode (T cond, T params[])

判断

cond

是否与

params

的基数位置

的值是否相等，是则返回偶数位置值；

否则返回

params

最后位置的值。

string

systimestamp

返回当前的系统日期、

时间

（不支持

时区）；输

出时间格式为：

yyyy-MM-dd

hh24:mi:ss:ff3

，当

前仅支持

24

小时制
 

string

sysdate

返回当前的系统日期

,

输

出时间格式为：

yyyy-MM-dd

int

INSTR((strin g,str[,start][

,appea

r])

返回

string

中

str

出现的位置。

start

代表开始搜

索的位置，可选参数，默认为

1

（

1

表示第一个

字符位置）；

appear

指示搜索第几

次出现的，可

选参数，默认为

1

；若没有匹配到，

返回

0

。（

start

和

appear

取值为大于

0

的整数）

int/long/f

loa

t/double

Mod

（

n1,n2

）

返回一个

n1

除以

n2

的余数。
 支持的

类型有

Byte

、

Short

、

In teger

、

Long

、

Float

、

Double

。

返回值的正负和

n1

相关。使用此函

数需要注意

的

2

个问题：

1

、

对小数执行

mod

计算可能会产生

精度丢失，

（如

mod(3.97384626 ,3.14159265

35897384

627

，返回结果为

0.0)

）此问题由

java

环境本身导致，若

需要高精度

的浮点计算，慎用此函数；

2

、

传入比

MAX_LONG

还大的整数作为

参数，

则参数会被自动升级成

Double

类

型，函数也可

以正常计算结果，

但返回的结果是小

数类型。

int

Bitand(expr1,expr2)

指定按位进行

AND

运算的两个数值。

当前只


支持

Byte

、

Shor t

、

Integer

、

Long

类

型。如果

expr1

和

expr2

的位都是

1

，

相应的结果位就

是
1;

否

则相应的结果位是

0

。若

expr1

和

expr2

类型不

一样，直接进行

and

计算

string

next_day(date,day)

计算给出日期

date

之后的下一个星

期

day

的日

期。

day

是数字，

1-7

分别表示星期

日

-

六；返

回日期的格式为“

YYYY-MM- DD

”

int

to_number(string)

将给出的字符转换为数字；

string

必须为全数字

串。

（

Oracle

中的

to_number

很复

杂，可变参数

且支持多种类型。

当前仅支持整数类

型，包括

short

、

int

、

long

）

string

to_char(date,format)

将日期

date

转化为一个字符串；

date

的格式固

定为

yyyy-mm-dd hh24:mi:ss:ff3

，

输出的格式由

format

指定。

format

当前支持的格

式如下（不

区分大小写）：

yyyymmdd

，

年月日；

yyyymm

，年月；

mm

，月

dd

，日

yyyy-mm-dd

yyyy-mm
yyyymmddhh24miss

，年月日时分秒

（

24

小时制）

yyyy-mm-dd hh24:mi:ss

hh24miss

yyyymmddhh24missff3

，年月日时分

秒毫秒（

24

小时制）

-

-

-

-

-

-

-

-

本文更新与2021-02-08 07:52，由作者提供，不代表本网站立场，转载请注明出处：https://www.bjmy2z.cn/gaokao/613126.html

返回列表：英语

上一篇：PS术语及操作指令中英对照
下一篇：全新版大学英语综合教程4课后词汇填空及翻译

当前您在：主页 > 英语 >

Hive函数及语法说明

-

-

-

-

-

-

-

-

-

返回列表：英语

Hive函数及语法说明的相关文章

爱心与尊严的高中作文题库

爱心与尊严高中作文题库

爱心与尊重的作文题库

爱心责任100字作文题库

爱心责任心的作文题库

爱心责任作文题库

爱心长在作文题库

爱心中国感恩励志作文题

爱心助考作文题库

爱心助农作文题库

爱心尊重宽容拒绝作文题

爱心尊重作文题库

爱心作文题库好段

爱心作文题库120字

爱心作文题库读者

爱心作文题库分论点

爱心作文题库简短

爱心作文有哪些题库

爱需要被尊重作文题库

爱需要传递200字作文题库

爱需要公平作文题库

爱需要行动作文800高中作

爱需要行动作文题库

爱需要交流与沟通作文题

当前您在： 主页 > 英语 >

-

-

-

-

-

-

-

-

-

Hive函数及语法说明的相关文章

当前您在：主页 > 英语 >