Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()—

这篇具有很好参考价值的文章主要介绍了Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

前言：在对表数据进行批量处理过程中，常常碰上某个字段是一个array或者map形式的字段，一列数据的该字段信息同时存在多个值，当我们需要取出该数组中的每一个值实现一一对应关系的时候，可以考虑使用lateral view explode()/posexplode() 进行处理。

一、提要：explode()本身是Hive的自带函数，使用它可以将array或者map中的值逐行输出。

select explode(array('a','b','c','d','e'));

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

select explode(map('A','a','B','b','C','c'));

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

二、应用：lateral view explode()在工作环境中的使用。（lateral view用于和split、explode等UDTF一起使用的，能将一行数据拆分成多行数据，在此基础上可以对拆分的数据进行聚合，lateral view首先为原始表的每行调用UDTF，UDTF会把一行拆分成一行或者多行，lateral view在把结果组合，产生一个支持别名表的临时表。）

如下国家地区表dim_country，需要实现将shortname（array类型）中的国家简写逐个拆分出来实现与id及level的一一对应。

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

如果只是用explode()会出现报错，因为explode()只能实现将shortname单字段进行炸开，一行变多行，但是id和level只有一行，故无法形成新表，这时候就需要用到lateral view。使用lateral view explode()可以将炸出的多行以临时表tmp的形式存在，其它的字段会进行相应的复制，实现与tmp的一一对应。

select id,level,country 
from dim_country
lateral view explode(shortname)tmp as country ;

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

（tips:即使数组中各元素相同，也都会一一列出）

三、补充：split()函数的应用

如果需要lateral view explode()的字段类型是string，则需要用split()函数将之拆分成数组，然后再用lateral view explode()处理。同上dim_country表，如果shortname是string类型，则HQL如下,结果相同。

select id,level,country 
from dim_country
lateral view explode(split(shortname,’,’))tmp as country ;

正常数组类型两侧会带有[]，但有些时候也可能是string类型，如果是string类型且两侧带有[]，则需要先用substring()将两侧的[]去掉，可以考虑用如下方式：

SELECT regexp_replace(substr(shortname,2),substr(substr(shortname,2),-1,1),'');
或 
SELECT substr(shortname,2,length(shortname)-2);

&相关资料：试验过程的建表及插入SQL：

建表语句：
CREATE table dim_country (id bigint,level int,shortname ARRAY<string>);
插入语句（两种形式）：
insert into table dim_country values (1,1,array('US','CA','UK'));
insert into dim_country(id,level,shortname) select 1,1,array('US','CA','AE');
insert into dim_country(id,level,shortname) select 1,1,array('US','US');

四、进阶：lateral view explode()在map类型上的使用问题及解决方法（lateral view posexplod() ）。

1、建表

CREATE table dim_map (id bigint,level int,shortname MAP<string,string>)
ROW FORMAT DELIMITED fields terminated by '\t'
collection items terminated by ','
map keys terminated by ':';

2、插入数据

insert into table dim_map 
values (1,1,str_to_map('yuwen:aaa,shuxue:bbb,yingyu:ccc,wuli:ddd,huaxue:eee'));

insert into dim_map(id,level,shortname) 
values (1,1,str_to_map('yuwen:100,shuxue:98,yingyu:88,wuli:89,huaxue:100'));

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

3、explode()实现对map类型字段的炸开，形成key和value两段（但只能是map类型字段本身，不能另加别的字段）

SELECT explode(shortname) from dim_map;

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

4、如果要加另外的字段，需要用到lateral view posexplod() 函数。

SELECT id,level,single_id,single_time from temp.dim_map
LATERAL VIEW posexplode(map_keys(shortname))t as single_id_index,single_id
LATERAL VIEW posexplode(map_values(shortname))t 
as single_time_index,single_time
where single_id_index = single_time_index;

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

posexplode()可以炸开一个数组，会形成两列，一列是自增序列，一列是炸出来的值。上诉方法则是通过自增序列实现一一对应结合。

posexplode()可以通过与split(space(n))结合，可以实现插入固定的行数。

5、补充：map_key()和map_value()可以直接对取出一个map中的所有key和所有value

SELECT id,level,map_keys(shortname), map_values(shortname) from temp.dim_map;

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

五、使用lateral view注意事项：

1. lateral view使用在from后，where前，参照如下格式；

lateral view explode(数组字段)临时表名 as 别名 ;

2. from后可带多个lateral view；

3. 如果要拆分的字段有null值，需要使用lateral view outer 替代，避免数据缺失。

参考链接：

Hive SQL中的lateral view explode_IT农民工1的博客-CSDN博客

http://t.zoukankan.com/share23-p-10173311.html

Hive-sql中的explode()函数和posexplode()函数_sql中explode_有语忆语的博客-CSDN博客文章来源地址https://www.toymoban.com/news/detail-442873.html

到了这里，关于Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数的文章就介绍完了。如果您还想了解更多内容，请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章，希望大家以后多多支持TOY模板网！

Hive SQL 中ARRAY或MAP类型数据处理：lateral view explode()/posexplode()——行转列函数

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏

支付宝扫一扫领取红包，优惠每天领

二维码1

二维码2