Hive Syntax (Part 2)
Loading data with LOAD
Default warehouse path: /opt/soft/hive312/warehouse
Files can also be uploaded directly with hdfs dfs -put
The LOAD statement
Load data [local] inpath 'filepath' [overwrite] into table tablename;
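With local specified
The file path is looked up in the local filesystem.
A relative path is interpreted relative to the user's current working directory.
A full URI can also be given for a local file, for example file:///opt/file.txt
Without local
If filepath is a full URI, Hive uses that URI directly.
If no scheme or authority is given, Hive takes them from the Hadoop configuration parameter fs.default.name (named fs.defaultFS in current Hadoop releases), which holds the NameNode URI.
In this environment, "local" means node1.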
create table if not exists employee(
name string,
workplace array<string>,
gender_age struct<gender:string,age:int>,
skills_score map<string,int>,
depart_title map<string,string>
)
row format delimited fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';
-- Load from the local filesystem (essentially an hdfs dfs -put upload): the file is copied
load data local inpath '/opt/stufile/emp.txt' into table employee;
-- Load from HDFS (essentially an hdfs dfs -mv): the file is moved
load data inpath "hdfspath" into table employee;
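To make the delimiters in the table definition concrete, here is a small sketch. The sample record in the comment is invented for illustration (these notes do not show the contents of /opt/stufile/emp.txt), and the statements assume the employee table defined above.

```sql
-- Hypothetical line in emp.txt that matches the row format of employee:
--   Michael|Montreal,Toronto|Male,30|DB:80|Product:Developer
-- '|' separates fields, ',' separates collection items (array elements,
-- struct members, map entries), ':' separates a map key from its value.

-- Reload the file, replacing the existing rows instead of appending to them
load data local inpath '/opt/stufile/emp.txt' overwrite into table employee;

-- Read individual elements of the complex columns
select name,
       workplace[0]       as first_workplace,  -- array element by 0-based index
       gender_age.gender  as gender,           -- struct field by name
       skills_score['DB'] as db_score          -- map value by key, NULL if the key is absent
from employee;
```

Without overwrite, running the same load twice appends the file a second time, so the rows appear duplicated.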
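Inserting data with INSERT
The approach Hive officially recommends for getting data into a table is to load files, as described above.
Data can also be inserted into a table with the insert syntax, but because insert runs as a MapReduce job underneath, it is very inefficient.
The most common pattern is to insert the result of a query into another table (insert + select).
insert into table table_name select select_expr, ... from table2_name;
Note: the columns returned by the query must match the columns of the target table (same number, same order, compatible types).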
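A minimal sketch of the insert + select pattern follows. The target table employee_city and its columns are hypothetical names introduced here; the source table is the employee table defined earlier.

```sql
-- Hypothetical target table with two columns,
-- so the query below must return exactly two columns
create table if not exists employee_city(
name string,
city string
);

-- Columns are matched by position, not by name
insert into table employee_city
select name, workplace[0]
from employee;

-- insert overwrite replaces the existing contents instead of appending
insert overwrite table employee_city
select name, workplace[0]
from employee;
```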
Querying data with SELECT
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[ORDER BY col_list]
[LIMIT [offset,] rows];
SELECT current_database(); -- show the current database
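As an illustration of how the clauses combine, the query below counts employees per gender. It is only a sketch against the employee table defined earlier; the age threshold and the aliases are arbitrary choices, and the offset form of limit assumes Hive 2.0 or later.

```sql
select gender_age.gender as gender,
       count(*)          as cnt
from employee
where gender_age.age >= 18      -- filter rows before grouping
group by gender_age.gender      -- one output row per gender
order by cnt desc               -- largest group first
limit 0, 5;                     -- skip 0 rows, return at most 5
```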
Creating a partitioned table
create table employee2(
name string,
work_place array<string>,
gender_age struct<gender:string,age:int>,
skills_score map<string,int>,
depart_title map<string,string>
)
partitioned by (age int)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';
partitioned by (age int) means the table is partitioned by the column age.
Loading data into a partitioned table
0: jdbc:hive2://192.168.95.150:10000> load data local inpath '/opt/employee.txt' into table employee2 partition(age=20);
0: jdbc:hive2://192.168.95.150:10000> load data local inpath '/opt/employee.txt' into table employee2 partition(age=30);
Viewing partition information
show partitions employee2;
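Each partition is stored as its own subdirectory of the table directory (for example age=20), and the partition column can be queried like an ordinary column even though it is not stored in the data file. The statements below are a sketch against employee2; the value age=40 is made up for the example.

```sql
-- The filter on the partition column lets Hive read only the age=20 directory
select name, age
from employee2
where age = 20;

-- Partitions can also be added or dropped explicitly
alter table employee2 add if not exists partition (age=40);
alter table employee2 drop if exists partition (age=40);
```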
Partitioning by multiple columns
create table employee3(
name string,
work_place array<string>,
gender_age struct<gender:string,age:int>,
skills_score map<string,int>,
depart_title map<string,string>
)
partitioned by (age int , gender string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n';
Loading data
0: jdbc:hive2://192.168.95.150:10000> load data local inpath '/opt/employee.txt' into table employee3 partition(age=20,gender='0');
0: jdbc:hive2://192.168.95.150:10000> load data local inpath '/opt/employee.txt' into table employee3 partition(age=20,gender='1');
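With two partition columns the directories are nested in the declared order, age first and then gender. A short sketch for checking the result of the two loads above:

```sql
-- Lists one entry per partition; with the loads above the entries
-- should look like age=20/gender=0 and age=20/gender=1
show partitions employee3;

-- Filtering on both partition columns restricts the scan to a single directory
select name, age, gender
from employee3
where age = 20 and gender = '1';
```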
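Tables
Tables are either internal or external.
Internal tables (managed tables)
Stored in HDFS as a subdirectory of the owning database's directory.
The data is fully managed by Hive; dropping the table (its metadata) also deletes the data.
External tables
The data is stored at a specified HDFS path.
Hive does not fully manage the data; dropping the table (its metadata) leaves the data in place.
Creating an external table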
create external table if not exists employee(
name string,
work_place array<string>,
gender_age struct<gender:string,age:int>,
skills_score map<string,int>,
depart_title map<string,string>
)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
location '/tmp/hivedata/employee';
Note: to create an external table, add the keyword external right after create.
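The difference between the two table types can be checked with a sketch like the one below. The name employee_ext is hypothetical: a managed table called employee was already created earlier in these notes, so with if not exists the external definition under that same name would be silently skipped.

```sql
create external table if not exists employee_ext(
name string,
work_place array<string>,
gender_age struct<gender:string,age:int>,
skills_score map<string,int>,
depart_title map<string,string>
)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
location '/tmp/hivedata/employee';

-- The "Table Type" field in the output shows EXTERNAL_TABLE or MANAGED_TABLE
describe formatted employee_ext;

-- Removes only the metadata; the files under /tmp/hivedata/employee stay in HDFS
drop table employee_ext;
```

Because the files survive the drop, running the create statement again makes the same data queryable without reloading it.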