测试数据
- t1 表
select * from t1;
+--------+----------+---------+--------+
| t1.id | t1.name | t1.age | t1.dt |
+--------+----------+---------+--------+
| 1 | aa | 12 | 01 |
| 1 | aa | 12 | 02 |
| 2 | aa | 14 | 01 |
| 2 | bb | 14 | 02 |
| 3 | cc | 16 | 02 |
| NULL | aa | 12 | 01 |
+--------+----------+---------+--------+
- t2 表
select * from t2;
+--------+----------+---------+--------+
| t2.id | t2.name | t2.age | t2.dt |
+--------+----------+---------+--------+
| 1 | 1 | aa | 12 |
| 2 | 1 | aa | 12 |
| 1 | NULL | aa | 12 |
| 1 | 2 | aa | 14 |
| 2 | 2 | bb | 14 |
| 2 | 3 | cc | 16 |
+--------+----------+---------+--------+
关联查询
- t1 left join t2
select * from t1 a left join t2 b on a.id=b.id;
+-------+---------+--------+-------+-------+---------+--------+-------+
| a.id | a.name | a.age | a.dt | b.id | b.name | b.age | b.dt |
+-------+---------+--------+-------+-------+---------+--------+-------+
| 3 | cc | 16 | 02 | NULL | NULL | NULL | NULL |
| NULL | aa | 12 | 01 | NULL | NULL | NULL | NULL |
| 1 | aa | 12 | 01 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 1 | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 2 | aa | 14 |
| 1 | aa | 12 | 02 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 02 | 1 | 1 | aa | 12 |
| 1 | aa | 12 | 02 | 1 | 2 | aa | 14 |
| 2 | aa | 14 | 01 | 2 | 3 | cc | 16 |
| 2 | aa | 14 | 01 | 2 | 1 | aa | 12 |
| 2 | aa | 14 | 01 | 2 | 2 | bb | 14 |
| 2 | bb | 14 | 02 | 2 | 3 | cc | 16 |
| 2 | bb | 14 | 02 | 2 | 1 | aa | 12 |
| 2 | bb | 14 | 02 | 2 | 2 | bb | 14 |
+-------+---------+--------+-------+-------+---------+--------+-------+
- t1 left join t2 on a.id=b.id where a.dt =‘01’,保留表谓词下推,map端提前过滤
select * from t1 a left join t2 b on a.id=b.id where a.dt ='01';
+-------+---------+--------+-------+-------+---------+--------+-------+
| a.id | a.name | a.age | a.dt | b.id | b.name | b.age | b.dt |
+-------+---------+--------+-------+-------+---------+--------+-------+
| NULL | aa | 12 | 01 | NULL | NULL | NULL | NULL |
| 1 | aa | 12 | 01 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 1 | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 2 | aa | 14 |
| 2 | aa | 14 | 01 | 2 | 3 | cc | 16 |
| 2 | aa | 14 | 01 | 2 | 1 | aa | 12 |
| 2 | aa | 14 | 01 | 2 | 2 | bb | 14 |
+-------+---------+--------+-------+-------+---------+--------+-------+
- t1 left join t2 on a.id=b.id and a.dt =‘01’,保留表非谓词下推,reduce端过滤
select * from t1 a left join t2 b on a.id=b.id and a.dt ='01';
+-------+---------+--------+-------+-------+---------+--------+-------+
| a.id | a.name | a.age | a.dt | b.id | b.name | b.age | b.dt |
+-------+---------+--------+-------+-------+---------+--------+-------+
| 1 | aa | 12 | 02 | NULL | NULL | NULL | NULL |
| 2 | bb | 14 | 02 | NULL | NULL | NULL | NULL |
| 3 | cc | 16 | 02 | NULL | NULL | NULL | NULL |
| NULL | aa | 12 | 01 | NULL | NULL | NULL | NULL |
| 1 | aa | 12 | 01 | 1 | 2 | aa | 14 |
| 1 | aa | 12 | 01 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 1 | aa | 12 |
| 2 | aa | 14 | 01 | 2 | 2 | bb | 14 |
| 2 | aa | 14 | 01 | 2 | 3 | cc | 16 |
| 2 | aa | 14 | 01 | 2 | 1 | aa | 12 |
+-------+---------+--------+-------+-------+---------+--------+-------+
说明:保留表应用谓词下推,提前过滤,会把不符合条件的数据提前过滤掉;保留表不应用谓词下推,不提前过滤,只能在join发生时,不符合条件的数据不参与关联计算;
- t1 left join t2 on a.id=b.id and b.dt =‘12’,空表谓词下推,map端过滤
select * from t1 a left join t2 b on a.id=b.id and b.dt ='12';
+-------+---------+--------+-------+-------+---------+--------+-------+
| a.id | a.name | a.age | a.dt | b.id | b.name | b.age | b.dt |
+-------+---------+--------+-------+-------+---------+--------+-------+
| 2 | aa | 14 | 01 | 2 | 1 | aa | 12 |
| 2 | bb | 14 | 02 | 2 | 1 | aa | 12 |
| 3 | cc | 16 | 02 | NULL | NULL | NULL | NULL |
| NULL | aa | 12 | 01 | NULL | NULL | NULL | NULL |
| 1 | aa | 12 | 01 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 1 | aa | 12 |
| 1 | aa | 12 | 02 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 02 | 1 | 1 | aa | 12 |
+-------+---------+--------+-------+-------+---------+--------+-------+
- t1 left join t2 on a.id=b.id where b.dt =‘12’,空表非谓词下推,reduce端过滤
select * from t1 a left join t2 b on a.id=b.id where b.dt ='12';
+-------+---------+--------+-------+-------+---------+--------+-------+
| a.id | a.name | a.age | a.dt | b.id | b.name | b.age | b.dt |
+-------+---------+--------+-------+-------+---------+--------+-------+
| 2 | aa | 14 | 01 | 2 | 1 | aa | 12 |
| 2 | bb | 14 | 02 | 2 | 1 | aa | 12 |
| 1 | aa | 12 | 01 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 01 | 1 | 1 | aa | 12 |
| 1 | aa | 12 | 02 | 1 | NULL | aa | 12 |
| 1 | aa | 12 | 02 | 1 | 1 | aa | 12 |
+-------+---------+--------+-------+-------+---------+--------+-------+
说明:空表应用谓词下推,提前过滤,会把不符合条件的数据提前过滤掉;空表不应用谓词下推,不提前过滤,只能在join完成时,过滤不符合条件的数据;
总结 : 是否应用谓词下推,最后产生的结果往往不同,这里需要特别注意。文章来源:https://www.toymoban.com/news/detail-697366.html
参考:
一文弄懂Hive中谓词下推(on与where的区别)文章来源地址https://www.toymoban.com/news/detail-697366.html
到了这里,关于hive 谓词下推实例分析(on与where的区别)的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!