Repairing Data in Hive

Project scenario:

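I accidentally dropped a table in Hive, and a select afterwards showed that the table had no data. I did not want to import the data again, because the data files were still present under the table's directory.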

Problem description

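I had used the keyword date as a column name. It turned out that statements referencing this column fail with an error when run from a shell script, whereas in DataGrip they succeed as long as the name is written in backticks, as `date`.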

I need a shell script to automate this process, so I had to recreate the table with the date column renamed to cur_date, which is not a keyword.
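
To illustrate the keyword issue, the sketch below contrasts the two spellings. It assumes a Hive version in which DATE is a reserved keyword (the default since Hive 1.2.0). The queries are illustrative and do not come from the original post. One plausible cause of the shell failure, stated as an assumption because the error message is not shown, is that backticks inside a double-quoted shell string trigger command substitution, so the shell runs the date command instead of passing the quoted name to Hive.

```sql
-- Old table: the reserved word only parses when quoted with backticks.
select id, `date` from hr_cn.ods_cn_attendance_day_print_full limit 10;

-- Recreated table: cur_date is not a keyword, so no quoting is needed,
-- which keeps the statement safe to embed in a shell script.
select id, cur_date from hr_cn.ods_cn_attendance_day_print_full limit 10;
```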

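Although the Hive table had been dropped by mistake, its data was still present under the HDFS path, because it had been created as an external table.
The original create statement was: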

drop table if exists hr_cn.ods_cn_attendance_day_print_full;
create external table IF NOT EXISTS hr_cn.ods_cn_attendance_day_print_full
(
    id               string comment '',
    staff_id         string comment 'Employee id. Do not use it to join employee data; it is not accurate',
    print_number     string comment 'Clock-in number. Joins employee data via the finger_print_number column of cn_staff',
    `date`           string comment 'Clock-in time', -- keyword used as a column name, so it needs backticks
    type             int,
    status           int,
    comment          string,
    work_time_type   int
) comment 'Configured clock-in times'
    partitioned by (dt string)
    row format delimited fields terminated by '\001'
        NULL DEFINED AS ''
    LOCATION '/warehouse/hr_cn/ods/ods_cn_attendance_day_print_full';

The drop was then executed.

Solution:

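There is no need to collect the data again. Because the table is external, dropping it removed only its metadata from the metastore and left the files under its LOCATION in place. Recreating the table and letting Hive repair its partition metadata is enough.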

1. Recreate the table with the column renamed (date changed to cur_date)

drop table if exists hr_cn.ods_cn_attendance_day_print_full;
create external table IF NOT EXISTS hr_cn.ods_cn_attendance_day_print_full
(
    id               string comment '',
    staff_id         string comment 'Employee id. Do not use it to join employee data; it is not accurate',
    print_number     string comment 'Clock-in number. Joins employee data via the finger_print_number column of cn_staff',
    cur_date         string comment 'Clock-in time', -- renamed from date, so no quoting is needed
    type             int,
    status           int,
    comment          string,
    work_time_type   int
) comment 'Configured clock-in times'
    partitioned by (dt string)
    row format delimited fields terminated by '\001'
        NULL DEFINED AS ''
    LOCATION '/warehouse/hr_cn/ods/ods_cn_attendance_day_print_full';
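
Before running the repair, it can help to confirm what state the recreated table is in. The following is a minimal check, assuming the default behaviour in which Hive does not discover partitions automatically:

```sql
-- Table Type should read EXTERNAL_TABLE, and Location should point at the old data directory.
describe formatted hr_cn.ods_cn_attendance_day_print_full;

-- The metastore has no partitions registered yet for the new table.
show partitions hr_cn.ods_cn_attendance_day_print_full;

-- With no partitions registered, a query sees no rows even though the files still exist on HDFS.
select count(*) from hr_cn.ods_cn_attendance_day_print_full;
```

This is why the select in the scenario above found no data: for a partitioned table, Hive reads only the partitions recorded in the metastore, not whatever directories happen to exist under the table path.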

2. Run Hive's repair command

msck repair table hr_cn.ods_cn_attendance_day_print_full;
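
MSCK REPAIR TABLE scans the table's LOCATION and registers each directory named in the partition format (here dt=<value>) as a partition. To confirm the result, or to register a directory that does not follow that naming, something like the following could be used. The dt values and the partition path are hypothetical placeholders, not taken from the original post.

```sql
-- List the partitions that the repair registered.
show partitions hr_cn.ods_cn_attendance_day_print_full;

-- Spot-check one partition (hypothetical dt value).
select count(*) from hr_cn.ods_cn_attendance_day_print_full where dt = '2023-07-01';

-- Fallback for a directory not named dt=<value>: register it by hand
-- (hypothetical dt value and path under the table's LOCATION).
alter table hr_cn.ods_cn_attendance_day_print_full
    add if not exists partition (dt = '2023-07-02')
    location '/warehouse/hr_cn/ods/ods_cn_attendance_day_print_full/2023-07-02';
```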
