MySQL 百万级/千万级表全量更新-Toy模板网

这篇具有很好参考价值的文章主要介绍了MySQL 百万级/千万级表全量更新。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

业务需求：今天从生成测试环境迁移了一批百万级/千万级表的数据，领导要求将这批数据进行脱敏处理（将真实姓名、电话、邮箱、身份证号等敏感信息进行替换）。迁移数据记录数如下（小于百万级的全量更新不是本文重点）:

表名	表名含义	行记录数
base_house	房屋表	4201183
base_license	预售证表	17653209
base_contract	网签合同	1500579

解决办法：

第一种：使用Update 语句进行全表更新。

结论：update 更新时间慢长，单表百万级数据更新少说几小时起步。这种方案放弃

第二种：使用存储过程，进行批量更新。

实战:

create procedure batch_update_house()
begin
	-- 定义变量
	declare i int default 1;
    -- 批次更新大小 10000
	declare pageSize int default 10000;
	declare j int default 1;
	-- 421 为房屋总记录 数/10000 向上取整
	while i < 421 do
		if i = 1 then
            -- 温馨提示：update 语句不能直接使用limit，必须使用子查询
			update base_house set real_name ="******", real_phone="136****0511"  where id in(select id from (select id from base_house limit  i, pageSize ) as temp);
		else
			set j = i * pageSize + 1;
            -- 温馨提示：update 语句不能直接使用limit，必须使用子查询
			update base_house set real_name ="******", real_phone="136****0511"  where id in(select id from (select id from base_house limit  j, pageSize ) as temp);
		end if;
		set i = i + 1;
	end while;
end

功能说明： batch_update_house 房屋全量更新存储过程。

结论：伴随limit 偏移量量增大，每次花费寻找起始位置行的时间会延长，但能够避免超长时间执行。（入选清洗方案）

第三种：使用中间表，通过大表关联中间表进行update,通过where 条件命中索引，可以提升批量更新寻找起始位置行的时间

第一步：设计房屋清洗中间表：base_middle_house，设计两个字段（id 主键自增长、house_id 房屋关联主键，唯一主键）

create table "base_middle_house" (
    `id` int(11) not null auto_increment comment '主键',
    `house_id` varchar(200) not null comment '房屋关联主键'，
    primary key (`id`),
    unique key `unique_house_id` (`house_id`)
) engine=InnoDB auto_increment=1 default charset=utf8mb4

第二步：创建批量更新存储过程

create procedure batch_update_middle_house()
begin
	-- 定义变量
	declare i int default 1;
	declare pageSize int default 100000;
	declare j int default 1;
	
	while i < 43 do
		if i = 1 then
			update base_house h, base_middle_house t set  h.real_name ="******", h.real_phone="136****0511"  where t.house_id = h.id and t.id >=1 amd t.id <=100000;
		else
			set j = (i -1) * pageSize + 1;
			update base_house h, base_middle_house t set h.real_name ="******", h.real_phone="136****0511"  where t.house_id = h.id and t.id >=j amd t.id <= i*pageSize;
		end if;
		set i = i + 1;
	end while;
end

实战：执行batch_update_middle_house 存储过程，base_house 全表4201183 记录数在1小时10分钟内全部清洗完毕。

结论：第三种方案针对第二种方案对于寻找起始行位置通过索引进行了优化，批量更新的时间也有明细的提示，达到单台服务器每分钟处理5000条记录数据。初步达到领导的要求。（清洗方案s首推）文章来源地址https://www.toymoban.com/news/detail-454553.html

到了这里，关于MySQL 百万级/千万级表全量更新的文章就介绍完了。如果您还想了解更多内容，请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章，希望大家以后多多支持TOY模板网！