1.背景
统计连续登录天数超过3天的用户,输出信息包括:用户id,登录天数,起始时间,结束时间;
2.准备数据
-- 建表
create table if not exists user_login_3days(
user_id STRING,
login_date date
);
--插入数据
insert into user_login_3days values ('01','2023-08-02');
insert into user_login_3days values ('01','2023-08-03');
insert into user_login_3days values ('01','2023-08-04');
insert into user_login_3days values ('01','2023-11-02');
insert into user_login_3days values ('01','2023-12-09');
insert into user_login_3days values ('02','2023-01-01');
insert into user_login_3days values ('02','2023-04-23');
insert into user_login_3days values ('03','2023-09-10');
insert into user_login_3days values ('03','2023-09-11');
insert into user_login_3days values ('03','2023-09-12');
insert into user_login_3days values ('04','2023-04-23');
insert into user_login_3days values ('04','2023-04-24');
insert into user_login_3days values ('05','2023-09-11');
insert into user_login_3days values ('06','2023-09-12');
-- 查询数据数据
select * from user_login_3days order by user_id;
3.解决思路以及实现
思路1:row_number()
- 1.通过对用户id进行开窗函数row_number,对登陆时间进行降序排列
- 2.使用date_sub(login_date,rn)函数进行日期求出差值日期
- 3.对user_id和diff_date分组求出时间的区间范围
- 4.对结果进行过滤操作
SELECT
t2.user_id,
count(1) as login_times,
min(t2.login_date) as start_date,
max(t2.login_date) as end_date
FROM
(
SELECT
t1.user_id,
t1.login_date,
date_sub(t1.login_date,rn) as diff_date
FROM
(
SELECT
user_id,
login_date,
row_number() over(partition by user_id order by login_date asc) as rn
FROM user_login_3days
) t1
) t2
group by t2.user_id, t2.diff_date
having login_times >= 3;
思路2:lag()/lead()文章来源:https://www.toymoban.com/news/detail-671849.html
- 1.通过对用户id进行开窗函数lag/lead,求出前面第二个的日期与当前的日期差以及后面一个日期与当前日期的差值
- 2.对结果进行过滤操作
SELECT
user_id,
lag_login_date,
login_date
FROM
(SELECT
user_id,
login_date,
lag(login_date,2,login_date) over(partition by user_id order by login_date) as lag_login_date,
lead(login_date,1,login_date) over(partition by user_id order by login_date) as lead_login_date
FROM user_login_3days
) t1
where datediff(login_date,lag_login_date) =2
4.总结
连续登陆问题解决的关键在于:如何判断连续?
通过对user_id分组排序后,使用登陆日期减去序号rn。如果连续,则得到的这个日期会相同。文章来源地址https://www.toymoban.com/news/detail-671849.html
到了这里,关于HQL解决连续三天登陆问题的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!