题目1:每日留存率(一个user_register表,另一个是user_active表)统计指定日期新客数及这些新客的次日留存、七日留存
步骤:1.算每日新增客数2. 每日留存数量3.留存率
SELECT register_date, interval, active_num/new_register_num as retention rateFROM (SELECT register_date, count(*) as new_register_numFROM user_registerWHERE register_date ='2020-09-01') as t1JOIN (SELECT register_date,datediff(active_date,register_date) as interval ,count(*) as active_numFROM user_registerJOIN user_activeON user_register.user_id=user_active.user_idWHERE register_date = '2020-09-01' and datediff(active_date,register_date)<=7GROUP BY active_dateORDER BY active_date)as t2ON t1.register_date=t2.register_dateORDER BY interval

题目2: 计算用户留存率,留存率=(昨天活跃用户∩今天活跃用户)/昨天活跃用户
用户活跃表:dws_user_active_t
ds bigint comment ‘日期,分区,yyyymmdd’,
device int comment ‘设备类型’,
user_id bigint comment ‘用户id’,
active_score double comment ‘活跃度’
计算留存率=(昨天活跃用户∩今天活跃用户)/昨天活跃用户
步骤:
1.昨天活跃用户2.昨天活跃用户∩今天活跃用户3./
select t1.ds,'第二天仍活跃用户'/'昨天活跃用户' as '留存率'from (select ds,count(distinct user_id) as '昨天活跃用户'from dws_user_active_twhere ds = '2020-12-11' and active_score > 0) t1join (select d1.ds, count(distinct d2.user_id) as '第二天仍活跃用户'from dws_user_active_t d1join dws_user_active_t d2on datediff(d2.ds,d1.ds)=1 and d1.user_id = d2.user_idwhere d1.ds = '2020-12-11' and d1.active_score > 0 and d2.active_score > 0) t2on t1.ds = t2.ds
优化-相关子查询
select ds,count(distinct user_id) as '活跃用户',(select count(distinct t2.user_id) as '第二天仍活跃用户'from dws_user_active_t t2where datediff(ds,t.ds)=1 and t1.active_score > 0 and t2.active_score > 0),'第二天仍活跃用户'/'昨天活跃用户' as '留存率'from dws_user_active_t t1where ds = '2020-12-11' and active_score > 0;
