数据准备
a,2017-02-05,200a,2017-02-06,300a,2017-02-07,200a,2017-02-08,400a,2017-02-10,600b,2017-02-05,200b,2017-02-06,300b,2017-02-08,200b,2017-02-09,400b,2017-02-10,600c,2017-01-31,200c,2017-02-01,300c,2017-02-02,200c,2017-02-03,400c,2017-02-10,600a,2017-03-01,200a,2017-03-02,300a,2017-03-03,200a,2017-03-04,400a,2017-03-05,600
求连续N天的销售记录
一 先建一个 ClickHouse 表
create tabletb_shop(name String , ctime Date ,money Float64)engine=MergeTree ---> 引擎primary key (name,ctime) ----> 主键order by (name,ctime) ; ----> 分组
二 导入数据
cat shop.csv | clickhouse-client -q "insert into tb_shop FORMAT CSV"把shop.csv 下的数据 导入到 tb_shop 表中 --> CSV 数据格式select * from tb_shop; 查看表数据
三 把数据转化成数组
selectname,groupArray(ctime) ---> 时间进行排序 转化成数组fromtb_shopgroup by name ;
四 把数组进行索引编号排序
selectname ,groupArray(ctime) arr , ----> 时间进行排序arrayEnumerate(arr) arr_index ----> 求出数组的索引进行编号排序fromtb_shopgroup by name ;
五 把数组和后面的索引值进行拼接 使用 array join 方法
弄了一个子查询
selectname ,ct ,idxfrom(selectname ,groupArray(ctime) arr ,arrayEnumerate(arr) arr_indexfromtb_shopgroup by name)array join ----> 把数组和索引进行拼接arr as ct ,arr_index as idxorder by name ----> 按名字进行排序;
六 查看相同天数的连续的登录天数
selectname ,ct ,idx ,subtractDays(ct , idx) as diff ---> 时间减去索引的个数from(selectname ,groupArray(ctime) arr ,arrayEnumerate(arr) arr_indexfromtb_shopgroup by name)array joinarr as ct ,arr_index as idxorder by name;
七 求出最终结果 连续N天的销售记录
selectname ,count(1) days ---> 登录天数的总和from(selectname ,ct ,idx ,subtractDays(ct , idx) as difffrom(selectname ,groupArray(ctime) arr ,arrayEnumerate(arr) arr_indexfromtb_shopgroup by name)t1array joinarr as ct ,arr_index as idxorder by name)t2group by name , diff ----> name 和 diff 进行分组having days >3 ----> 天数大于3天order by name , days desc -----> name 和days 进行降序排序limit 1 by name ; -----> 名字进行分组后取第一个数据
