推荐 最新
攻城狮无远

【Databend】行列转化:一行变多行和简单分列

数据准备和需求行列转化在实际工作中很常见,其中最常见的有一行变多行,有下面一份数据:drop table if exists fact_suject_data; create table if not exists fact_suject_data ( student_id int null comment '编号', subject_level varchar null comment '科目等级', subject_level_json variant null comment '科目等级json数据' ); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (12,'china e,english d,math e','{"china": "e","english": "d","math": "e"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (2,'china b,english b','{"china": "b","english": "b"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (3,'english a,math c','{"english": "a","math": "c"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (4,'china c,math a','{"china": "c","math": "a"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (5,'china d,english a,math c','{"china": "d","english": "a","math": "c"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (6,'china c,english a,math d','{"china": "c","english": "a","math": "d"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (7,'china a,english e,math b','{"china": "a","english": "e","math": "b"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (8,'china d,english e,math e','{"china": "d","english": "e","math": "e"}'); insert into fact_suject_data(student_id, subject_level,subject_level_json) values (9,'china c,english e,math c','{"china": "c","english": "e","math": "c"}');需求是将学生学科等级和等级分隔成多行,效果如下:生成序列和分隔函数Databend 生成序列有专门的函数 generate_series(, [, <step_interval>]),生成从指定点开始,在另一个指定点结束的数据集,并且可以选择增量值。适用的数据类型有 整数、日期和时间戳。select generate_series as n from generate(1, 10); select generate_series as n from generate(1, 10, 2); +---+ | n | +---+ | 1 | +---+ | 3 | +---+ | 5 | +---+ | 7 | +---+ | 9 | +---+ select generate_series as n from generate_series('2024-01-01'::date, '2024-01-07'::date); +---------------+ | calendar_date | +---------------+ | 2024-01-01 | +---------------+ | 2024-01-02 | +---------------+ | 2024-01-03 | +---------------+ | 2024-01-04 | +---------------+ | 2024-01-05 | +---------------+ | 2024-01-06 | +---------------+ | 2024-01-07 | +---------------+split(<input_string>,):使用指定的分隔符拆分字符串,并将结果部分作为数组返回。split_part(<input_string>,, ):使用指定的分隔符拆分字符串并返回指定的部分。unnest(array):将数组拆分成多行。select subject_level , split(subject_level, ',') as split_char , split_part(subject_level, ',', 1) as part1 from (select 'china e,english d,math e' as subject_level) as a +--------------------------+----------------------------------+------------+ | subject_level | split_char | part1 | +--------------------------+----------------------------------+------------+ | china e,english d,math e | ['china e','english d','math e'] | china e | +--------------------------+----------------------------------+------------+ select subject_level , unnest(split(subject_level, ',')) as unne_char from (select 'china e,english d,math e' as subject_level) as a; +--------------------------+------------+ | subject_level | unne_char | +--------------------------+------------+ | china e,english d,math e | china e | +--------------------------+------------+split_part() 函数与 Mysql 中的 substring_index() 类似。根据分隔符变多行根据上面函数讲解,方法一:我们可以使用 split(<input_string>,) 和 unnest(array) 函数实现。select t1.student_id,t1.subject_level ,unnest(split(t1.subject_level,',')) as subject_level1 from fact_suject_data as t1 order by t1.student_id;方法二:也可以使用 split_part(<input_string>,, ) 单独实现。select t1.student_id , t1.subject_level , t2.n , split_part(t1.subject_level, ',', t2.n) as subject_level1 from fact_suject_data as t1 left join (select generate_series as n from generate_series(1, 30)) t2 on t2.n <= (length(t1.subject_level) - length(replace(t1.subject_level, ',', '')) + 1) order by t1.student_id;通过 generate_series() 生成的序列数值作为 split_part() 的分隔参数即可实现,与 Mysql 行列变换《你想要的都有》中分隔原理一致。JSON 数据简单分列对于 subject_level_json 列数据,我们可以使用 json 独有的函数实现分列透视的效果。select subject_level_json , replace(json_path_query(subject_level_json, '$.china'), '"', '') as china , get(subject_level_json, 'math') as math , get(subject_level_json, 'english') as english from fact_suject_data as t1 order by t1.student_id;总结数据分列和一行变多行的应用非常常见,通过本文的学习,相信基本上能处理类似问题,遇到面试相关问题也能完美解决,赶紧动手实操看看效果吧!!!参考资料:Databend Array Functions:https://docs.databend.com/sql/sql-functions/array-funct

0
0
0
浏览量516
中年复健狗

如何实现物联网设备位置数据高效存储与查询?

物联网设备每隔1s上传一次自身所在位置经纬度,需求要支持任意时间段的历史数据查询,有什么建议的数据库架构方案? 之前类似需求一直用MySQL分表实现的,查询SQL比较复杂,而且查询速度也不快

0
1
0
浏览量117
攻城狮无远

Databend 技巧与实践指南

本专栏将介绍 Databend 的安装配置和使用、数据类型、数据库和表操作、日期时间函数、基础函数应用、行列转化(一行变多行和简单分列,数据透视和逆透视)、merge 的用法、多表联结以及分组集等核心话题。通过学习本专栏,读者将能够掌握 Databend 的基础和高阶技巧和最佳实践,更好地应用 Databend 来解决实际问题

0
0
0
浏览量1193