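I have the following data in a single column:
| Test Name |
| ------ |
| Age: 40 |
| City, Department |
| University Name |
| Subject Name |
| Match Percentage 75% |
| Phone Number" |
I need to split this data and insert it into separate columns, like the ones below, using SQL: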
NAME AGE CITY NAME_OF_UNI NAME_OF_SUB MATCH_PCT PHONE_NUM
1 Answer
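It is hard (at least for me) to tell what the sample data really looks like: one line has a colon (as the name-value separator), others have nothing at all, and one has a double quote. So I adjusted it slightly and used colons everywhere. Sample data (two rows, each holding six newline-separated name:value lines):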
SQL> select * From test;
COL
--------------------------------------------------------------------------------
Age:40
City:
University Name:
Subject Name:
Match Percentage: 75%
Phone Number:

Age:12
City: Zagreb
University Name: FER
Subject Name: Maths
Match Percentage: 23%
Phone Number: 003851123456
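Query: first split each column value into separate rows (one per line), then apply a simple case expression that returns whatever follows the colon character.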
SQL> with tsplit as
       -- one row per line of COL; LVL is the line's position (1 to 6) within the value
       (select rowid rid,
               column_value lvl,
               trim(regexp_substr(col, '[^' || chr(10) ||']+', 1, column_value)) val
        from test cross join
          table(cast(multiset(select level from dual
                              connect by level <= 6
                             ) as sys.odcinumberlist))
       )
     -- pivot the lines back into columns, keeping only the text after the colon
     select
       max(case when lvl = 1 then substr(val, instr(val, ':') + 1) end) age,
       max(case when lvl = 2 then substr(val, instr(val, ':') + 1) end) city,
       max(case when lvl = 3 then substr(val, instr(val, ':') + 1) end) university,
       max(case when lvl = 4 then substr(val, instr(val, ':') + 1) end) subject,
       max(case when lvl = 5 then substr(val, instr(val, ':') + 1) end) match,
       max(case when lvl = 6 then substr(val, instr(val, ':') + 1) end) phone
     from tsplit
     group by rid;

AGE CITY       UNIVERSITY   SUBJECT    MATCH      PHONE
--- ---------- ------------ ---------- ---------- --------------
40                                      75%
12   Zagreb     FER          Maths      23%        003851123456
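The question also asks for the split values to be inserted into separate columns. A minimal sketch of that step follows, assuming a hypothetical target table named target_tab: its column names are taken from the question, while the table name, data types and lengths are assumptions. The NAME column is left out because the sample data contains no name line. The values are additionally wrapped in TRIM so that the space after the colon is not stored.

```sql
-- Hypothetical target table: column names come from the question,
-- the table name, data types and lengths are assumptions.
create table target_tab
  (age          varchar2(10),
   city         varchar2(30),
   name_of_uni  varchar2(50),
   name_of_sub  varchar2(50),
   match_pct    varchar2(10),
   phone_num    varchar2(20));

-- Same split-and-pivot query as above, used as the source of the insert.
-- It reads from the TEST table shown in the sample data.
insert into target_tab
  (age, city, name_of_uni, name_of_sub, match_pct, phone_num)
with tsplit as
  (select rowid rid,
          column_value lvl,
          trim(regexp_substr(col, '[^' || chr(10) || ']+', 1, column_value)) val
   from test cross join
     table(cast(multiset(select level from dual
                         connect by level <= 6
                        ) as sys.odcinumberlist))
  )
select
  max(case when lvl = 1 then trim(substr(val, instr(val, ':') + 1)) end),
  max(case when lvl = 2 then trim(substr(val, instr(val, ':') + 1)) end),
  max(case when lvl = 3 then trim(substr(val, instr(val, ':') + 1)) end),
  max(case when lvl = 4 then trim(substr(val, instr(val, ':') + 1)) end),
  max(case when lvl = 5 then trim(substr(val, instr(val, ':') + 1)) end),
  max(case when lvl = 6 then trim(substr(val, instr(val, ':') + 1)) end)
from tsplit
group by rid;
```

Everything is stored as text here. Converting AGE or MATCH_PCT to numbers would need extra handling (for example, stripping the % sign), which is not covered by this sketch.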
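If my assumption is wrong, adjust the query so that, instead of searching for the colon, it searches for each "name" string separately. As a general idea, though, this might be one option.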
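One possible shape of that adjustment is sketched below. It is not part of the original answer: it swaps the split-and-pivot approach for one REGEXP_SUBSTR call per name, anchored to the start of a line with the 'm' match parameter, and it assumes the labels are spelled exactly as in the adjusted sample data. The TEST CTE only recreates the sample rows so that the statement is self-contained. The pattern accepts an optional colon and spaces after the label, so a line such as "Match Percentage 75%" without a colon should be handled as well.

```sql
-- Sample rows recreated in a CTE (an assumption; the answer uses a real table named TEST)
with test (col) as
  (select 'Age:40' || chr(10) || 'City:' || chr(10) ||
          'University Name:' || chr(10) || 'Subject Name:' || chr(10) ||
          'Match Percentage: 75%' || chr(10) || 'Phone Number:'
   from dual
   union all
   select 'Age:12' || chr(10) || 'City: Zagreb' || chr(10) ||
          'University Name: FER' || chr(10) || 'Subject Name: Maths' || chr(10) ||
          'Match Percentage: 23%' || chr(10) || 'Phone Number: 003851123456'
   from dual
  )
-- For each label: find it at the start of a line ('m' makes ^ match at every line start),
-- skip any colons and spaces, and return the rest of that line (subexpression 1).
select
  regexp_substr(col, '^Age[: ]*([^' || chr(10) || ']*)',              1, 1, 'm', 1) age,
  regexp_substr(col, '^City[: ]*([^' || chr(10) || ']*)',             1, 1, 'm', 1) city,
  regexp_substr(col, '^University Name[: ]*([^' || chr(10) || ']*)',  1, 1, 'm', 1) university,
  regexp_substr(col, '^Subject Name[: ]*([^' || chr(10) || ']*)',     1, 1, 'm', 1) subject,
  regexp_substr(col, '^Match Percentage[: ]*([^' || chr(10) || ']*)', 1, 1, 'm', 1) match_pct,
  regexp_substr(col, '^Phone Number[: ]*([^' || chr(10) || ']*)',     1, 1, 'm', 1) phone
from test;
```

Because each value is located by its label rather than by its position, this variant does not depend on the lines appearing in a fixed order or on every value having exactly six lines.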