基本上,我想把这个(初始)转换成这个(最终)。换句话说,我想
“挤压”初始表,以便每个id只有一条记录
“扩展”初始表,这样我就不会丢失任何信息:从初始表为源和列的每个可能组合创建一个不同的列(创建c1_a,c1_b,…)。
我可以在python(可能是pandas)中将初始表作为csv使用,并手动硬编码初始表和最终表之间的Map。但是,我觉得这个解决方案一点也不优雅,我对sql/sas解决方案更感兴趣。有什么办法吗?
编辑:我想改变什么
+----+--------+------+-----+------+
| ID | source | c1 | c2 | c3 |
+----+--------+------+-----+------+
| 1 | A | 432 | 56 | 1 |
| 1 | B | 53 | 3 | 73 |
| 1 | C | 7 | 342 | 83 |
| 1 | D | 543 | 43 | 73 |
| 2 | A | 8 | 882 | 39 |
| 2 | B | 5 | 54 | 46 |
| 2 | C | 8 | 3 | 2226 |
| 2 | D | 87 | 2 | 45 |
| 3 | A | 93 | 143 | 45 |
| 3 | B | 1023 | 72 | 8 |
| 3 | C | 3 | 3 | 704 |
| 4 | A | 2 | 5 | 0 |
| 4 | B | 78 | 888 | 2 |
| 4 | C | 87 | 23 | 34 |
| 4 | D | 112 | 7 | 712 |
+----+--------+------+-----+------+
进入
+----+------+------+------+------+------+------+------+------+------+------+------+------+
| ID | c1_A | c1_B | c1_C | c1_D | c2_A | c2_B | c2_C | c2_D | c3_A | c3_B | c3_C | c3_D |
+----+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 432 | 53 | 7 | 543 | 56 | 3 | 342 | 43 | 1 | 73 | 83 | 73 |
| 2 | 8 | 5 | 8 | 87 | 882 | 54 | 3 | 2 | 39 | 46 | 2226 | 45 |
| 3 | 93 | 1023 | 3 | | 143 | 72 | 3 | | 45 | 8 | 704 | |
| 4 | 2 | 78 | 87 | 112 | 5 | 888 | 23 | 7 | 0 | 2 | 34 | 712 |
+----+------+------+------+------+------+------+------+------+------+------+------+------+
1条答案
按热度按时间kknvjkwl1#
放弃希望?