I want to convert the data below into rows of at most 4 cells each. Here is a sample of the data.
text = """A | B | Lorem | Ipsum | is | simply | dummy
C | D | text | of | the | printing | and
E | F | typesetting | industry. | Lorem
G | H | more | recently | with | desktop | publishing | software | like | Aldus
I | J | Ipsum | has | been | the | industry's
K | L | standard | dummy | text | ever | since | the | 1500s
M | N | took | a
O | P | scrambled | it | to | make | a | type | specimen | book"""
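I've been asked to convert it so that each row contains no more than 4 cells. Any cells after the fourth should move to a new row that starts with the same first two cells as the original row, and that new row must not exceed 4 cells either. The converted data should look like this: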
A | B | Lorem | Ipsum
A | B | is | simply
A | B | dummy
C | D | text | of
C | D | the | printing
C | D | and
E | F | typesetting | industry.
E | F | Lorem
G | H | more | recently
G | H | with | desktop
G | H | publishing | software
G | H | like | Aldus
.
.
and so on...
I've tried a few things on my own, but I'm not even halfway there. The code below is incomplete.
new_text = ""
for i in text.split('\n'):
    row = i.split(' | ')
    if len(row) == 4:
        new_text = new_text + i + '\n'
    elif len(row) > 4:
        for j in range(len(row)):
            if j < 3:
                new_text = new_text + row[0] + ' | ' + row[1] + ...
I can't figure out the logic for reusing the first two cells when a row has more than 4 cells.
2 answers
bksxznpy1#
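You can split each input line into cells, keep the first two as a prefix, and then process the remaining cells 2 at a time. Possible code: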
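A minimal sketch of that idea, not necessarily the answerer's exact code. The function name `reshape` and its parameters `sep`, `key_cells`, and `max_cells` are hypothetical names chosen for illustration, and the sample is a shortened copy of the question's data.

```python
# Shortened sample taken from the question.
text = """A | B | Lorem | Ipsum | is | simply | dummy
C | D | text | of | the | printing | and
M | N | took | a"""


def reshape(text, sep=" | ", key_cells=2, max_cells=4):
    """Repeat the first `key_cells` cells in front of every chunk of the rest.

    Assumption: every line has at least one cell after the key cells;
    a line holding only the key cells would produce no output row.
    """
    step = max_cells - key_cells  # payload cells per output row (2 here)
    out = []
    for line in text.splitlines():
        cells = line.split(sep)
        key, rest = cells[:key_cells], cells[key_cells:]
        # Walk the remaining cells `step` at a time, prefixing each chunk.
        for i in range(0, len(rest), step):
            out.append(sep.join(key + rest[i:i + step]))
    return "\n".join(out)


new_text = reshape(text)
print(new_text)
```

Slicing with `rest[i:i + step]` handles a short final chunk on its own, so rows such as `A | B | dummy` need no special case.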
It gives the expected result.
ivqmmu1c2#
I would use pandas for this task:
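One way this could look with pandas, as a sketch rather than the answerer's original code. It assumes pandas is installed and that every line has at least one cell after the first two. The column names `key` and `chunk` are made up for illustration, and the sample is a shortened copy of the question's data.

```python
import pandas as pd

# Shortened sample taken from the question.
text = """A | B | Lorem | Ipsum | is | simply | dummy
E | F | typesetting | industry. | Lorem
M | N | took | a"""

# Split with plain str.split so " | " is treated literally, not as a regex.
rows = [line.split(" | ") for line in text.splitlines()]

df = pd.DataFrame({
    # The first two cells, repeated on every output row.
    "key": [" | ".join(r[:2]) for r in rows],
    # The remaining cells grouped in pairs, one list of chunks per input row.
    "chunk": [[" | ".join(r[i:i + 2]) for i in range(2, len(r), 2)] for r in rows],
})

# explode() turns each list element into its own row, repeating "key".
df = df.explode("chunk").reset_index(drop=True)

result = "\n".join(f"{k} | {c}" for k, c in zip(df["key"], df["chunk"]))
print(result)
```

The splitting is done in plain Python because `Series.str.split` may interpret a multi-character pattern such as `" | "` as a regular expression, depending on the pandas version and the `regex` argument.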
Output: