shell: How does this command execute: sed -n 'N ; s/#\n@/#--@/ ; P ; D' corruptData.txt

Posted by anauzrmj on 2023-11-21 in Shell

I don't understand why this sed command produces the output below.
Shell command: sed -n 'N ; s/#\n@/#--@/ ; P ; D' corruptData.txt
Output:

Header Line#--@#
@Data Line #1
Data Line #2#--@
End of Data Lines#--@

corruptData.txt:

Header Line#
@#
@Data Line #1
Data Line #2#
@
End of Data Lines#
@


I thought the command would execute like this:

step1: read the first line, 'Header Line#', into the pattern space.
step2: execute `N`: append line 2, '@#', to the pattern space.
step3: execute `s/#\n@/#--@/`: the pattern space becomes `Header Line#--@#`, which is a single line.
step4: execute `P`: print the first line of the pattern space, namely `Header Line#--@#`.
step5: execute `D`: delete the first line of the pattern space, so the pattern space is now empty.
step6: execute `N`: read line 3, `@Data Line #1`, into the pattern space. It is now the only data there.
step7: execute `s/#\n@/#--@/ ; P ; D`: print `@Data Line #1` and delete it, so the pattern space is empty again.
step8: execute `N`: read line 4, `Data Line #2#`, into the pattern space. It is now the only data there.
step9: execute `s/#\n@/#--@/ ; P ; D`: print `Data Line #2#` and delete it, so the pattern space is empty again.
....
stepN: execute `N`: read the next line into the pattern space.
stepN+1: execute `s/#\n@/#--@/ ; P ; D`: print that line and delete it from the pattern space.
....


So I expected the output to be:

Header Line#--@#
@Data Line #1
Data Line #2#
@
End of Data Lines#
@


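My prediction goes wrong from step 8 onward: I expected `Data Line #2#` to be printed, but the line actually printed (the third line of output) is `Data Line #2#--@`.
So how does the command really execute, step by step, to produce the actual output?
Thanks.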

Answer #1 (x7rlezfr)

According to the GNU sed manual's description of D:

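**If pattern space contains no newline, start a normal new cycle as if the d command was issued.** Otherwise, delete text in the pattern space up to the first newline, and restart cycle with the resultant pattern space, without reading a new line of input.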
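
A small sketch of that difference, assuming a POSIX-style shell and GNU sed. The four input lines `a` to `d` and the variable names are made up for illustration. With `D`, the second line of each pair stays in the pattern space and starts the next pair; with `d`, both lines are discarded and the next cycle reads fresh input.

```shell
# Same input and same N;P prefix, differing only in the final command.
with_D=$(printf 'a\nb\nc\nd\n' | sed -n 'N;P;D')
with_d=$(printf 'a\nb\nc\nd\n' | sed -n 'N;P;d')

# Flatten each result onto one line for easy comparison.
printf 'D: %s\n' "$(printf '%s' "$with_D" | tr '\n' ' ')"   # D: a b c
printf 'd: %s\n' "$(printf '%s' "$with_d" | tr '\n' ' ')"   # d: a c
```

The `D` run prints every line except the last because each cycle restarts with the leftover line, while the `d` run prints only the first line of each freshly read pair.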

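After the first two lines have been read, joined, and printed, the pattern space contains no newline, so D acts like d.
The next cycle reads `@Data Line #1`; N then appends a newline and `Data Line #2#` to the pattern space, leaving it as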

@Data Line #1\nData Line #2#

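The s does not match, so the first line is printed and deleted from the pattern space, leaving only `Data Line #2#`. The cycle restarts without reading a new line, and N then appends a newline and the next line, `@`, to the pattern space. So now it holds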

Data Line #2#\n@


The s command turns it into

Data Line #2#--@


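It is then printed and deleted; once again D acts like d, because there is no longer a newline. The next cycle then reads `End of Data Lines#` and runs N and the remaining commands.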
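
To check this walkthrough, the sketch below recreates corruptData.txt from the question in a scratch directory and runs the original command. The use of `mktemp` and the variable names are illustrative assumptions, not part of the question.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cat > "$tmpdir/corruptData.txt" <<'EOF'
Header Line#
@#
@Data Line #1
Data Line #2#
@
End of Data Lines#
@
EOF

# The command from the question, unchanged.
actual=$(sed -n 'N ; s/#\n@/#--@/ ; P ; D' "$tmpdir/corruptData.txt")
printf '%s\n' "$actual"
# Prints:
#   Header Line#--@#
#   @Data Line #1
#   Data Line #2#--@
#   End of Data Lines#--@
```

Line 4 of the input is never read at the start of a cycle: it is pulled in by N during the cycle for line 3 and carried over by D, which is why it can be joined with the `@` on line 5.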

Answer #2 (hc8w905p)

This may help you understand (GNU sed):

sed -n 'N;s/#\n@/#--@/;P;D' corruptData.txt --debug
SED PROGRAM:
  N
  s/#\n@/#--@/
  P
  D
INPUT:   'corruptData.txt' line 1
PATTERN: Header Line#
COMMAND: N
PATTERN: Header Line#\n@#
COMMAND: s/#\n@/#--@/
MATCHED REGEX REGISTERS
  regex[0] = 11-14 '#
@'
PATTERN: Header Line#--@#
COMMAND: P
Header Line#--@#
COMMAND: D
INPUT:   'corruptData.txt' line 3
PATTERN: @Data Line #1
COMMAND: N
PATTERN: @Data Line #1\nData Line #2#
COMMAND: s/#\n@/#--@/
PATTERN: @Data Line #1\nData Line #2#
COMMAND: P
@Data Line #1
COMMAND: D
PATTERN: Data Line #2#
COMMAND: N
PATTERN: Data Line #2#\n@
COMMAND: s/#\n@/#--@/
MATCHED REGEX REGISTERS
  regex[0] = 12-15 '#
@'
PATTERN: Data Line #2#--@
COMMAND: P
Data Line #2#--@
COMMAND: D
INPUT:   'corruptData.txt' line 6
PATTERN: End of Data Lines#
COMMAND: N
PATTERN: End of Data Lines#\n@
COMMAND: s/#\n@/#--@/
MATCHED REGEX REGISTERS
  regex[0] = 17-20 '#
@'
PATTERN: End of Data Lines#--@
COMMAND: P
End of Data Lines#--@
COMMAND: D
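
If your sed does not support `--debug` (it is a GNU extension), a rough substitute is the standard `l` command, which prints the pattern space unambiguously: an embedded newline is shown as `\n` and the end is marked with `$`. The sketch below is only an approximation of the trace above; the scratch directory and variable names are assumptions for illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Header Line#' '@#' '@Data Line #1' 'Data Line #2#' '@' \
    'End of Data Lines#' '@' > "$tmpdir/corruptData.txt"

# `l` is placed right after `N`, so each cycle first dumps the two-line
# pattern space and then prints the line emitted by `P`.
trace=$(sed -n 'N ; l ; s/#\n@/#--@/ ; P ; D' "$tmpdir/corruptData.txt")
printf '%s\n' "$trace"
# Prints:
#   Header Line#\n@#$
#   Header Line#--@#
#   @Data Line #1\nData Line #2#$
#   @Data Line #1
#   Data Line #2#\n@$
#   Data Line #2#--@
#   End of Data Lines#\n@$
#   End of Data Lines#--@
```

The fifth line of this trace shows `Data Line #2#` sitting at the front of the pattern space, which only happens because D kept it from the previous cycle.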


Answer #3 (6vl6ewon)

Your script is too complicated. On the first line, append the second line to the pattern buffer, then join the two lines:

$ sed '1{N;s/\n/--/}' corruptData.txt
Header Line#--@#
@Data Line #1
Data Line #2#
@
End of Data Lines#
@
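
Note that this is not a drop-in replacement for the command in the question: it joins only lines 1 and 2, whereas the original also joins the later `#` / `@` pairs. A quick comparison, sketched under the assumption of GNU sed (the scratch file and variable names are illustrative):

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Header Line#' '@#' '@Data Line #1' 'Data Line #2#' '@' \
    'End of Data Lines#' '@' > "$tmpdir/corruptData.txt"

# Output of the command from the question.
original=$(sed -n 'N ; s/#\n@/#--@/ ; P ; D' "$tmpdir/corruptData.txt")
# Output of the shorter command from this answer.
simplified=$(sed '1{N;s/\n/--/}' "$tmpdir/corruptData.txt")

if [ "$original" = "$simplified" ]; then
    echo "same output"
else
    echo "different output"   # this branch runs for the sample file
fi
```

Which of the two is right depends on whether every `#` followed by a line starting with `@` should be joined, or only the header.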

