Shell: update specific lines and columns with values from another reference file

Posted by 8yparm6h on 2023-06-30 in Shell


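This is a follow-up to my previous thread (Update referenced value on the second line after a matched line), with more advanced requirements. I have a main file, main, in which I want to modify two targets: (1) find the phrase MATCH LINE in main, jump down 2 lines, and replace the 3rd column with the 2nd column of the ref file; (2) if a line contains the phrase write output, replace its 4th column with the same value. So ref has two columns: the first is the output file name and the second is the replacement value. See the example and desired output below.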

Main file

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this *** to be updated
write output label ***
another line here

Reference file

Out1 ONE
Out2 TWO
Out3 THREE

Desired output file 1 (Out1)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

Desired output file 2 (Out2)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

Desired output file 3 (Out3)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here

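My script comes from Ed Morton (@ed-morton), who kindly helped me in the previous thread. I modified it to fit the new requirements, but it gives me an error. Thanks for your help.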

#!/bin/awk -f

NR == FNR {
    lines[++numLines] = $0
    a[NR]=$2
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            #sub(/NUMBER/,$2,line)
            line[$3]=a[FNR]
        }
        if ( lineNr == tgt2 ) {
            line[$4]=a[FNR]
        }
        print line > $1
    }
    close($1)
}

./tst.awk main ref

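Error:
attempt to use scalar "line" as an array
Ed suggested splitting the line into an array, replacing the right index, and stitching the pieces back together, but the output looks odd. Below are the updated script and its output.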

#!/bin/awk -f

NR == FNR {
    lines[++numLines] = $0
    a[NR]=$2
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            #sub(/NUMBER/,$2,line)
            numFlds = split(line,flds)
            flds[3] = a[FNR]
            for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
                line = (fldNr==1 ? "" : line " ") flds[fldNr]
            }
        }
        if ( lineNr == tgt2 ) {
            numFlds = split(line,flds)
            flds[4] = a[FNR]
            for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
                line = (fldNr==1 ? "" : line " ") flds[fldNr]
            }
        }
        print line > $1
    }
    close($1)
}

Output

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this line to be updated
write output label line
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this is to be updated
write output label is
another line here

Source: https://stackoverflow.com/questions/76549073/update-specific-lines-and-columns-with-values-from-another-reference-file

3 Answers

c7rzv4ha1#

In any POSIX awk, change:

line[$3]=a[FNR]

to:

match(line,/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/)
tail = substr(line,RSTART+RLENGTH)
sub(/[^[:space:]]+/,"",tail)
line = substr(line,RSTART,RLENGTH) a[FNR] tail

Likewise for line[$4]=a[FNR], just change the {2} in the match() above to {3}.
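The two match() variants differ only in how many leading fields they skip, so they can be folded into one helper that takes the field number. The following is a minimal sketch, not part of the original answer: the function name rplc_keep and the sample input are made up for illustration, and it uses a loop with [ \t] instead of [[:space:]] and {2} so that it should also run on awks that lack character classes or interval expressions.

```shell
# Replace field number tgt of each input line with val, keeping the original spacing.
# rplc_keep is a hypothetical helper name; the input line is sample data.
result=$(printf 'Now   this\t***  to be updated\n' | awk '
function rplc_keep(str, tgt, val,    head, i) {
    head = ""
    # Move the first tgt-1 fields, with their leading blanks, from str to head.
    for (i = 1; i < tgt; i++) {
        if (!match(str, /^[ \t]*[^ \t]+/)) {
            return head str            # fewer than tgt fields: leave unchanged
        }
        head = head substr(str, 1, RLENGTH)
        str = substr(str, RLENGTH + 1)
    }
    # Locate the target field in what is left and splice val in as a literal string.
    if (!match(str, /[^ \t]+/)) {
        return head str
    }
    return head substr(str, 1, RSTART - 1) val substr(str, RSTART + RLENGTH)
}
{ print rplc_keep($0, 3, "ONE") }
')
printf '%s\n' "$result"
```

With this input, the three spaces and the tab around the third field are kept and only *** is swapped for ONE.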
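As already mentioned in the comments, your error message is because line is a scalar (in this case holding a string) and you are trying to treat it as an array. If you want to treat line as an array, you first have to run split() on it to create a new array from its contents, then assign to an element of the new array, and then recombine the array into a string stored in line.
For example, if you don't care about preserving white space (which can be addressed with GNU awk's 4th arg to split()), you can replace the 3rd field in line with: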

numFlds = split(line,flds)
flds[3] = a[FNR]
line = flds[1]
for ( fldNr=2; fldNr<=numFlds; fldNr++ ) {
    line = line " " flds[fldNr]
}

I used literal string replacement above instead of *sub() so that it works even if a[FNR] contains the backreference metacharacter &.
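To make the & point concrete, the sketch below compares the two approaches on a made-up replacement value, A&B (the value and the variable names viaSub and viaJoin are assumptions for illustration, not from the question). In sub(), an unescaped & in the replacement stands for the matched text, while plain concatenation inserts the value verbatim.

```shell
result=$(printf 'Now this *** to be updated\n' | awk -v val='A&B' '
{
    # sub(): the unescaped "&" in val is expanded to the matched text "***".
    viaSub = $0
    sub(/\*\*\*/, val, viaSub)

    # split() and rejoin: val is inserted as a literal string.
    numFlds = split($0, flds)
    flds[3] = val
    viaJoin = flds[1]
    for (fldNr = 2; fldNr <= numFlds; fldNr++) {
        viaJoin = viaJoin " " flds[fldNr]
    }

    print "sub():  " viaSub
    print "joined: " viaJoin
}')
printf '%s\n' "$result"
```

The sub() line comes out as "Now this A***B to be updated", whereas the joined line keeps A&B intact.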
Also, when trying to modify my previous answer to solve the current problem, you introduced a logic error by changing:

NR==FNR {...; next }
{ ...sub(/NUMBER/,$2,line) }

to:

NR==FNR { a[FNR]=$2; next }
{ ...line[$3]=a[FNR]... }

instead of:

NR==FNR {...; next }
{ ...line[$3]=$2... }
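
The difference between a[FNR] and $2 can be checked by printing both while ref is being read. This is a small sketch using the first three lines of main and the ref file from the question; the temporary directory and the output format are made up for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'one line here' "This is the 'MATCH LINE'" '# this is just a comment' > "$tmp/main"
printf '%s\n' 'Out1 ONE' 'Out2 TWO' 'Out3 THREE' > "$tmp/ref"

# While reading main (NR==FNR), a[FNR] stores field 2 of each main line.
# While reading ref, FNR restarts at 1, so a[FNR] is still a value taken from main.
result=$(awk '
NR == FNR { a[FNR] = $2; next }
{ print $1 ": a[FNR]=" a[FNR] "  $2=" $2 }
' "$tmp/main" "$tmp/ref")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The a[FNR] values printed for Out1 and Out2 are line and is, the same stray words that appear in the Out1 and Out2 files shown in the question.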

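What you did is completely different logic: it replaces part of line with a string from main rather than a string from ref. After fleshing this out and moving the common code into a function, here is the full script for your current problem: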

$ cat tst.awk
NR == FNR {
    lines[++numLines] = $0
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            line = rplc(line,3,$2)
        }
        if ( lineNr == tgt2 ) {
            line = rplc(line,4,$2)
        }
        print line > $1
    }
    close($1)
}

function rplc(str,tgt,val,      numFlds,flds,fldNr) {
    numFlds = split(str,flds)    # split the str argument, not the global line
    if ( tgt > numFlds ) {
        numFlds = tgt
    }
    flds[tgt] = val
    str = flds[1]
    for ( fldNr=2; fldNr<=numFlds; fldNr++ ) {
        str = str " " flds[fldNr]
    }
    return str
}

$ awk -f tst.awk main ref

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here

2023-06-30

g6ll5ycj2#

#!/bin/awk -f

function join(array, start, end, sep,    result, i)
{
    if (sep == "")
       sep = " "
    else if (sep == SUBSEP) # magic value
       sep = ""
    result = array[start]
    for (i = start + 1; i <= end; i++)
        result = result sep array[i]
    return result
}
/\047MATCH LINE\047/{
    mline = NR+2
}
FNR==NR{
    main[NR] = $0
    next 
}
{
    out = $1
    for (i=1; i<=length(main); i++){
        if(i == mline){
           n=split(main[i], a, " ") 
           a[3]=$2
           print join(a, 1, n) > out
        }else if (i == mline+1 && main[i] ~ /write output label .*/) {
            n=split(main[i], a, " ") 
            a[4]=$2
            print join(a, 1, n) > out
        }else{
            print main[i] > out
        }
    }
    close(out)
}

./tst.awk main ref

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here
    
==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here

2023-06-30

fnx2tebb3#

One awk idea:

awk '
NR == FNR {
    if ( /MATCH LINE/              ) tgt = FNR + 2
    if ( FNR == tgt                ) $3  = "REPLACE_ME"       # replace 3rd field with a string that you know does not exist in main
    if ( tgt > 0 && /write output/ ) $4  = "REPLACE_ME"       # replace 4th field with the same dummy replacement string

    template = template (template != "" ? ORS : "") $0        # add current line to our template block of text
    next
}
{ if ( tgt > 0 ) {                                            # if "MATCH LINE" exists then ...
     template_copy = template                                 # copy template
     gsub(/REPLACE_ME/,$2,template_copy)                      # perform replacements against "template_copy"
     print template_copy > $1                                 # print "template_copy" to output file "$1"
     close($1)                                                # close file descriptor
  }
}
' main ref

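Notes:

If MATCH LINE does not exist in main, no output files are generated.
If main can contain the string REPLACE_ME, change the code to use a string that you know does not exist in main.
If the fields (in main) are separated by anything other than a single space (e.g., tabs, multiple spaces), this solution will *not* preserve the original spacing (i.e., tabs and multiple spaces are replaced with a single space); preserving the original spacing is doable but requires more code.

This generates: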

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here
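
The spacing caveat in the notes above follows from how awk handles field assignment: assigning to any field makes awk rebuild $0 by joining the fields with OFS, which is a single space by default. A minimal sketch on made-up input:

```shell
# The sample input has three spaces, a tab, and two spaces between its fields.
result=$(printf 'Now   this\t***  to be updated\n' | awk '{ $3 = "REPLACE_ME"; print }')
printf '%s\n' "$result"
```

Every run of blanks in the printed line is collapsed to one space, which is harmless for the single-space main file in the question.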

2023-06-30
