How to sort by repetition count ($n) from highest to lowest in Perl

2q5ifsrm  posted on 2022-11-15 in Perl

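I have this code. It does a good job of finding the lines that are common to multiple files. What I don't know is how to sort the output from the most repetitions to the fewest. I want the output ordered 6,6,5,5,4,3,2 rather than 5,3,2,6,5,4,5,6.
output.txt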

For line --> five
This line occurs 5 times in the following files: - 
a.txt,
b.txt,
c.txt,
d.txt,
e.txt
For line --> three
This line occurs 3 times in the following files: - 
a.txt,
b.txt,
c.txt
For line --> two
This line occurs 2 times in the following files: - 
a.txt,
b.txt
For line --> eight
This line occurs 6 times in the following files: - 
a.txt,
b.txt,
c.txt,
d.txt,
e.txt,
f.txt
For line --> four 
This line occurs 4 times in the following files: - 
a.txt,
b.txt,
c.txt,
d.txt
For line --> six
This line occurs 5 times in the following files: - 
a.txt,
b.txt,
c.txt,
d.txt,
e.txt
For line --> seven
This line occurs 6 times in the following files: - 
a.txt,
b.txt,
c.txt,
d.txt,
e.txt,
f.txt
The total common line between files are 7

Script (Perl)

#!/usr/bin/perl -w
my %hash; 
my $file;
my $fh;
my $count;

for $file (@ARGV) {
    open ($fh, $file) or die "$file: $!\n";
    while(<$fh>) {
        push @{$hash{ $_}}, $file;
    } 
}
for (keys %hash) {
    $n = @{$hash{$_}};
    if(@{$hash{$_}} > 1) {
        $count ++;
        print "\n For line --> $_\n";
        print "This line occurs $n times in the following files: - \n", join(",\n", @{$hash{$_}}), "\n\n";
    }
}
print "The total common line between files are $count\n";  
exit 0;

f4t66c6m  #1

You have to sort the list of keys yourself instead of relying on the arbitrary order that keys returns.

for (map  { $_->[0] }
     sort { $b->[1] <=> $a->[1] }
     map  { [ $_, scalar @{$hash{$_}} ] }
     keys %hash) {
    # ...
}
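
For context, here is a minimal sketch of how that sorted key list could drive the printing loop from the question. The sample contents of %hash and the helper name sorted_keys_st are assumptions made up for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data: line text => list of files containing that line.
my %hash = (
    five  => [qw(a.txt b.txt c.txt d.txt e.txt)],
    two   => [qw(a.txt b.txt)],
    eight => [qw(a.txt b.txt c.txt d.txt e.txt f.txt)],
    once  => [qw(a.txt)],
);

# Schwartzian transform: pair each key with its count, sort the pairs
# by count in descending order, then unwrap the keys again.
sub sorted_keys_st {
    my ($h) = @_;
    return map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] }
           map  { [ $_, scalar @{ $h->{$_} } ] }
           keys %$h;
}

my $count = 0;
for my $line (sorted_keys_st(\%hash)) {
    my $n = @{ $hash{$line} };
    next if $n < 2;    # skip lines that occur only once
    $count++;
    print "For line --> $line\n";
    print "This line occurs $n times in the following files: -\n",
          join(",\n", @{ $hash{$line} }), "\n\n";
}
print "The total common line between files are $count\n";
```

Keys with equal counts still come out in an arbitrary relative order, because the comparison only looks at the count.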

pbwdgjma  #2

You can use the following:

sort { @{ $hash{$b} } <=> @{ $hash{$a} } } keys %hash
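
As a rough illustration, the same comparison can be wrapped in a small helper and run against data shaped like the question's output. The helper names files_upto and by_count_desc, the sample data, and the alphabetical tie-breaker (or $a cmp $b) are all assumptions added here, not part of the answer; the tie-breaker only makes the order of equal counts predictable.

```perl
use strict;
use warnings;

# Hypothetical helper: builds [a.txt, b.txt, ...] with $n entries.
sub files_upto {
    my ($n) = @_;
    return [ map { "$_.txt" } ('a' .. 'f')[0 .. $n - 1] ];
}

# Sample data mirroring the counts shown in the question.
my %hash = (
    five  => files_upto(5),
    three => files_upto(3),
    two   => files_upto(2),
    eight => files_upto(6),
    four  => files_upto(4),
    six   => files_upto(5),
    seven => files_upto(6),
);

# Sort keys by how many files contain the line, highest count first.
# Arrays in numeric context yield their length, so no scalar() is needed.
# Ties are broken alphabetically (an addition for determinism).
sub by_count_desc {
    my ($h) = @_;
    return sort { @{ $h->{$b} } <=> @{ $h->{$a} } or $a cmp $b } keys %$h;
}

for my $line (by_count_desc(\%hash)) {
    my $n = @{ $hash{$line} };
    print "$n $line\n";
}
```

With this sample data the counts come out as 6,6,5,5,4,3,2, which is the order the question asks for.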

You could also use the phenomenal Sort-Key distribution.

use Sort::Key qw( rukeysort );

rukeysort { 0+@{ $hash{$_} } } keys %hash

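Using a Schwartzian transform (ST) was suggested. I don't think that's a good solution.
Without testing, it isn't clear that the ST would actually improve performance here: it introduces extra block calls and memory allocations, so it may well make the program both more complex and slower.
In fact, it isn't clear that the ST is a good solution at all. And if an ST does turn out to be worthwhile, it's better to use Sort::Key wherever possible. It's both simple and fast.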
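
One way to run that test is with the core Benchmark module. The sketch below is only a starting point: the synthetic data (10,000 keys with 1 to 6 files each) and the sub names direct and st are assumptions, and results will vary with Perl version, machine, and the real shape of %hash.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical synthetic data: 10,000 lines, each seen in 1 to 6 files.
my %hash;
for my $i (1 .. 10_000) {
    $hash{"line $i"} = [ ('x.txt') x (1 + $i % 6) ];
}

# Plain sort: the array length is looked up on every comparison.
sub direct {
    my @sorted = sort { @{ $hash{$b} } <=> @{ $hash{$a} } } keys %hash;
    return \@sorted;
}

# Schwartzian transform: the length is computed once per key up front.
sub st {
    my @sorted = map  { $_->[0] }
                 sort { $b->[1] <=> $a->[1] }
                 map  { [ $_, scalar @{ $hash{$_} } ] }
                 keys %hash;
    return \@sorted;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { direct => \&direct, st => \&st });
```

Because the sort key here is just an array length, it is cheap to compute, which is the situation in which an ST has the least to offer.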
