Finding the average and sorting in ascending order in Pig

rbl8hiat  posted on 2021-06-21 in Pig

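I have a schema with 9 fields and only want to use two of them (the 6th and 7th, i.e. $5 and $6). I want to compute the average of $5 and sort $6 in ascending order. How can I do this? Can someone help me?
Input data: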

N368SW  188 170 175 17  -1  MCO MHT 1142
N360SW  100 115 87  -10 5   MCO MSY 550
N626SW  114 115 90  13  14  MCO MSY 550
N252WN  107 115 84  -10 -2  MCO MSY 550
N355SW  104 115 85  -1  10  MCO MSY 550
N405WN  113 110 96  14  11  MCO ORF 655
N456WN  110 110 92  24  24  MCO ORF 655
N743SW  144 155 124 7   18  MCO PHL 861
N276WN  142 150 129 -2  6   MCO PHL 861
N369SW  153 145 134 30  22  MCO PHL 861
N363SW  151 145 137 5   -1  MCO PHL 861
N346SW  141 150 128 51  60  MCO PHL 861
N785SW  131 145 118 -15 -1  MCO PHL 861
N635SW  144 155 127 -6  5   MCO PHL 861
N242WN  298 300 276 68  70  MCO PHX 1848
N439WN  130 140 111 -4  6   MCO PIT 834
N348SW  140 135 124 7   2   MCO PIT 834
N672SW  136 135 122 9   8   MCO PIT 834
N493WN  151 160 136 -9  0   MCO PVD 1073
N380SW  170 155 155 13  -2  MCO PVD 1073
N705SW  164 160 147 6   2   MCO PVD 1073
N233LV  157 160 143 1   4   MCO PVD 1073
N786SW  156 160 139 6   10  MCO PVD 1073
N280WN  160 160 146 1   1   MCO PVD 1073
N282WN  104 95  81  10  1   MCO RDU 534
N694SW  89  100 77  3   14  MCO RDU 534
N266WN  94  95  82  9   10  MCO RDU 534
N218WN  98  100 77  12  14  MCO RDU 534
N355SW  47  50  35  15  18  MCO RSW 133
N388SW  44  45  30  37  38  MCO RSW 133
N786SW  46  50  31  4   8   MCO RSW 133
N707SA  52  50  33  10  8   MCO RSW 133
N795SW  176 185 153 -9  0   MCO SAT 1040
N402WN  176 185 161 4   13  MCO SAT 1040
N690SW  123 130 107 -1  6   MCO SDF 718
N457WN  135 130 105 20  15  MCO SDF 718
N720WN  144 155 131 13  24  MCO STL 880
N775SW  147 160 135 -6  7   MCO STL 880
N291WN  136 155 122 96  115 MCO STL 880
N247WN  144 155 127 43  54  MCO STL 880
N748SW  179 185 159 -4  2   MDW ABQ 1121
N709SW  176 190 158 21  35  MDW ABQ 1121
N325SW  110 105 97  36  31  MDW ALB 717
N305SW  116 110 90  107 101 MDW ALB 717
N403WN  145 165 128 -6  14  MDW AUS 972
N767SW  136 165 125 59  88  MDW AUS 972
N730SW  118 120 100 28  30  MDW BDL 777

I have written code like this, but it does not work:

a = load '/path/to/file' using PigStorage('\t');
b = foreach a generate (int)$5 as field_a:int,(chararray)$6 as   field_b:chararray;
c = group b all;
d = foreach c generate b.field_b,AVG(b.field_a);
e = order d by field_b ASC;
dump e;

I get an error at the ORDER step:

grunt> a = load '/user/horton/sample_pig_data.txt' using PigStorage('\t');
grunt> b = foreach a generate (int)$5 as fielda:int,(chararray)$6 as fieldb:chararray;
grunt> describe @;
b: {fielda: int,fieldb: chararray}
grunt> c = group b all;
grunt> describe @;
c: {group: chararray,b: {(fielda: int,fieldb: chararray)}}
grunt> d = foreach c generate b.fieldb,AVG(b.fielda);                                                                                                                   
grunt> e = order d by fieldb ;
2017-01-05 15:51:29,623 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025: 
<line 6, column 15> Invalid field projection. Projected field [fieldb] does not exist in schema: :bag{:tuple(fieldb:chararray)},:double.
Details at logfile: /root/pig_1483631021021.log

I want output shaped like this (the values are unrelated to the input data above):

(({(Bharathi),(Komal),(Archana),(Trupthi),(Preethi),(Rajesh),(siddarth),(Rajiv) }, 
  {   (72)   ,  (83) ,   (87)  ,   (75)  ,   (93)  ,  (90)  ,   (78)   ,  (89)  }),83.375)
mnemlml8  1#

If you have found the answer, it is good practice to post it so that others can understand it better.
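
For reference, the ERROR 1025 in the question follows from the schema printed in the message: after `group b all`, the `foreach` yields a relation `d` with two unnamed columns, a bag of `(fieldb)` tuples and a double. `fieldb` exists only inside the bag, so `order d by fieldb` has no top-level field to resolve. Because `group ... all` produces a single row, ordering `d` would not reorder anything anyway; the sort has to happen inside the bag. Below is a minimal sketch of one way to do that with a nested `foreach`, assuming the file really is tab-delimited and reusing the path placeholder and field names from the question. The alias `sorted` is a name introduced here for illustration.

```pig
-- Load the raw file; assumes tab-separated columns, as in the question.
a = LOAD '/path/to/file' USING PigStorage('\t');

-- Keep only the 6th and 7th columns ($5 and $6) and give them types.
b = FOREACH a GENERATE (int)$5 AS field_a:int, (chararray)$6 AS field_b:chararray;

-- Collect every row into a single group so AVG runs over the whole input.
c = GROUP b ALL;

-- Sort inside the grouped bag, then project the sorted columns
-- and compute the average of field_a over the same bag.
d = FOREACH c {
    sorted = ORDER b BY field_b ASC;   -- 'sorted' is an illustrative alias
    GENERATE sorted.field_b, sorted.field_a, AVG(b.field_a);
};

DUMP d;
```

This emits a single tuple holding the two bags and the average, not the exact nested layout shown in the question. Pig bags are formally unordered, so it is worth checking on your Pig version that the order produced by the nested `ORDER` is the one you see in the `DUMP` output.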
