Why is my Map class in Hadoop not emitting any records?

3ks5zfa0  posted 2022-11-28 in Hadoop

I am reading the AirLineData.csv file to count the number of flights per year at a particular airport. The map output for this job shows no records, even though the job reads 100,000 map input records.
Here is my mapper class:

```java
public static class MapClass extends Mapper<LongWritable, Text, IntWritable, Text>
{
    public void map(LongWritable key, Text value, Context context)
    {
        try {
            String[] str = value.toString().split(",");
            //String dummy_column = str[0]; // value
            int int_year = Integer.parseInt(str[1]); // key
            context.write(new IntWritable(int_year), new Text(str[0])); // key and value
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

Here is my driver class:

```java
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    //conf.set("name", "value")
    //conf.set("mapreduce.input.fileinputformat.split.minsize", "134217728");
    Job job = Job.getInstance(conf, "Frequency count of flight");
    job.setJarByClass(FlightFrequency.class);
    job.setMapperClass(MapClass.class);
    //job.setCombinerClass(ReduceClass.class);
    job.setReducerClass(ReduceClass.class);
    job.setNumReduceTasks(1);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
```

And here is the job output:

```
[bigcdac43211@ip-10-1-1-204 ~]$ hadoop jar myjar.jar AirData training/AirLineData.csv training/out8
WARNING: Use "yarn jar" to launch YARN applications.
22/11/24 08:16:58 INFO client.RMProxy: Connecting to ResourceManager at ip-10-1-1-204.ap-south-1.compute.internal/10.1.1.204:8032
22/11/24 08:16:58 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
22/11/24 08:16:58 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/bigcdac43211/.staging/job_1663041244711_12176
22/11/24 08:16:59 INFO input.FileInputFormat: Total input files to process : 1
22/11/24 08:16:59 INFO mapreduce.JobSubmitter: number of splits:1
22/11/24 08:16:59 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
22/11/24 08:16:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1663041244711_12176
22/11/24 08:16:59 INFO mapreduce.JobSubmitter: Executing with tokens: []
22/11/24 08:16:59 INFO conf.Configuration: resource-types.xml not found
22/11/24 08:16:59 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/11/24 08:16:59 INFO impl.YarnClientImpl: Submitted application application_1663041244711_12176
22/11/24 08:16:59 INFO mapreduce.Job: The url to track the job: http://ip-10-1-1-204.ap-south-1.compute.internal:6066/proxy/application_1663041244711_12176/
22/11/24 08:16:59 INFO mapreduce.Job: Running job: job_1663041244711_12176
22/11/24 08:17:06 INFO mapreduce.Job: Job job_1663041244711_12176 running in uber mode : false
22/11/24 08:17:06 INFO mapreduce.Job:  map 0% reduce 0%
22/11/24 08:17:13 INFO mapreduce.Job:  map 100% reduce 0%
22/11/24 08:17:21 INFO mapreduce.Job:  map 100% reduce 100%
22/11/24 08:17:21 INFO mapreduce.Job: Job job_1663041244711_12176 completed successfully
22/11/24 08:17:21 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=20
        FILE: Number of bytes written=445167
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10585174
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
        HDFS: Number of bytes read erasure-coded=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4998
        Total time spent by all reduces in occupied slots (ms)=3389
        Total time spent by all map tasks (ms)=4998
        Total time spent by all reduce tasks (ms)=3389
        Total vcore-milliseconds taken by all map tasks=4998
        Total vcore-milliseconds taken by all reduce tasks=3389
        Total megabyte-milliseconds taken by all map tasks=5117952
        Total megabyte-milliseconds taken by all reduce tasks=3470336
    Map-Reduce Framework
        Map input records=100000
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=16
        Input split bytes=127
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=16
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=175
        CPU time spent (ms)=3850
        Physical memory (bytes) snapshot=771170304
        Virtual memory (bytes) snapshot=5179764736
        Total committed heap usage (bytes)=883425280
        Peak Map Physical memory (bytes)=581619712
        Peak Map Virtual memory (bytes)=2589802496
        Peak Reduce Physical memory (bytes)=189550592
        Peak Reduce Virtual memory (bytes)=2589962240
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=10585047
    File Output Format Counters
        Bytes Written=0
```

As you can see, the input is being read, but my map output records count is 0:

```
Map-Reduce Framework
    Map input records=100000
    Map output records=0
```

Here is my sample data (only a few rows shown); the first two columns are ID and Year:

```
ARY04F1,2004,1,12,1,623,630,901,915,UA,462,N805UA,98,105,80,-14,-7,ORD,CLT,599,7,11,0,,0,0,0,0,0,0
ARY06F48889,2006,1,17,2,1453,1500,1557,1608,US,2176,N752UW,64,68,38,-11,-7,DCA,LGA,214,3,23,0,,0,0,0,0,0,0
ARY08F85465,2008,1,4,5,2037,2015,2144,2120,WN,3743,N276WN,127,125,109,24,22,SLC,OAK,588,8,10,0,,0,0,0,12,0,12
```
wqnecbli:

If an exception is thrown, the context.write line is never reached, which would explain zero map output records.
Open "The url to track the job" from the log and look at the mapper's task logs to check whether the exception message is actually being printed there (note: you should log with SLF4J rather than System.out). One likely candidate: your mapper emits IntWritable keys and Text values, but the driver only declares setOutputKeyClass(Text.class) and setOutputValueClass(LongWritable.class); without matching setMapOutputKeyClass/setMapOutputValueClass calls, context.write throws a type-mismatch IOException that your catch block silently swallows.
It is also not clear why you need to parse the year into an integer. The reducer will happily accept a Text key that holds the year.
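To make the last point concrete, here is a minimal sketch of the parsing step in plain Java with no Hadoop dependencies, so it can be run standalone (the class and method names are made up for illustration). Keeping the year as a String means no Integer.parseInt call can throw, and a malformed line is skipped explicitly rather than disappearing into a catch-all:

```java
// Hypothetical helper mirroring the mapper's parsing logic. In the real
// mapper you would call context.write(new Text(year), new Text(id)) with
// the two returned fields -- no integer parsing required.
public class FlightRecordParser {

    /** Returns {id, year} from one CSV line, or null for a malformed line. */
    public static String[] parseIdAndYear(String line) {
        String[] fields = line.split(",");
        if (fields.length < 2 || fields[1].isEmpty()) {
            return null; // log and skip the bad record instead of swallowing an exception
        }
        return new String[] { fields[0], fields[1] };
    }

    public static void main(String[] args) {
        String[] r = parseIdAndYear("ARY04F1,2004,1,12,1,623,630,901,915,UA,462");
        System.out.println(r[0] + " " + r[1]); // prints: ARY04F1 2004
        System.out.println(parseIdAndYear("") == null); // prints: true
    }
}
```

Counting records that return null here (e.g. with a Hadoop counter) would also tell you quickly whether bad input, rather than a configuration problem, is eating your output.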
