java hadoop: what is wrong when I run a MapReduce program to count words?

jtoj6r0c  posted on 2023-02-15  in Java

I am learning to use Hadoop. I have an old laptop on which I installed Linux Mint 21, and I managed to install Hadoop.
The basic commands behave correctly: when I run hdfs dfs -ls /, I see:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/dell/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Found 2 items
drwxr-xr-x   - nenn supergroup          0 2023-02-11 15:30 /my_data
drwx------   - nenn supergroup          0 2023-02-11 15:21 /tmp

In my_data I have a txt file. When I run hdfs dfs -ls -R /, here is an excerpt of what I see:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/dell/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
drwxr-xr-x   - nenn supergroup          0 2023-02-11 15:30 /my_data
-rw-r--r--   1 nenn supergroup    1174876 2023-02-11 15:01 /my_data/book1.txt
drwx------   - nenn supergroup          0 2023-02-11 15:21 /tmp
drwx------   - nenn supergroup          0 2023-02-11 15:21 /tmp/hadoop-yarn
drwx------   - nenn supergroup          0 2023-02-11 15:29 /tmp/hadoop-yarn/staging
drwx------   - nenn supergroup          0 2023-02-11 15:21 /tmp/hadoop-yarn/staging/d

I want to run a job that counts the words in book1.txt. I start YARN with ~/hadoop/sbin/start-yarn.sh,
then run hadoop jar /home/nenn/wordcount.jar WordCount /my_data/book1.txt /my_data/output_wordcount
and I see this:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/dell/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
23/02/11 15:33:55 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
23/02/11 15:33:55 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
23/02/11 15:33:56 INFO input.FileInputFormat: Total input paths to process : 1
23/02/11 15:33:56 INFO mapreduce.JobSubmitter: number of splits:1
23/02/11 15:33:56 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1676124615395_0004
23/02/11 15:33:56 INFO impl.YarnClientImpl: Submitted application application_1676124615395_0004
23/02/11 15:33:56 INFO mapreduce.Job: The url to track the job: http://my-computer-05:8088/proxy/application_1676124615395_0004/
23/02/11 15:33:56 INFO mapreduce.Job: Running job: job_1676124615395_0004

The output has been stuck at this point for at least 5 minutes, maybe more. Is the job still counting the words, or has something gone wrong?
The wordcount.jar was given to me by my school, and it worked when I tried it there. But now I want to run it on my own computer, and I don't know whether it is actually working.
Could you help me?
Here is the code:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// defining the class WordCount
public class WordCount {

  // defining the class TokenizerMapper
  // this class is in charge of the mapping process
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{
    // it extends the class Mapper from mapreduce api
    // this mapper takes as input an Object (identifier of the partition) and a Text (the partition of the text)
    // it outputs a Text (a word) and an Integer (1)

    // defining the value to emit
    private final static IntWritable one = new IntWritable(1);
    // initializing the word to emit
    private Text word = new Text();

    // defining the function performed during map
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      // tokenizing the text partition
      StringTokenizer itr = new StringTokenizer(value.toString());

      // running through the tokens
      while (itr.hasMoreTokens()) {
        // lowercasing the token and stripping every character that is not a letter, digit, or space
        word.set(itr.nextToken().toLowerCase().replaceAll("[^a-z 0-9A-Z]",""));
        // emitting the key-value pair
        context.write(word, one);
      }
    }
  }

  // defining the class IntSumReducer
  // this class is in charge of the reducing process
  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    // it extends the class Reducer from mapreduce api
    // it takes as input a Text (a word) and a list of integers (1s)
    // it outputs a Text (a word) and an integer (the frequency of the word)

    // initializing the frequency
    private IntWritable result = new IntWritable();

    // defining the function performed during reduce
    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      // initializing the sum
      int sum = 0;
      // running through the values associated to this key
      for (IntWritable val : values) {
        // incrementing the sum
        sum += val.get();
      }
      // attributing the sum to the value to emit
      result.set(sum);
      // emitting the key-value pair
      context.write(key, result);
    }
  }

  // defining the main class containing the parameters of the job
  public static void main(String[] args) throws Exception {
    // initializing configuration
    Configuration conf = new Configuration();
    // initializing job
    Job job = Job.getInstance(conf, "word count");
    // providing job with the classes for mapper and reducer
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class); // mapper
    job.setCombinerClass(IntSumReducer.class); // combiner
    job.setReducerClass(IntSumReducer.class); // reducer
    // providing job with the output classes
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // arguments to interpret
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // completion of the job
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
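
As an aside, the WARN line "Hadoop command-line option parsing not performed" in the log above refers to this driver. A minimal sketch of the same driver rewritten with the Tool interface and ToolRunner, as that message suggests, might look like the following (the class name WordCountTool is hypothetical; it reuses TokenizerMapper and IntSumReducer from above and needs the same imports plus org.apache.hadoop.conf.Configured, org.apache.hadoop.util.Tool and org.apache.hadoop.util.ToolRunner):

// hypothetical ToolRunner-based driver; the mapper and reducer are unchanged
public class WordCountTool extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() returns the configuration already populated by ToolRunner
    Job job = Job.getInstance(getConf(), "word count");
    job.setJarByClass(WordCountTool.class);
    job.setMapperClass(WordCount.TokenizerMapper.class);
    job.setCombinerClass(WordCount.IntSumReducer.class);
    job.setReducerClass(WordCount.IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner strips generic options such as -D key=value before calling run()
    System.exit(ToolRunner.run(new Configuration(), new WordCountTool(), args));
  }
}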

ozxc1zmp1#

To know whether it is working:
open "The url to track the job" from the log in a browser, then look at the application's *actual* logs in the YARN UI,
or use the yarn logs command in a separate terminal, with the application ID given in the lines above.
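
For example, in a separate terminal (using the application ID from the submission log above):

yarn logs -applicationId application_1676124615395_0004

Note that yarn logs usually only returns aggregated logs once the application has finished and log aggregation (yarn.log-aggregation-enable) is turned on; while the job is still running, the web UI at the tracking URL is the more reliable place to look.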
