emr spark java应用程序失败

up9lanfz  于 2021-05-29  发布在  Hadoop
关注(0)|答案(0)|浏览(233)

我写了一个小程序

public class EMRSpark {
    public static void main(String[] args) throws IOException {
        String logFile = "s3n://aBucket/aFolder/*.json"; //EMR cluster has access to bucket
        SparkConf conf = new SparkConf().setAppName("EMRSpark");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("b"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
    }
}

pom.xml文件

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.foo</groupId>
    <artifactId>emrspark</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>emrspark</name>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.1</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>emrspark</finalName>
        <plugins>
            <!-- download source code in Eclipse, best practice -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-eclipse-plugin</artifactId>
                <version>2.9</version>
                <configuration>
                    <downloadSources>true</downloadSources>
                    <downloadJavadocs>false</downloadJavadocs>
                </configuration>
            </plugin>
            <!-- Set a compiler level -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <!-- Maven Assembly Plugin -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <!-- get all project dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <!-- MainClass in mainfest make a executable jar -->
                    <archive>
                        <manifest>
                            <mainClass>com.foo.emrspark.EMRSpark</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

我正在努力让它运行在电子病历。我试过(用瘦jar和胖jar)

$ aws emr put --cluster-id anId --key-pair-file bla.pem --src /home/fred-fri/git/emrspark/target/emrspark.jar
$ aws emr add-steps --cluster-id anId --steps Type=Spark,Name="EMRSpark",Args=[--class,com.foo.emrspark.EMRSpark,--deploy-mode,cluster,--master,yarn,/home/hadoop/emrspark.jar]

步骤已创建并运行,但失败。我也尝试过通过awswebgui来实现它们,结果是一样的。
如何让应用程序在emr中运行?

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题