No error message, but MapReduce doesn't start (Maven?)

Posted by myss37ts on 2021-06-03 in Hadoop

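I'm trying to run a Pig script through PigServer because I need "while" and "if" in my script, and Java helps with that.
The problem is that my main runs but nothing happens (apart from my System.out.println output), and I don't know why MapReduce doesn't start. The program ends without any error.
I think the problem is in my pom: I suspect I haven't included all the required dependencies.
Here is my pom.xml: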

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.blablabla</groupId>
<artifactId>testPigServer</artifactId>
<version>0.0.1-SNAPSHOT</version>

<dependencies>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.2.0</version>
    </dependency>

    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.16</version>
    </dependency>

    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pig</artifactId>
        <version>0.12.1</version>
    </dependency>

    <dependency>
        <groupId>org.antlr</groupId>
        <artifactId>antlr-runtime</artifactId>
        <version>3.4</version>
    </dependency>

</dependencies>

</project>

Here is my main class:

import java.io.IOException;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;

public class MainPigServer {
/**
 * @param args
 * @throws IOException
 * @throws ExecException
 */
public static void main(String[] args) throws ExecException, IOException {

    System.out.println("Hello");
    PigServer pigServer = new PigServer(ExecType.LOCAL);
    try {

        String inputFile = "/home/cloudera/jeuxEtudiants/data/parents.csv";
        String outPut = "/home/cloudera/jeuxEtudiants/resultat_PigServer_9";
        queryCSV(pigServer, inputFile, outPut);
        // queryJson(pigServer, inputFile,inputRef, outPut);
    } catch (ExecException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    finally{
        pigServer.shutdown();
        System.out.println("Finally");
    }
}

public static void queryCSV(PigServer pigServer, String inputFile, String outPut) throws IOException {
    System.out.println("dans queryCSV");
    pigServer.registerQuery("donnees_fait = LOAD '" + inputFile + "' USING PigStorage(';') ;");
    pigServer.registerQuery("donnees_group = GROUP donnees_fait by $0 ;");
    pigServer.store("donnees_group", outPut, "PigStorage('|')");
    System.out.println("fin queryCSV");
}

public static void queryJson(PigServer pigServer, String inputFile, String inputRef, String outPut) {
    System.out.println("dans queryJson");
    try {
        pigServer.registerQuery("donnees_fait = LOAD '" + inputFile + "' USING PigStorage(';') AS(id,nom,prenom);");
        pigServer.registerQuery("ligne_finale = FOREACH donnees_fait GENERATE id AS Description, (nom,prenom) AS Test:(nom,prenom);");
        pigServer.store("ligne_finale", outPut, "JsonStorage");
    } catch (IOException e) {
        e.printStackTrace();
    }
}

}
When I run main, I get:

Hello
log4j:WARN No appenders could be found for logger (org.apache.pig.impl.util.PropertiesUtil).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
dans queryCSV
fin queryCSV
Finally

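I don't understand what is happening.
What's more, I tried running the script from the Grunt shell, and there it works.
Thanks for reading.
Angélique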
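
A note on the log4j:WARN lines in the output: log4j 1.2 found no configuration, so anything Pig logs, errors included, is dropped. That may be why no error message shows up. Below is a minimal sketch of a log4j.properties. It assumes the file is placed in src/main/resources (the standard Maven resources directory) so that it ends up on the classpath. The appender name `console` and the pattern are arbitrary choices, not something from the original post.

```properties
# Send everything at INFO and above to the console
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# timestamp, level, logger name, message
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
```

With an appender configured, whatever Pig reports while executing the store should appear next to the System.out.println lines.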

Answer #1, by fykwrbwg

In the end, I found a solution.
In settings.xml (found at ???/.m2/settings.xml; you may need to create it), put:

<?xml version="1.0" encoding="UTF-8"?>
<settings>
    <profiles>
        <profile>
            <id>standard-extra-repos</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <repositories>
                <repository>
                    <!-- Central Repository -->
                    <id>central</id>
                    <url>https://repo1.maven.org/maven2/</url>
                    <releases>
                        <enabled>true</enabled>
                    </releases>
                    <snapshots>
                        <enabled>true</enabled>
                    </snapshots>
                </repository>
                <repository>
                    <!-- Cloudera Repository -->
                    <id>cloudera</id>
                    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
                    <releases>
                        <enabled>true</enabled>
                    </releases>
                    <snapshots>
                        <enabled>true</enabled>
                    </snapshots>
                </repository>
            </repositories>
        </profile>
    </profiles>
</settings>

And in the pom:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.businessdecision</groupId>
<artifactId>testPigServer</artifactId>
<version>0.0.1-SNAPSHOT</version>

<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.0.0-cdh4.5.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>2.0.0-mr1-cdh4.5.0</version>
    </dependency>
    <dependency>
        <groupId>joda-time</groupId>
        <artifactId>joda-time</artifactId>
        <version>2.3</version>
    </dependency>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.17</version>
    </dependency>
    <dependency>
        <groupId>jline</groupId>
        <artifactId>jline</artifactId>
        <version>0.9.5</version>
    </dependency>
    <dependency>
        <groupId>org.antlr</groupId>
        <artifactId>antlr-runtime</artifactId>
        <version>3.5.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pig</artifactId>
        <version>0.11.0-cdh4.5.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pigunit</artifactId>
        <version>0.11.0-cdh4.5.0</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
        <scope>test</scope>
    </dependency>
</dependencies>

</project>

I still don't know what the real problem was, but it works now. I was probably missing some dependencies.
Angélique
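
If a missing dependency is the suspicion, one quick way to test it is to ask the JVM which classes it can actually find at runtime. The sketch below is only an illustration: `ClasspathCheck` and `missingClasses` are hypothetical names, and the list of classes is an assumption, one representative class per artifact in the pom above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClasspathCheck {

    /** Returns the names from the given list that cannot be loaded from the classpath. */
    static List<String> missingClasses(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                // Not found, or found but not loadable; report it either way
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed sample: one well-known class per dependency declared in the pom
        List<String> expected = Arrays.asList(
                "org.apache.pig.PigServer",
                "org.apache.hadoop.conf.Configuration",
                "org.apache.log4j.Logger",
                "org.antlr.runtime.RecognitionException",
                "org.joda.time.DateTime");
        List<String> missing = missingClasses(expected);
        System.out.println(missing.isEmpty()
                ? "All expected classes found"
                : "Missing from classpath: " + missing);
    }
}
```

Run it with the same classpath as MainPigServer. Any name it reports points to an artifact that did not make it onto the classpath.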
