Accessing SQL Server from Hadoop

5w9g7ksd  posted on 2021-06-03  in  Hadoop

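What I am trying to do: the program is designed to load regional sales data from SQL Server 2008 and run a simple statistical calculation on MapReduce to get the total sales for each region. The error I get says the program cannot find the SQL Server JDBC driver class that ships in sqljdbc4.jar, even though that file was copied to the location specified in the code.
The code looks like this: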

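//fileName: MRExp.java
import java.io.IOException;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBInputFormat;
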
public class MRExp {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(MRExp.class);
        DistributedCache.addFileToClassPath(new Path("/userX/sqljdbc4.jar"), conf);

        conf.setMapperClass(MRMapper.class);
        conf.setReducerClass(MRReducer.class);

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(LongWritable.class);

        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        conf.setInputFormat(DBInputFormat.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[0]));

        DBConfiguration
                .configureDB(
                        conf,
                        "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                        "jdbc:sqlserver://MyDbServerAddr:1433;databaseName=ThisDb;integratedSecurity=true;",
                        "db_userName", "db_Pws");

        DBInputFormat
                .setInput(conf, InfoUnit.class,
                        "SELECT R_NAME,L_ORDERKEY from dbo.United10MB ;"/* inputQuery */
                        , "SELECT COUNT(L_ORDERKEY) from dbo.United10MB"/* inputCountQuery */);

        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// followed by the definitions of MRMapper, MRReducer and InfoUnit. InfoUnit implements Writable and DBWritable.
File locations:
[root@test MRExp]# pwd
/root/MRExp
[root@test MRExp]# ls
classes  hadoop-0.20.2-core.jar  MRExp.java  sqljdbc4.jar
Then, to compile MRExp.java:
[root@test MRExp]# javac -classpath hadoop-0.20.2-core.jar -d classes/ MRExp.java
[root@test MRExp]# jar -cvf MRExp.jar -C classes/ .
Also copy sqljdbc4.jar to HDFS:

[root@test MRExp]#  hadoop dfs -copyFromLocal sqljdbc4.jar /userX

So we get:

[root@test MRExp]# ls  
classes  hadoop-0.20.2-core.jar  MRExp.jar  MRExp.java  sqljdbc4.jar

After all of the above, start the MapReduce job:
[root@test MRExp]# hadoop jar MRExp.jar mrexp.MRExp /userX/output
But the program reports:

17:02:50 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/10/28 17:02:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1350984913454_0009
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
        at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:165)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
        at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:607)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:476)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:468)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:359)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:880)
        at mrexp.MRExp.main(MRExp.java:70)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
        at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:191)
        at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:159)
        ... 25 more
Caused by: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:169)
        at org.apache.hadoop.mapreduce.lib.db.DBConfiguration.getConnection(DBConfiguration.java:148)
        at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:185)
        ... 26 more
eh57zj3b1#

Include sqljdbc4.jar in the "-libjars" command-line option of the hadoop jar … command.
For more information, read this Cloudera article.
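Before passing the jar on the command line, it can help to confirm that the local sqljdbc4.jar really contains the driver class named in configureDB. The following is a minimal sketch using only the JDK; the class name JarClassCheck and the method jarContainsClass are hypothetical names chosen for illustration, not part of Hadoop or of the original program.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Hadoop): checks whether a jar file
// contains the .class entry for a given fully qualified class name.
public class JarClassCheck {

    public static boolean jarContainsClass(String jarPath, String className) throws IOException {
        // com.example.Foo is stored in a jar as com/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path assumes the jar sits in the current directory, as in the question.
        String jarPath = args.length > 0 ? args[0] : "sqljdbc4.jar";
        String driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        boolean found = jarContainsClass(jarPath, driver);
        System.out.println(driver + (found ? " found in " : " NOT found in ") + jarPath);
    }
}
```

If the class is reported as missing, the jar itself is the problem (wrong file or wrong driver version) rather than the way it is handed to Hadoop.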
Update:
Run the following:

[root@test MRExp]#  hadoop dfs -ls /userX

Copy the absolute path of sqljdbc4.jar in the file system and put it in the following line:

DistributedCache.addFileToClassPath(new Path("<Absolute Path>/sqljdbc4.jar"), conf);

This should resolve the problem.
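The stack trace in the question passes through JobSubmitter.writeSplits and RunJar.main, which suggests the ClassNotFoundException is raised in the JVM that submits the job, at the Class.forName call in DBConfiguration.getConnection. If the error persists, a quick check that the driver class is visible to that JVM may narrow things down. Below is a minimal sketch using only the JDK; DriverVisibilityCheck and isClassVisible are hypothetical names for illustration.

```java
// Hypothetical diagnostic (not part of Hadoop): reports whether a class
// can be loaded by the JVM that runs this code.
public class DriverVisibilityCheck {

    public static boolean isClassVisible(String className) {
        try {
            // Same style of lookup that appears in the stack trace (Class.forName).
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = args.length > 0
                ? args[0]
                : "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        if (isClassVisible(driver)) {
            System.out.println("OK: " + driver + " is on the classpath of this JVM");
        } else {
            System.err.println("Missing: " + driver + " is not on the classpath of this JVM");
        }
    }
}
```

One way to use it is to call isClassVisible at the top of MRExp.main, before JobClient.runJob, so a missing driver is reported with a clear message instead of a nested RuntimeException. If it reports the class as missing there, the jar is not on the submitting JVM's classpath; extending that classpath through the HADOOP_CLASSPATH environment variable is a common approach, though whether it applies depends on your installation.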
