Sqoop: NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()

k0pti3hp · posted 2021-05-27 in Hadoop

This question already has an answer here:

How to resolve the Guava dependency issue when submitting an uber JAR to Google Dataproc (1 answer)
Closed 12 months ago.

I am running Sqoop on Hadoop on Google Cloud Dataproc to access PostgreSQL via the Cloud SQL proxy, but I am getting a Java dependency error:

    INFO: First Cloud SQL connection, generating RSA key pair.
    Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
    Caused by: java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()Lcom/google/common/base/Stopwatch;
        at com.google.common.util.concurrent.RateLimiter$SleepingStopwatch$1.<init>(RateLimiter.java:414)
        at com.google.common.util.concurrent.RateLimiter$SleepingStopwatch.createFromSystemTimer(RateLimiter.java:413)
        at com.google.common.util.concurrent.RateLimiter.create(RateLimiter.java:127)
        at com.google.cloud.sql.core.CloudSqlInstance.<init>(CloudSqlInstance.java:73)
        at com.google.cloud.sql.core.CoreSocketFactory.lambda$createSslSocket$0(CoreSocketFactory.java:221)
        at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
        at com.google.cloud.sql.core.CoreSocketFactory.createSslSocket(CoreSocketFactory.java:220)
        at com.google.cloud.sql.core.CoreSocketFactory.connect(CoreSocketFactory.java:185)
        at com.google.cloud.sql.postgres.SocketFactory.createSocket(SocketFactory.java:71)
        at org.postgresql.core.PGStream.<init>(PGStream.java:67)
        at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:91)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:192)
        at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
        at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:211)
        at org.postgresql.Driver.makeConnection(Driver.java:458)
        at org.postgresql.Driver.connect(Driver.java:260)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:247)
        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904)
        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59)
        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
        at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246)
        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:327)
        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1872)
        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1671)
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
        ... 5 more
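
For context: Stopwatch.createStarted() was only added in Guava 15.0, while Hadoop 2.x bundles the much older guava-11.0.2, so the error means the Hadoop-provided Guava is being loaded instead of a newer one. A quick diagnostic sketch (not from the original post; the local jar path is an assumption) to check whether a given Guava jar contains the method:

    # List Stopwatch's methods as compiled into a specific Guava jar;
    # createStarted is missing from guava-11.0.2 (it first appeared in 15.0).
    javap -classpath guava-11.0.2.jar com.google.common.base.Stopwatch \
        | grep createStarted || echo "createStarted not found in this jar"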

This starts the cluster:

    gcloud dataproc clusters create ${CLUSTER_NAME} \
        --region=${CLUSTER_REGION} \
        --scopes=default,sql-admin \
        --initialization-actions=gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
        --properties=hive:hive.metastore.warehouse.dir=gs://$GCS_BUCKET/export \
        --metadata=enable-cloud-sql-hive-metastore=false \
        --metadata=additional-cloud-sql-instances=${PSQL_INSTANCE}=tcp:${PSQL_PORT}
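
If the proxy itself is in doubt, one can first confirm that the Cloud SQL proxy started by the initialization action is actually listening on the forwarded port. A hypothetical check (the master-node name follows Dataproc's ${CLUSTER_NAME}-m convention; CLUSTER_ZONE does not appear in the original post):

    # Probe the Cloud SQL proxy's local TCP port on the Dataproc master node.
    gcloud compute ssh ${CLUSTER_NAME}-m --zone=${CLUSTER_ZONE} \
        --command="nc -zv 127.0.0.1 ${PSQL_PORT}"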

Then the job is run with:

    #!/usr/bin/env bash
    export GCS_BUCKET="mybucket"
    export CLUSTER_NAME="mycluster"
    export CLUSTER_REGION="us-central1"
    export SOURCE_DB_NAME="mydb"
    export SOURCE_USER="myuser"
    export SOURCE_PASSWORD="****"
    export SOURCE_HOST="127.0.0.1"
    export SOURCE_PORT="5432"
    export SQOOP_JAR="gs://$GCS_BUCKET/sqoop-1.4.7.jar"
    export AVRO_JAR="gs://$GCS_BUCKET/avro-tools-1.9.1.jar"
    export GUAVA_JAR="gs://$GCS_BUCKET/guava-11.0.2.jar"
    export PSQL_JAR="gs://$GCS_BUCKET/postgresql-42.2.9.jar"
    export PSQL_FACTORY_JAR="gs://$GCS_BUCKET/postgres-socket-factory-1.0.15-jar-with-dependencies.jar"
    export INSTANCE_CONNECTION_NAME="myinstance:connection:name"
    export CONNECTION_STRING="jdbc:postgresql:///${SOURCE_DB_NAME}?cloudSqlInstance=${INSTANCE_CONNECTION_NAME}&socketFactory=com.google.cloud.sql.postgres.SocketFactory&user=${SOURCE_USER}&password=${SOURCE_PASSWORD}"
    gcloud dataproc jobs submit hadoop \
        --cluster=$CLUSTER_NAME \
        --class=org.apache.sqoop.Sqoop \
        --jars=$GUAVA_JAR,$SQOOP_JAR,$PSQL_FACTORY_JAR,$AVRO_JAR,$PSQL_JAR \
        --region=$CLUSTER_REGION \
        -- import -Dmapreduce.job.user.classpath.first=true \
        --connect="${CONNECTION_STRING}" \
        --username=${SOURCE_USER} \
        --password="${SOURCE_PASSWORD}" \
        --target-dir=gs://$GCS_BUCKET/export \
        --table=insight_actions \
        --as-avrodatafile

I have tried putting different versions of GUAVA_JAR on the path, thinking that might be the cause, but I could not get rid of the error: guava-11.0.2.jar, guava-16.0.jar, guava-18.0.jar, guava-23.0.jar, guava-28.2-jre.jar. gcloud beta dataproc jobs describe ... tells me the Dataproc image is https://www.googleapis.com/compute/v1/projects/cloud-dataproc/global/images/dataproc-1-3-deb9-20191216-000000-rc01
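
Swapping Guava versions in --jars cannot fix this, because the driver JVM already sees Hadoop's own, older Guava earlier on its classpath, and the first Stopwatch class found wins. A diagnostic sketch, assuming shell access to a cluster node (not part of the original post):

    # Show every Guava jar Hadoop puts on the classpath; the first match is
    # the one the JVM will load com.google.common.base.Stopwatch from.
    hadoop classpath | tr ':' '\n' | grep -i guava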

kyvafyod (answer 1):

After further research I found that Hadoop 2.x overrides the classpath, so the solution is to build an uber JAR and pass that to Hadoop instead.
I also switched to the Sqoop JAR with the hadoop260 classifier.
So I created the following pom.xml and ran mvn package on it to produce the uber JAR:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- SEE: https://cloud.google.com/blog/products/data-analytics/managing-java-dependencies-apache-spark-applications-cloud-dataproc -->
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
      </properties>
      <groupId>org.example.exporter</groupId>
      <artifactId>example-exporter-postgresql</artifactId>
      <version>0.0.1</version>
      <!-- YOUR_DEPENDENCIES -->
      <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.sqoop/sqoop -->
        <dependency>
          <groupId>org.apache.sqoop</groupId>
          <artifactId>sqoop</artifactId>
          <version>1.4.7</version>
          <classifier>hadoop260</classifier>
        </dependency>
        <!-- https://mvnrepository.com/artifact/postgresql/postgresql -->
        <dependency>
          <groupId>org.postgresql</groupId>
          <artifactId>postgresql</artifactId>
          <version>42.2.9</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.google.cloud.sql/postgres-socket-factory -->
        <dependency>
          <groupId>com.google.cloud.sql</groupId>
          <artifactId>postgres-socket-factory</artifactId>
          <version>1.0.15</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.avro/avro-tools -->
        <dependency>
          <groupId>org.apache.avro</groupId>
          <artifactId>avro-tools</artifactId>
          <version>1.9.1</version>
        </dependency>
      </dependencies>
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <executions>
              <execution>
                <phase>package</phase>
                <goals>
                  <goal>shade</goal>
                </goals>
                <configuration>
                  <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                      <mainClass>org.apache.sqoop.Sqoop</mainClass>
                    </transformer>
                  </transformers>
                  <filters>
                    <filter>
                      <artifact>*:*</artifact>
                      <excludes>
                        <exclude>META-INF/maven/**</exclude>
                        <exclude>META-INF/*.SF</exclude>
                        <exclude>META-INF/*.DSA</exclude>
                        <exclude>META-INF/*.RSA</exclude>
                      </excludes>
                    </filter>
                  </filters>
                  <relocations>
                    <relocation>
                      <pattern>com</pattern>
                      <shadedPattern>repackaged.com.google.common</shadedPattern>
                      <includes>
                        <include>com.google.common.**</include>
                      </includes>
                    </relocation>
                  </relocations>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
    </project>
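
The key piece is the <relocations> section: the shade plugin rewrites the bundled com.google.common classes into a repackaged.* namespace, so the uber JAR's Guava can no longer collide with the Guava that Hadoop puts on the classpath. The answer does not show the final submit command; a minimal sketch, reusing the variables from the question and assuming the shaded JAR is uploaded to the bucket first:

    # Upload the shaded JAR produced by mvn package.
    gsutil cp target/example-exporter-postgresql-0.0.1.jar gs://$GCS_BUCKET/

    # Submit the uber JAR as the main jar; its manifest already names
    # org.apache.sqoop.Sqoop as the main class, so --class is not needed.
    gcloud dataproc jobs submit hadoop \
        --cluster=$CLUSTER_NAME \
        --region=$CLUSTER_REGION \
        --jar=gs://$GCS_BUCKET/example-exporter-postgresql-0.0.1.jar \
        -- import \
        --connect="${CONNECTION_STRING}" \
        --username=${SOURCE_USER} \
        --password="${SOURCE_PASSWORD}" \
        --target-dir=gs://$GCS_BUCKET/export \
        --table=insight_actions \
        --as-avrodatafile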