Error when running the WordCount example pipeline on Dataflow from Eclipse

nx7onnlm  posted on 2022-10-15  in Eclipse

When I try to run the WordCount example pipeline on Dataflow from the Eclipse IDE, I get the following error:

Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
    at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create(Pipeline.java:150)
    at com.google.cloud.dataflow.examples.WordCount.main(WordCount.java:178)

Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
    ... 4 more

Caused by: java.lang.IllegalArgumentException: Missing object or bucket in path: 'gs://mysite-ga-datastreaming-196008-my-bucket/', did you mean: 'gs://some-bucket/mysite-ga-datastreaming-196008-my-bucket'?
    at org.apache.beam.sdks.java.extensions.google.cloud.platform.core.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:383)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPath(GcsPathValidator.java:77)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported(GcsPathValidator.java:60)
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:246)
    ... 9 more

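Some people think this is a Java version problem, since Beam apparently does not work well on Java 9. In any case, I am still on Java 8. Others say the error occurs because you must provide a sub-folder under your bucket as the storage location. I tried that, but it still does not work.
If anyone has run into the same problem before, or can offer any advice on the error, I would greatly appreciate it.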
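
For reference, the innermost exception in the trace above is raised from GcsPathValidator.verifyPath and complains that 'gs://mysite-ga-datastreaming-196008-my-bucket/' is missing an object or bucket: the path names a bucket but nothing under it. The following is a minimal stand-alone sketch of that kind of check, not Beam's actual implementation. The class name GcsPathCheck, the method hasBucketAndObject, and the tmp sub-folder name are hypothetical and used only for illustration.

```java
public class GcsPathCheck {

    /**
     * Returns true if the path has the form gs://<bucket>/<object>
     * with a non-empty bucket and a non-empty object part.
     * Hypothetical helper; it only inspects the string and never contacts GCS.
     */
    public static boolean hasBucketAndObject(String path) {
        final String scheme = "gs://";
        if (path == null || !path.startsWith(scheme)) {
            return false;
        }
        String rest = path.substring(scheme.length());
        int slash = rest.indexOf('/');
        if (slash <= 0) {
            // Either the bucket name is empty or there is no '/' after it.
            return false;
        }
        // Everything after the first '/' is treated as the object (folder) part.
        String object = rest.substring(slash + 1);
        return !object.isEmpty();
    }

    public static void main(String[] args) {
        // The path from the error message: bucket only, no object part.
        String bucketOnly = "gs://mysite-ga-datastreaming-196008-my-bucket/";
        // Same bucket with an assumed sub-folder name appended.
        String withFolder = bucketOnly + "tmp";

        System.out.println(bucketOnly + " -> " + hasBucketAndObject(bucketOnly));
        System.out.println(withFolder + " -> " + hasBucketAndObject(withFolder));
    }
}
```

If a check like this returns false for a storage location passed to the pipeline, appending a folder name after the bucket is the kind of change the error message hints at. The sketch does not verify that the bucket itself exists.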


f87krz0w  #1

Before using the pipeline, you need to create the bucket gs://mysite-ga-datastreaming-196008-my-bucket/ in Google Cloud Storage.


2ledvvac  #2

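Create the bucket mysite-ga-datastreaming-196008-my-bucket in your GCP project.
To create the bucket: go to the GCP console and select Storage, then Buckets. Click the Create bucket button. Enter the bucket name mysite-ga-datastreaming-196008-my-bucket. Click OK. Then run the command.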
