我正在尝试通过h2o(使用库)使用一些机器学习功能 rsparkling
)在 sparklyr
会议。我正在运行hadoop集群。
考虑以下示例:
library(dplyr)
library(sparklyr)
library(rsparkling)
library(h2o)
# configure the spark session and connect
sc = spark_connect(master = 'yarn-client',
spark_home = '/usr/hdp/current/spark-client',
app_name = 'sparklyr',
config = list(
"sparklyr.shell.executor-memory" = "1G",
"sparklyr.shell.driver-memory" = "4G",
"spark.driver.maxResultSize" = "2G" # may need to transfer a lot of data into R
)
)
mtcars_tbl <- copy_to(sc, mtcars, "mtcars")
mtcars_hf <- as_h2o_frame(sc=sc,x=mtcars_tbl,name='h_cars')
我得到以下错误:
Error: java.lang.IllegalArgumentException: Unsupported argument: (spark.dynamicAllocation.enabled,true)
at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:48)
at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:40)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.h2o.backends.internal.InternalBackendUtils$class.checkUnsupportedSparkOptions(InternalBackendUtils.scala:40)
at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkUnsupportedSparkOptions(InternalH2OBackend.scala:31)
at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkAndUpdateConf(InternalH2OBackend.scala:61)
at org.apache.spark.h2o.H2OContext.<init>(H2OContext.scala:96)
at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:294)
at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:316)
at org.apache.spark.h2o.H2OContext.getOrCreate(H2OContext.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sparklyr.Invoke$.invoke(invoke.scala:94)
at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
at sparklyr.StreamHandler$.read(stream.scala:55)
at sparklyr.BackendHandler.channelRead0(handler.scala:49)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
知道怎么回事吗?
2条答案
按热度按时间wbgh16ku1#
目前,sparkling water/rsparkling不支持动态spark cluster。所以你只需要禁用它:
config = list("spark.dynamicAllocation.enabled" = "false")
acruukt92#
对于python用户: