TypeError: 'JavaPackage' object is not callable on Google Colab

k7fdbhmy  posted on 2021-07-09  in  Spark

This question already has an answer here:

Spark NLP 'JavaPackage' object is not callable (1 answer)
Closed last month.
I am learning Apache Spark, and I ran the code below on Google Colab.

  # installed based upon https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/quick_start_google_colab.ipynb#scrollTo=lNu3meQKEXdu
  import os
  # Install java
  !apt-get install -y openjdk-8-jdk-headless -qq > /dev/null
  !wget -q "https://downloads.apache.org/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz" > /dev/null
  !tar -xvf spark-3.1.1-bin-hadoop2.7.tgz > /dev/null
  !pip install -q findspark
  os.environ["SPARK_HOME"] = "/content/spark-3.1.1-bin-hadoop2.7"
  os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
  os.environ["PATH"] = os.environ["JAVA_HOME"] + "/bin:" + os.environ["PATH"]
  ! java -version
  # Install spark-nlp and pyspark
  ! pip install spark-nlp==3.0.0 pyspark==3.1.1
  import sparknlp
  spark = sparknlp.start()
  from sparknlp.base import DocumentAssembler
  documentAssembler = DocumentAssembler().setInputCol(text_col).setOutputCol('document')

I get the error below. How can I fix it?

  ---------------------------------------------------------------------------
  TypeError                                 Traceback (most recent call last)
  <ipython-input-48-535b177b526b> in <module>()
        4
        5 from sparknlp.base import DocumentAssembler
  ----> 6 documentAssembler = DocumentAssembler().setInputCol(text_col).setOutputCol('document')
  4 frames
  /usr/local/lib/python3.7/dist-packages/pyspark/ml/wrapper.py in _new_java_obj(java_class, *args)
       64             java_obj = getattr(java_obj, name)
       65         java_args = [_py2java(sc, arg) for arg in args]
  ---> 66         return java_obj(*java_args)
       67
       68     @staticmethod
  TypeError: 'JavaPackage' object is not callable
np8igboo1#

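As I mentioned in my last comment:
Replace text_col with the name of the text column in your Spark DataFrame; the document is created from the column with that name. You can also add .setCleanupMode("clean_mode"). For details, see the following link: https://spark.apache.org/docs/latest/ml-features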

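  from sparknlp.base import DocumentAssembler
  # Instantiate the class with () before chaining setters; "text_col" must name a DataFrame column
  documentAssembler = DocumentAssembler() \
      .setInputCol("text_col") \
      .setOutputCol("document")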
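
To make the suggestion above concrete, here is a minimal sketch of how it might look end to end. It is not a confirmed fix for the question. The helper names `resolve_input_col`, `jvm_has_class` and `run_example`, the sample sentence, and the column name `text_col` are all assumptions made for illustration. The JVM check relies on py4j returning a `JavaPackage` placeholder when it cannot find a class on the classpath, which is one common reason for this TypeError (for example when the Spark NLP jar was not loaded into the running session).

```python
def resolve_input_col(df_columns, wanted):
    """Return `wanted` if it names a DataFrame column, else raise a clear error.

    Hypothetical helper: it only guards against passing a name that is not
    an actual column of the DataFrame.
    """
    if wanted not in df_columns:
        raise ValueError(
            "column %r not found; available columns: %s" % (wanted, list(df_columns))
        )
    return wanted


def jvm_has_class(jvm_member):
    """Return True if py4j resolved `jvm_member` to a real Java class.

    py4j yields a JavaClass for a class it can find and a JavaPackage
    placeholder otherwise; calling the placeholder raises
    "'JavaPackage' object is not callable".
    """
    return type(jvm_member).__name__ == "JavaClass"


def run_example():
    """Assumed usage in a Colab cell; needs spark-nlp and pyspark installed."""
    import sparknlp
    from sparknlp.base import DocumentAssembler

    spark = sparknlp.start()

    # Check that the Spark NLP jar is visible to the JVM before using it.
    if not jvm_has_class(spark._jvm.com.johnsnowlabs.nlp.DocumentAssembler):
        raise RuntimeError(
            "Spark NLP classes are not on the JVM classpath; restart the "
            "runtime and create the session with sparknlp.start() first."
        )

    # Sample data; "text_col" is an illustrative column name.
    df = spark.createDataFrame([("Spark NLP runs on Apache Spark.",)], ["text_col"])

    assembler = (
        DocumentAssembler()
        .setInputCol(resolve_input_col(df.columns, "text_col"))
        .setOutputCol("document")
    )
    assembler.transform(df).select("document.result").show(truncate=False)
```

Calling `run_example()` in a fresh runtime is the intended use. If the `RuntimeError` fires, the problem is likely the session or jar setup rather than the column name.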
