The image above shows the code I want to run. The dataset is a TXT file, and I want to count the number of occurrences of "CAG" in that file. I keep getting this error. Please help me find a way to fix it.
I have tried both the Java and Python versions, following some YouTube videos and a few commands along the way.
1 Answer
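The createDataFrame method expects an RDD of row-like records (tuples or Row objects), but textFile returns an RDD of plain strings. Before creating the DataFrame, you need to convert each line into a tuple or struct.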
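The question's code is only available as an image, so the following is a minimal sketch of the conversion described above rather than a fix to the original script. The function names, the column names `line` and `n`, and the file path are all hypothetical. It maps each line to a `(line, count)` tuple before calling `createDataFrame`, then sums the per-line counts.

```python
def count_motif(line, motif="CAG"):
    """Count non-overlapping occurrences of `motif` in one line of text."""
    return line.count(motif)


def count_motif_in_file(path, motif="CAG"):
    """Sum the per-line motif counts of a text file using Spark."""
    # Imported here so count_motif can be used without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("motif-count").getOrCreate()

    # textFile yields an RDD of plain strings, one per line.
    lines = spark.sparkContext.textFile(path)

    # Wrap each string in a tuple so createDataFrame can build rows from it.
    rows = lines.map(lambda line: (line, count_motif(line, motif)))

    # An explicit schema avoids schema inference on the RDD.
    df = spark.createDataFrame(rows, schema="line string, n int")

    total = df.agg(F.sum("n")).first()[0]
    return total or 0  # F.sum returns None when there are no rows


if __name__ == "__main__":
    # "dataset.txt" is a placeholder path, not the asker's actual file.
    print(count_motif_in_file("dataset.txt"))
```

Because the count is taken line by line, an occurrence of "CAG" that is split across a line break in the file would not be counted. If the sequence is wrapped over several lines, the lines may need to be joined first.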