I have a problem when I try to index and mask my data.
I know this question has been asked before, but I can't seem to get mine to work.
train = pd.read_csv('/kaggle/input/twitter2/train.csv', lineterminator='\n')
test = pd.read_csv('/kaggle/input/twitter2/test.csv', lineterminator='\n')
train.head()
tweet gender
0 i need this for when my wife and i live in our... 1
1 why we never saw alfredhitchcock s bond and th... 0
2 oh my gosh the excitement of coming back from ... 1
3 because its monday and im missing him a little... 1
4 so to cwnetwork for having the current episode... 1
test.head()
tweet gender
0 the opposite of faith is not doubt its absolu... 1
1 wen yu really value somethingyu stay commited ... 1
2 today was such a bad day i wish i could text t... 1
3 so i took a nap amp had the weirdest dream lit... 0
4 savagejaspy i like the purple but you seem mor... 1
I have checked both DataFrames, and they are populated with the tweet and gender columns.
But when I try to run the code below:
def xlnet_encode(data, maximum_length):
    input_ids = []
    attention_masks = []
train_input_ids,train_attention_masks = xlnet_encode(train[:50000],60)
test_input_ids,test_attention_masks = xlnet_encode(test[:20000],60)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-8f651b283574> in <module>
----> 1 train_input_ids,train_attention_masks = xlnet_encode(tweet_train[:50000],60)
2 test_input_ids,test_attention_masks = xlnet_encode(tweet_test[:20000],60)
TypeError: cannot unpack non-iterable NoneType object
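The output of this code should be the index numbers of the words in the dataset, together with the attention masks.
Am I just missing something? How can I fix this?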
1 Answer
Never mind, I figured it out. I forgot to put in the XLNet tokenizer
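
For anyone hitting the same TypeError: a function that never reaches a `return` statement evaluates to `None`, and unpacking `None` into two names fails with exactly this message. Below is a minimal sketch of both the failure and one possible fix. It is not the original poster's final code. `toy_tokenizer` is a hypothetical stand-in so the example runs without downloading a model, and the `tokenizer` parameter is an assumption added for illustration. With Hugging Face `transformers` you would presumably wrap a real XLNet tokenizer in a function with the same signature.

```python
def broken_encode(data, maximum_length):
    # Mirrors the function in the question: the lists are created but
    # never filled or returned, so the call evaluates to None.
    input_ids = []
    attention_masks = []


def toy_tokenizer(text, max_length):
    """Hypothetical stand-in for a real tokenizer (illustration only)."""
    tokens = text.split()[:max_length]        # truncate to max_length
    ids = [len(tok) for tok in tokens]        # fake ids: the token length
    pad = max_length - len(ids)               # pad up to max_length
    return {
        "input_ids": ids + [0] * pad,
        "attention_mask": [1] * len(ids) + [0] * pad,
    }


def xlnet_encode(data, maximum_length, tokenizer=toy_tokenizer):
    input_ids = []
    attention_masks = []
    # data["tweet"] works for a pandas DataFrame and for a plain dict.
    for text in data["tweet"]:
        encoded = tokenizer(text, max_length=maximum_length)
        input_ids.append(encoded["input_ids"])
        attention_masks.append(encoded["attention_mask"])
    # Returning a 2-tuple is what makes the unpacking at the call site work.
    return input_ids, attention_masks


demo = {
    "tweet": ["so i took a nap", "today was such a bad day"],
    "gender": [0, 1],
}

error_message = None
try:
    ids, masks = broken_encode(demo, 8)       # unpacking None raises TypeError
except TypeError as err:
    error_message = str(err)

ids, masks = xlnet_encode(demo, 8)            # works: two lists come back
```

The key points are that the loop actually calls a tokenizer on each tweet and that the function ends with `return input_ids, attention_masks`. Whether the original function was missing the tokenizer call, the return, or both is not visible in the snippet posted above.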