9 answers
5vf7fwbs1#
Hello! We have received your issue and will arrange for a technician to respond as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and any error messages. You can also consult the official API documentation, the FAQ, past GitHub issues, and the AI community for answers. Have a nice day!
kgsdhlau2#
Hello, could you provide a minimal reproducible example?
ar5n3qh53#
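Please paste the code and data as text so we can reproduce the problem; otherwise we have to retype the code by hand, which wastes a lot of time.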
gcuhipw94#
Sure, sorry about that. I have now copied the code as text, see below. I would appreciate any help in clearing this up. Thank you.
import paddle
from paddlenlp.data import Stack, Dict, Pad
from paddlenlp.datasets import MapDataset

batch_size = 2
data_list = []
data_path = "data.txt"
with open(data_path, encoding="utf-8") as f:
    for line in f.readlines():
        one_exapme = line.strip()
        data_list.append(eval(one_exapme))
data_list = MapDataset(data_list)
for idx in range(1):
    print(data_list[idx]['input_ids'])
    print(data_list[idx]['input_mask'])
    print(data_list[idx]['segment_ids'])
    print(data_list[idx]['start_positions'])
    print(data_list[idx]['end_positions'])
train_batch_sampler = paddle.io.DistributedBatchSampler(
    data_list, batch_size=batch_size, shuffle=True)
train_batchify_fn = lambda samples, fn=Dict({
    "input_ids": Pad(axis=0, pad_val=0),
    "input_mask": Pad(axis=0, pad_val=0),
    "segment_ids": Pad(axis=0, pad_val=0),
    "start_positions": Stack(dtype="int64"),
    "end_positions": Stack(dtype="int64")
}): fn(samples)
train_data_loader = paddle.io.DataLoader(
    dataset=data_list,
    batch_sampler=train_batch_sampler,
    collate_fn=train_batchify_fn,
    return_list=True)
for step, batch in enumerate(train_data_loader, start=1):
    input_ids, input_mask, segment_ids, start_positions, end_positions = batch
    print(input_ids)
    break
# Sample data is shown below
data.txt
{'input_ids':[1, 1034, 1189, 734, 2003, 241, 284, 131, 553],'input_mask':[1, 1, 1, 1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1, 1, 1, 1],'start_positions':5,'end_positions':7}
{'input_ids':[1, 1034, 1189, 734, 2003, 241, 284,],'input_mask':[1, 1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1, 1],'start_positions':2,'end_positions':4}
{'input_ids':[1, 1034, 1189, 734, 2003, 241, 284, 131],'input_mask':[1, 1, 1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1, 1, 1],'start_positions':4,'end_positions':6}
{'input_ids':[1, 1034, 1189, 734, 2003, 241],'input_mask':[1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1],'start_positions':2,'end_positions':3}
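For readers who want to check the expected batch contents without installing Paddle, here is a minimal pure-Python sketch of what the collate step above is meant to produce for the first two sample records. It assumes, as I understand PaddleNLP's behavior, that `Pad(axis=0, pad_val=0)` right-pads each sequence to the longest length in the batch and that `Stack` simply collects the scalar labels. The helpers `parse_line`, `pad_field`, and `stack_field` are hypothetical names made up for this illustration and are not part of any library. It also parses each line with `ast.literal_eval`, which only accepts Python literals, as a safer stand-in for `eval`.

```python
import ast

# The first two records from the data.txt sample above.
SAMPLE_LINES = [
    "{'input_ids':[1, 1034, 1189, 734, 2003, 241, 284, 131, 553],'input_mask':[1, 1, 1, 1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1, 1, 1, 1],'start_positions':5,'end_positions':7}",
    "{'input_ids':[1, 1034, 1189, 734, 2003, 241, 284,],'input_mask':[1, 1, 1, 1, 1, 1, 1],'segment_ids':[0, 0, 0, 1, 1, 1, 1],'start_positions':2,'end_positions':4}",
]


def parse_line(line):
    """Parse one dict literal (hypothetical helper); rejects anything that is not a literal."""
    return ast.literal_eval(line.strip())


def pad_field(samples, key, pad_val=0):
    """Right-pad the lists stored under `key` to the longest length in the batch (assumed Pad behavior)."""
    max_len = max(len(sample[key]) for sample in samples)
    return [sample[key] + [pad_val] * (max_len - len(sample[key])) for sample in samples]


def stack_field(samples, key):
    """Collect the scalar stored under `key` from every sample (assumed Stack behavior)."""
    return [sample[key] for sample in samples]


samples = [parse_line(line) for line in SAMPLE_LINES]
input_ids = pad_field(samples, "input_ids")
start_positions = stack_field(samples, "start_positions")

print(input_ids)
print(start_positions)
```

If the real `train_data_loader` yields a batch whose `input_ids` rows have equal length with zeros at the end of the shorter row, the collate function is behaving as this sketch assumes.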
zdwk9cvp5#
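Hello, could you provide a minimal reproducible example?
Could you please take some time to look into what is causing this? Thank you. Every time I run for step, batch in enumerate(train_data_loader): the program gets stuck as if in an infinite loop, and Ctrl+C cannot interrupt it.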
flvlnr446#
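Please paste the code and data as text so we can reproduce the problem; otherwise we have to retype the code by hand, which wastes a lot of time.
Could you please take some time to look into what is causing this? Thank you. Every time I run for step, batch in enumerate(train_data_loader): the program gets stuck as if in an infinite loop, and Ctrl+C cannot interrupt it.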
h43kikqp7#
Can this be reproduced reliably? I tested with the sample data you provided and it ran normally, and Ctrl+C terminated it immediately.
mzaanser8#
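Can this be reproduced reliably? I tested with the sample data you provided and it ran normally, and Ctrl+C terminated it immediately.
Thank you for taking the time to answer. I still have the same problem: when for step, batch in enumerate(train_data_loader): runs, the code after it never executes, and the program cannot be interrupted at all until, after some time, it is forcibly killed by the debugging machine. I don't know what is going on.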
lvmkulzt9#
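Can this be reproduced reliably? I tested with the sample data you provided and it ran normally, and Ctrl+C terminated it immediately.
Hello, and thank you all for taking the time to help. I tested this code again and it works now, but I still don't know why it could not be force-stopped before. The case above was another example of similar code that could not be force-stopped when run; perhaps something on my side triggered it.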
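The cause was never identified in this thread. If a similar hang comes back, one possible way to see where the process is stuck is Python's standard `faulthandler` module, which can dump every thread's stack after a timeout even when Ctrl+C has no effect. The sketch below is only a diagnostic aid under that assumption, not a fix that anyone in the thread confirmed; `iterate_with_watchdog` is a hypothetical helper name, and a plain list stands in for `train_data_loader` so the example runs without Paddle.

```python
import faulthandler
import itertools


def iterate_with_watchdog(iterable, timeout_s=30.0, max_items=1):
    """Pull up to `max_items` items from `iterable`.

    If that takes longer than `timeout_s` seconds, faulthandler writes the
    stack of every thread to stderr, showing where the program is blocked.
    """
    # Arm the watchdog; exit=False means the process keeps running after the dump.
    faulthandler.dump_traceback_later(timeout_s, exit=False)
    try:
        return list(itertools.islice(iterable, max_items))
    finally:
        # Disarm the watchdog once iteration finished (or raised).
        faulthandler.cancel_dump_traceback_later()


# Stand-in for train_data_loader so this sketch is self-contained.
fake_loader = [[1, 2], [3, 4]]
first_batches = iterate_with_watchdog(fake_loader, timeout_s=5.0)
print(first_batches)
```

In the original script the call would look like `iterate_with_watchdog(train_data_loader, timeout_s=30.0)`. A stack dump that always ends in the same frame would be a concrete detail to post back to the maintainers.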