pytorch: How to merge variables without using additional space?

Posted by ryevplcw on 2023-10-20 in Other


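I currently have two feature files, each 3 GB in size, and a machine with 8 GB of memory. I want to load the two files into memory one after the other and merge them into a single variable, and I would like to know how to do this without exceeding the available memory. I tried torch.cat, but it appears to allocate additional memory for the merged result, which causes an out-of-memory error.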

pytorch

Source: https://stackoverflow.com/questions/77044764/how-to-merge-variables-without-using-additional-space

1 answer


ezykj2lf1#

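Here is a high-level overview of the process:
1. Open and read the files sequentially: instead of loading both files into memory at once, read them one after the other.
2. Process in chunks: read each file in fixed-size pieces rather than all at once.
3. Close and release resources: close each file once it has been read so its handle and buffers are freed.
4. Repeat: keep reading, processing, and merging chunks until both files have been fully processed.
The following Python snippet illustrates this approach: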

merged_data = []

# Process the first file
with open('file1.txt', 'rb') as file1:
    while True:
        chunk = file1.read(1024 * 1024)  # Read 1 MiB at a time (adjust chunk size as needed)
        if not chunk:
            break  # End of file
        merged_data.append(chunk)

# Process the second file
with open('file2.txt', 'rb') as file2:
    while True:
        chunk = file2.read(1024 * 1024)  # Read 1 MiB at a time (adjust chunk size as needed)
        if not chunk:
            break  # End of file
        merged_data.append(chunk)

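# Merge the chunks into a single bytes object
# (join() allocates a new buffer, so the chunk list and the result coexist in memory until the list is released)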
final_data = b''.join(merged_data)

# Now 'final_data' contains the merged content of both files
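
Since the question is about staying within 8 GB, here is a minimal sketch of a variant that allocates the final buffer once and reads each chunk directly into it, so no list of chunks and no second copy is held. The helper name merge_files_into_buffer and the constant CHUNK_SIZE are made up for illustration, and the sketch assumes the files can be treated as raw bytes whose sizes are known up front.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; adjust as needed (hypothetical default)


def merge_files_into_buffer(paths, chunk_size=CHUNK_SIZE):
    """Read the files in `paths` back to back into one preallocated bytearray.

    Hypothetical helper: only the final buffer is allocated, and each read
    writes straight into its slice of that buffer.
    """
    sizes = [os.path.getsize(p) for p in paths]
    merged = bytearray(sum(sizes))  # single allocation of the final size
    view = memoryview(merged)       # slicing a memoryview does not copy
    offset = 0
    for path, size in zip(paths, sizes):
        end = offset + size
        with open(path, "rb") as f:
            while offset < end:
                # readinto() fills the target slice in place, no temporary chunk object
                n = f.readinto(view[offset:min(offset + chunk_size, end)])
                if not n:
                    raise EOFError(f"{path} ended before {size} bytes were read")
                offset += n
    return merged
```

Calling merge_files_into_buffer(['file1.txt', 'file2.txt']) returns a bytearray holding both files. If the files contain raw tensor data of a known dtype (an assumption, since the question does not state the file format), torch.frombuffer(merged, dtype=...) can wrap that buffer as a tensor that shares its memory instead of copying it.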

2023-10-20
