[Personal notes] Problems encountered and solutions

Problem: pandas blows up memory when reading a very large file all at once
Solution: read the data in chunks, then concatenate the chunks
https://blog.csdn.net/weixin_39750084/article/details/81501395
'''
import pandas as pd

# path points to the large CSV file (assumed to be defined earlier)
# iterator=True returns a TextFileReader instead of loading everything
reader = pd.read_csv(path, sep=',', engine='python', iterator=True)
loop = True
chunk_size = 1000
chunks = []
index = 0

while loop:
    try:
        print(index)  # progress: which chunk we are on
        chunk = reader.get_chunk(chunk_size)
        chunks.append(chunk)
        index += 1
    except StopIteration:
        # get_chunk raises StopIteration once the file is exhausted
        loop = False
        print("Iteration is stopped.")

print('Start merging')
data = pd.concat(chunks, ignore_index=True)
'''
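
A shorter alternative is to pass chunksize directly to read_csv, which makes it return an iterator of DataFrames, so no manual while/try loop is needed. This is a minimal sketch; the file name large_file.csv and the chunk size of 1000 are placeholders, not values from the original post.

'''
import pandas as pd

chunks = []
# chunksize=1000 makes read_csv yield DataFrames of up to 1000 rows each,
# so the whole file never sits in memory at once while reading
for chunk in pd.read_csv('large_file.csv', sep=',', chunksize=1000):
    chunks.append(chunk)

data = pd.concat(chunks, ignore_index=True)
'''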
