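python - Python, Pandas write to dataframe, lxml.etree.SerialisationError: IO_WRITE
Problem description
This is code to select the wanted rows from a dataframe. The original data is in Excel format, and I load it into a dataframe.
I want to select all rows whose "Test Date" falls in "201506" or "201508" and write them to an Excel file. The lines below work fine.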
import pandas as pd
data_short = {'Contract_type' : ["Other", "Other", "Type-I", "Type-I", "Type-I", "Type-II", "Type-II", "Type-III", "Type-III", "Part-time"],
'Test Date': ["20150816", "20150601", "20150204", "20150609", "20150204", "20150806", "20150201", "20150615", "20150822", "20150236" ],
'Test_time' : ["16:26", "07:39", "18:48", "22:32", "03:54", "03:30", "04:00", "22:02", "13:43", "10:29"],
}
df = pd.DataFrame(data_short)
data_201508 = df[df['Test Date'].astype(str).str.startswith('201508')]
data_201506 = df[df['Test Date'].astype(str).str.startswith('201506')]
data_68 = data_201506.append(data_201508)
writer = pd.ExcelWriter("C:\\test-output.xlsx", engine = 'openpyxl')
data_68.to_excel(writer, "Sheet1", index = False)
writer.save()
But when I apply them to a larger file, about 600,000 rows and 25 columns (file size 65 MB), it returns the following error message:
Traceback (most recent call last):
File "C:\Python27\Working Scripts\LL move pick wanted ATA in months.py", line 15, in <module>
writer.save()
File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 732, in save
return self.book.save(self.path)
File "C:\Python27\lib\site-packages\openpyxl\workbook\workbook.py", line 263, in save
save_workbook(self, filename)
File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 239, in save_workbook
writer.save(filename, as_template=as_template)
File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 222, in save
self.write_data(archive, as_template=as_template)
File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 80, in write_data
self._write_worksheets(archive)
File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 163, in _write_worksheets
xml = sheet._write(self.workbook.shared_strings)
File "C:\Python27\lib\site-packages\openpyxl\worksheet\worksheet.py", line 776, in _write
return write_worksheet(self, shared_strings)
File "C:\Python27\lib\site-packages\openpyxl\writer\worksheet.py", line 263, in write_worksheet
xf.write(worksheet.page_breaks.to_tree())
File "src/lxml/serializer.pxi", line 1016, in lxml.etree._FileWriterElement.__exit__ (src\lxml\lxml.etree.c:142025)
File "src/lxml/serializer.pxi", line 904, in lxml.etree._IncrementalFileWriter._write_end_element (src\lxml\lxml.etree.c:140218)
File "src/lxml/serializer.pxi", line 999, in lxml.etree._IncrementalFileWriter._handle_error (src\lxml\lxml.etree.c:141711)
File "src/lxml/serializer.pxi", line 195, in lxml.etree._raiseSerialisationError (src\lxml\lxml.etree.c:131087)
lxml.etree.SerialisationError: IO_WRITE
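Does this mean the computer is not powerful enough (8 GB, Win10)? Is there a way to optimize the code (for example, to use less memory)? Thank you.
By the way: this is similar to the question "I/O Error while saving Excel file - Python", but that one has no solution...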
Solution
Found a solution: write the output to CSV instead (it can be opened in Excel anyway).
data_68.to_csv("C:\\test-output.csv", index=False)
Posting here in case anyone runs into the same problem. Let me know if this question should be deleted. :)
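For anyone who wants the whole thing in one place, here is a minimal sketch of the CSV approach on a cut-down version of the sample data. The helper name `select_months` and the relative output path `test-output.csv` are made up for illustration; they are not from my original script. It picks both months with a single mask instead of building two frames and appending them.

```python
import pandas as pd


def select_months(df, prefixes, column="Test Date"):
    """Return the rows whose date string starts with one of the YYYYMM prefixes.

    `select_months` is a hypothetical helper, not part of pandas.
    """
    # The first six characters of a YYYYMMDD string are the year and month.
    month = df[column].astype(str).str[:6]
    return df[month.isin(list(prefixes))]


# Cut-down version of the sample data from the question.
df = pd.DataFrame({
    "Contract_type": ["Other", "Other", "Type-I", "Type-I"],
    "Test Date": ["20150816", "20150601", "20150204", "20150609"],
    "Test_time": ["16:26", "07:39", "18:48", "22:32"],
})

data_68 = select_months(df, ["201506", "201508"])

# Write CSV instead of xlsx; the path here is an assumption for the example.
data_68.to_csv("test-output.csv", index=False)
```

Note that the rows stay in their original order here, whereas the append version in the question puts all the 201506 rows before the 201508 rows.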