Problem saving an .xlsm (Excel) file with a pandas DataFrame in Python

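Problem description

I have two macro-enabled Excel files (.xlsm) that contain different information. I wrote Python code that checks for certain fields: if the fields are present, the file should be saved to one folder; otherwise it should be saved to a different folder. I don't want any information removed from the Excel file. I only want the original file saved to one folder or the other, depending on whether the fields exist. The code runs without raising any errors, but when I open the saved file, Excel shows an error (screenshot attached).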


For testing, the input file is attached here.

from pathlib import Path
import time
import argparse
import pandas as pd
import os
import warnings

warnings.filterwarnings("ignore")

parser = argparse.ArgumentParser(description="Process some integers.")

parser.add_argument("path", help="define the directory to folder/file")
parser.add_argument("--verbose", help="display processing information")

start = time.time()


def main(path_xlsm, verbose):
    if (".xlsm" in str(path_xlsm).lower()) and path_xlsm.is_file():
        xlsm_files = [Path(path_xlsm)]
    else:
        xlsm_files = list(Path(path_xlsm).glob("*.xlsm"))

    df = pd.DataFrame()
    
    for fn in xlsm_files:
        all_dfs = pd.read_excel(fn, sheet_name=None, header=None, engine="openpyxl")
        print(all_dfs)
        list_data = all_dfs.keys()
        all_dfs.pop("Lookups", None)
        all_dfs.pop("Instructions For Use", None)
        all_dfs.pop("Drop Down Boxes", None)
        all_dfs.pop("ResolutionLookups", None)
        
        for ws in list_data:  # Looping for excel sheet
            df1 = all_dfs[ws]
              
            if df1.iloc[3, 0] == "Client Representative" and df1.iloc[4, 1] == "DATE" and df1.iloc[4, 3] == "SHIFT":
                path_save = "C:\\Users\\ShantanuGupta\\Desktop\\Incoming\\Peel"
                df.to_excel(os.path.join(path_save, f"{fn.name}"), index=False)
            else:
                path_save = "C:\\Users\\ShantanuGupta\\Desktop\\Incoming\\Resolution"
                df.to_excel(os.path.join(path_save, f"{fn.name}"), index=False)
            
            
if __name__ == "__main__":
    start = time.time()
    args = parser.parse_args()
    path = Path(args.path)
    verbose = args.verbose
    main(path, verbose)  # Calling Main Function
    print("Processed time:", time.time() - start)  # Total Time  

      

Can anyone help me solve this problem?

Tags: python, pandas, xlsm

Solution


You can do this with ExcelWriter from pandas.

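import pandas as pd

# Create the writer with an .xlsx name; the workbook filename is switched to .xlsm below.
writer = pd.ExcelWriter('<filename>.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='<sheet_name>')

workbook = writer.book
workbook.filename = '<filename>.xlsm'
# An .xlsm file needs a VBA project. 'vbaProject.bin' is an assumed path to one
# extracted from an existing workbook with XlsxWriter's vba_extract.py utility.
workbook.add_vba_project('vbaProject.bin')
writer.close()  # writer.save() was removed in pandas 2.0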
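
Since the stated goal is to keep the original workbook untouched, another option may be to use pandas only for the check and then copy the original file as-is. The question's code writes an empty DataFrame to a name ending in .xlsm, which most likely produces a file whose contents do not match its extension, and that would explain the error Excel shows. The following is a minimal sketch under that assumption. The helper names (`cell`, `is_peel_sheet`, `choose_folder`, `copy_original`, `route_file`) are hypothetical, and treating a file as a match when any non-skipped sheet has the three marker cells is an assumption about the intended rule.

```python
import shutil
from pathlib import Path

import pandas as pd

# Sheet names the question's code drops before checking.
SKIP_SHEETS = {"Lookups", "Instructions For Use", "Drop Down Boxes", "ResolutionLookups"}


def cell(df, row, col):
    """Return df.iloc[row, col], or None when the sheet is too small."""
    if row < df.shape[0] and col < df.shape[1]:
        return df.iloc[row, col]
    return None


def is_peel_sheet(df):
    """Check the three marker cells used in the question."""
    return bool(
        cell(df, 3, 0) == "Client Representative"
        and cell(df, 4, 1) == "DATE"
        and cell(df, 4, 3) == "SHIFT"
    )


def choose_folder(sheets, peel_dir, resolution_dir):
    """Assumed rule: any matching non-skipped sheet sends the file to peel_dir."""
    for name, df in sheets.items():
        if name not in SKIP_SHEETS and is_peel_sheet(df):
            return peel_dir
    return resolution_dir


def copy_original(src, dest_dir):
    """Copy the file byte for byte, so macros and formatting are preserved."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))


def route_file(src, peel_dir, resolution_dir):
    """Read the workbook only to decide the destination, then copy the original."""
    sheets = pd.read_excel(src, sheet_name=None, header=None, engine="openpyxl")
    return copy_original(src, choose_folder(sheets, peel_dir, resolution_dir))
```

In the question's `main`, the two `df.to_excel(...)` calls could then be replaced by a single `route_file(fn, Path(...), Path(...))` call per file, with the Peel and Resolution folders passed as the two destination paths. Because the file is copied rather than rewritten, nothing is removed from it.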
