NotADirectoryError when converting multiple XML files to .csv

Problem description

Code

   import os
   import pandas as pd

   df = pd.DataFrame()
   xml_file_path = "/Users/ruzi/Desktop/top1000_complete 2"
   csv_file_path = "/Users/ruzi/Desktop/xml.csv"
   if os.path.isdir(xml_file_path):
       for e in os.listdir(xml_file_path):
           new_path = xml_file_path + "/" + str(e)
           if str(e) != '.DS_Store' and os.path.isdir(new_path):
               for e1 in os.listdir(new_path):
                   next_new_path = new_path + "/" + str(e1)
                   if str(e1) != '.DS_Store' and os.path.isfile(next_new_path):
                       for e2 in os.listdir(next_new_path):
                           third_new_path = new_path + "/" + str(e1)
                           if str(e2) != '.DS_Store' and os.path.isfile(third_new_path):
                               data_frame = pd.read_xml(third_new_path)
                               df = df.append(data_frame)
                               data_frame = pd.DataFrame()
   # Convert Into CSV
   df.to_csv(csv_file_path, index=None)

Error message

     /Library/Frameworks/Python.framework/Versions/3.8/bin/python3     /Users/ruzi/Documents/pythonProject/main.py
     Traceback (most recent call last):
     File "/Users/ruzi/Documents/pythonProject/main.py", line 14, in <module>
     for e2 in os.listdir(next_new_path):
     NotADirectoryError: [Errno 20] Not a directory: '/Users/ruzi/Desktop/top1000_complete 2/P04-3022/citing_sentences_annotated.json'

     Process finished with exit code 1
    

[File location][1]

[Folder 1][2]

[Folder 2][3]

[1]: https://i.stack.imgur.com/UEeki.png
[2]: https://i.stack.imgur.com/2CmnR.png
[3]: https://i.stack.imgur.com/g5fVU.png

Tags: python

Solution


               for e1 in os.listdir(new_path):
                   next_new_path = new_path + "/" + str(e1)
                   if str(e1) != '.DS_Store' and os.path.isfile(next_new_path):
                       # At this point, next_new_path is a file, not a dir
                       for e2 in os.listdir(next_new_path):

Consider this file tree:

dir1/
    subdir1/
    subdir2/
    file1
    file2

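e1 is any entry inside new_path, whether a file or a directory (for example subdir1 or file1). next_new_path is the same entry with the parent path prepended (for example dir1/subdir1 or dir1/file1).

You then check that next_new_path is a file (not a directory), so you exclude dir1/subdir1 and dir1/subdir2 and keep only dir1/file1 and dir1/file2.

You then call listdir on that path, which is wrong because it is a file, and that is exactly what the error message says.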
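
To make this concrete, here is a minimal sketch that rebuilds the example tree in a temporary directory, reproduces the error by calling os.listdir on a file, and then collects the files directly instead of listing them. The helper name list_files_one_level is hypothetical and only serves the illustration; it is not part of the question's code.

```python
import os
import tempfile


def list_files_one_level(parent):
    """Return the paths of regular files directly inside `parent`."""
    files = []
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        # A file is collected as-is; os.listdir() is never called on it.
        if name != ".DS_Store" and os.path.isfile(path):
            files.append(path)
    return files


# Rebuild the example tree (dir1 with two subdirectories and two files).
with tempfile.TemporaryDirectory() as tmp:
    dir1 = os.path.join(tmp, "dir1")
    os.makedirs(os.path.join(dir1, "subdir1"))
    os.makedirs(os.path.join(dir1, "subdir2"))
    for name in ("file1", "file2"):
        open(os.path.join(dir1, name), "w").close()

    # Calling os.listdir() on a file reproduces the error from the question.
    try:
        os.listdir(os.path.join(dir1, "file1"))
        raised = False
    except NotADirectoryError:
        raised = True

    found = [os.path.basename(p) for p in list_files_one_level(dir1)]
```

Applied to the question's loop, this suggests dropping the innermost for e2 loop and reading next_new_path directly once it has passed the isfile check.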


In 2021, I would recommend using pathlib rather than os.path.
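
As a rough sketch of what that could look like, the nested os.listdir loops can be replaced by a single recursive glob. This assumes the files to convert end in .xml, which the question does not confirm (the traceback shows a .json file). The function names find_xml_files and xml_files_to_csv are hypothetical. The second function uses pd.concat because DataFrame.append was removed in pandas 2.0.

```python
from pathlib import Path


def find_xml_files(root):
    """Return every *.xml file below `root`, at any depth, in sorted order."""
    # rglob() walks subdirectories recursively; is_file() guards against
    # a directory whose name happens to end in ".xml".
    return sorted(p for p in Path(root).rglob("*.xml") if p.is_file())


def xml_files_to_csv(root, csv_path):
    """Read every XML file under `root` and write the combined rows to `csv_path`."""
    import pandas as pd  # imported here so find_xml_files() works without pandas

    frames = [pd.read_xml(path) for path in find_xml_files(root)]
    if frames:  # pd.concat() raises ValueError on an empty list
        pd.concat(frames, ignore_index=True).to_csv(csv_path, index=False)
```

With the paths from the question, the call would be xml_files_to_csv("/Users/ruzi/Desktop/top1000_complete 2", "/Users/ruzi/Desktop/xml.csv"). Hidden files such as .DS_Store are skipped automatically because they do not match the *.xml pattern.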

