Python: List all file names containing a string in their column names

Problem description

I am new to Python. I have a folder with many subfolders containing Parquet files, 100+ GB of data in total; some individual files are also gigabytes in size. I am trying to list all the files that contain a column whose name includes "Email" (at the start, end, or middle), matched case-insensitively. The output should be written to a .txt file. I have tried the code below, but it is not working properly. Can someone help?

inp=["Email","Mail"]
    op=[]
    for elem in listOfFiles:
        if(elem.endswith(".parquet")):
            full_path=elem
            filename = elem.split(".")
            filename = filename[0]
            pfile=pq.read_table(elem)
           stri  =  str(pfile.schema)
            for val in inp:
                if(stri.count(val)>0):
                    op.append(full_path)

Tags: python, python-3.8

Solution


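import pyarrow.parquet as pq

# Indentation is now consistent; the mixed indent levels in the question raise an IndentationError.
inp = ["Email", "Mail"]
op = []
# listOfFiles is assumed to already hold the paths of all files in the folder tree
for elem in listOfFiles:
    if elem.endswith(".parquet"):
        # Read only the schema from the file footer instead of loading the whole table
        names = [name.lower() for name in pq.read_schema(elem).names]
        for val in inp:
            # Case-insensitive substring match against the top-level column names
            if any(val.lower() in name for name in names):
                op.append(elem)
                break  # avoid adding the same file once per matching keyword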

Try this. If you run into an error, post it here and I can help troubleshoot further.
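
For the full task in the question (walking the subfolders and writing the result to a .txt file), something along these lines may help. This is only a minimal sketch, assuming pyarrow is installed; the function names, the `read_names` parameter, and the paths in the usage note are hypothetical and not part of the original code. It reads just the schema with `pq.read_schema`, so the multi-GB data is never loaded, and it only checks top-level column names.

```python
import os


def read_parquet_column_names(path):
    """Return the top-level column names stored in a Parquet file's footer."""
    # Imported lazily so the matching logic below has no hard pyarrow dependency
    import pyarrow.parquet as pq
    return pq.read_schema(path).names


def has_matching_column(column_names, needles):
    """True if any column name contains any needle, ignoring case."""
    lowered = [needle.lower() for needle in needles]
    return any(needle in name.lower() for name in column_names for needle in lowered)


def find_parquet_files(root, needles, read_names=read_parquet_column_names):
    """Walk root recursively and collect .parquet paths with a matching column."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in sorted(filenames):
            if fname.lower().endswith(".parquet"):
                path = os.path.join(dirpath, fname)
                if has_matching_column(read_names(path), needles):
                    matches.append(path)
    return matches


def write_list(paths, out_path):
    """Write one path per line to a text file."""
    with open(out_path, "w", encoding="utf-8") as f:
        for p in paths:
            f.write(p + "\n")
```

Usage would look like `write_list(find_parquet_files("/path/to/folder", ["Email", "Mail"]), "matches.txt")`, where both paths are placeholders. Passing the schema reader in as a parameter is just a convenience so the matching can be tried out without real Parquet files.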

