Dynamically generating DataFrames from another DataFrame

Problem description

I need help generating a dynamic list of files to be opened with pandas.read_csv().

from os import listdir
from os.path import isfile, join

import pandas as pd

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
_ExtensionToLookFor = '.csv'
# Setting up boolean to check for CSV
_isCsv = None
# Returning all files in folder ** If the folder needs to be changed - FilePath #
_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath, _Files))]
_FileReturn = pd.DataFrame(_FileReturn, columns=['Files'])
# Returning only CSV files (endswith avoids treating '.' as a regex wildcard) #
_FileReturn = _FileReturn[_FileReturn['Files'].str.endswith(_ExtensionToLookFor)]

class SpcFormat:
    def __init__(self, _FileReturn):
        self._FileReturn = _FileReturn
    def __DataframeCreation__(self):
        _FileReturn = self._FileReturn
        for i in _FileReturn:
            StartInt = 1

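I'm having trouble finishing this. Near the bottom I'm trying to iterate through the list and give each DataFrame a name that corresponds to its position in the count.

So in pseudocode it should look like this: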

For Files in _FileReturn:
Create New DataFrame(StartInt) = Pandas.Read_csv(DataFrame(IntPosition)+_ExtensionToLookFor)
StartInt ++ // Add One

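Thanks!

*Edit: for clarity, what I want to do is check a folder, return all the files in it, filter them by a specific file type, and then dynamically create DataFrames with a naming scheme based on the number of CSV files retrieved.*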

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
# Returning all files in folder ** If the folder needs to be changed - FilePath #
_FileReturn  = glob(_FilePath + '\\' + '*.csv')
#_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath,_Files))]
_FileReturn = pd.DataFrame(_FileReturn)
_FileReturn.columns = ['Files']

# Returning only CSV files #
_Files  = {
        'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'],sep=',',encoding='latin')
        for _FileReturnName in range(len(_FileReturn['Files']))
          }

The code above incorporates part of the answer from @J. Doe, although this is what I'm getting back:

_Files  = {
        'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'],sep=',',encoding='latin')
        for _FileReturnName in range(len(_FileReturn['Files']))
          }
Traceback (most recent call last):

  File "<ipython-input-3-a7027d4eb492>", line 3, in <module>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "<ipython-input-3-a7027d4eb492>", line 3, in <dictcomp>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 413, in _read
    filepath_or_buffer, encoding, compression)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\common.py", line 232, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))

ValueError: Invalid file path or buffer object type: <class 'pandas.core.series.Series'>

Any further help would be greatly appreciated!

Tags: python, pandas

Solution


You could consider the following:

import pandas as pd
from glob import glob

path = 'path/to/csv/folder/'
# get all csv files in the folder
files = glob(path + '*.csv')
# build a dictionary mapping 'csv_0', 'csv_1', ... to the DataFrame read from each file
df_dict = {'csv_' + str(k): pd.read_csv(f, sep=',', encoding='latin')
           for k, f in enumerate(files)}
# then access each DataFrame by key, e.g. df_dict['csv_0'] for the first file
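
The ValueError in the question's traceback appears to come from passing the whole `_FileReturn['Files']` column (a pandas Series) to `pd.read_csv`, which expects a single path or buffer per call. Below is a minimal sketch of two ways around that, under some assumptions: the function names `load_csv_folder` and `load_from_file_table`, the `prefix` parameter, and the use of `pathlib` with sorted file order are illustrative choices, not part of the original answer.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_folder(folder, prefix="csv_", encoding="latin"):
    """Read every *.csv file in `folder` into a dict keyed prefix + position."""
    # Sorting is an added assumption: it keeps the numbering stable between runs.
    files = sorted(Path(folder).glob("*.csv"))
    return {f"{prefix}{i}": pd.read_csv(f, sep=",", encoding=encoding)
            for i, f in enumerate(files)}


def load_from_file_table(file_table, column="Files", prefix="csv_", encoding="latin"):
    """Same idea when the paths are already stored in a DataFrame column."""
    # Iterating over the column yields one path string at a time,
    # so read_csv never receives the whole Series.
    return {f"{prefix}{i}": pd.read_csv(path, sep=",", encoding=encoding)
            for i, path in enumerate(file_table[column])}


if __name__ == "__main__":
    # Small demonstration on throwaway files in a temporary folder.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "first.csv").write_text("x,y\n1,2\n")
        (Path(tmp) / "second.csv").write_text("x,y\n3,4\n")
        frames = load_csv_folder(tmp)
        for name, frame in frames.items():
            print(name, frame.shape)
```

If the file names are kept in a DataFrame as in the question, the second function can be called as `load_from_file_table(_FileReturn)`. A dictionary is used instead of dynamically named variables because the keys can be generated and looked up at run time.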
