Creating a column of continuous values across split csv files

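Problem description

I have a large csv file that I split into six separate files. I am using a for loop to read each file and create a column whose values increase by one per row.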

import numpy as np
import pandas as pd

whole_file = ['100Hz1-raw.csv', '100Hz2-raw.csv', '100Hz3-raw.csv', '100Hz4-raw.csv', '100Hz5-raw.csv', '100Hz6-raw.csv']

first_file = True

for piece in whole_file:
    if not first_file:
        skip_row = [0]  # if it is not the first csv file then skip the header row (row 0) of that file
    else:
        skip_row = []
    V_raw = pd.read_csv(piece)
    V_raw['centiseconds'] = np.arange(len(V_raw))  # label each centisecond; restarts at 0 for every file

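My output: the centiseconds column starts again at 0 in every file.

The output I want: the centiseconds column keeps counting across the files, so each file continues where the previous one stopped.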

Is there a clever way to do what I want?

Tags: python, dataframe, csv

Solution


Keep track of the last centisecond value reached and continue counting from there:

import numpy as np
import pandas as pd

whole_file = ['100Hz1-raw.csv', '100Hz2-raw.csv', '100Hz3-raw.csv', '100Hz4-raw.csv', '100Hz5-raw.csv', '100Hz6-raw.csv']

first_file = True

# running total of rows labelled so far
old_centiseconds = 0

for piece in whole_file:
    if not first_file:
        skip_row = [0]  # if it is not the first csv file then skip the header row (row 0) of that file
    else:
        skip_row = []
    V_raw = pd.read_csv(piece)

    # start this file's labels where the previous file stopped
    V_raw['centiseconds'] = np.arange(len(V_raw)) + old_centiseconds  # label each centisecond

    # update the running total for the next file
    old_centiseconds += len(V_raw)
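
The running-offset idea above can be tried without the original data. The following is only a minimal sketch: the function name label_centiseconds, the voltage column, and the two in-memory csv pieces are hypothetical stand-ins for the six 100Hz files, and each piece is assumed to carry its own header row.

```python
import io

import numpy as np
import pandas as pd


def label_centiseconds(sources, column="centiseconds"):
    """Read each csv source and add a counter that continues across sources."""
    offset = 0  # number of rows seen in all previous sources
    frames = []
    for source in sources:
        frame = pd.read_csv(source)
        # shift this piece's 0..n-1 labels by the rows already counted
        frame[column] = np.arange(len(frame)) + offset
        offset += len(frame)
        frames.append(frame)
    return frames


# Hypothetical stand-ins for the split files: small in-memory csv pieces,
# each with its own header row (an assumption about the real files).
part_1 = io.StringIO("voltage\n0.1\n0.2\n0.3\n")
part_2 = io.StringIO("voltage\n0.4\n0.5\n")

frames = label_centiseconds([part_1, part_2])
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

If the pieces are only ever needed as one table, another option is probably to concatenate them first with pd.concat(..., ignore_index=True) and then assign np.arange(len(combined)) once, which gives the same labels without tracking an offset.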
