Unable to assign values to a pandas DataFrame column from a list using iteration

Problem description

I have reduced my original code to a much simplified version, but this is where the main problem occurs. Using the following code:

import pandas as pd

Sp=pd.DataFrame()
l1=['a', 'b', 'c']
for i in l1:
    Sp['col1'] = i

This gives me Sp with only the column header and no rows:

col1

I want col1 to hold the values a, b and c. Could anyone explain why this is happening and how to fix it?

EDIT:

For every value in my list, I open a different file using os (the file names are built from the list values). After reading the CSV file, I compute values such as the mean, deviation, etc. of its data and assign them to other columns of Sp. My final Sp should look something like this:

col1    Mean    Median  Deviation
a       1       1.1     0.5
b       2       2.1     0.5
c       3       3.1     0.5

Tags: python-3.x, list, pandas

Solution


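EDIT: If you need to create and process a DataFrame in each loop iteration, append each resulting DataFrame to a list as you go. At the end, use concat to join all the aggregated DataFrames together: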

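import pandas as pd

dfs = []
l1 = ['a', 'b', 'c']
for i in l1:
    file = f'{i}.csv'  # assumed file naming; build the real path from the list value
    df = pd.read_csv(file)
    df = df.groupby('col').agg({'col1':'mean', 'col2':'sum'})
    # further processing goes here
    dfs.append(df)

Sp = pd.concat(dfs, ignore_index=True)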
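For the table shown in the question, the same append-then-concat pattern could look roughly like the sketch below. It is only an illustration under some assumptions: the in-memory CSV strings in `fake_files` stand in for the real files on disk, and the column name `value` is hypothetical, since the question does not show the file layout.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the CSV files on disk, keyed by the list values.
fake_files = {
    'a': "value\n1\n2\n3\n",
    'b': "value\n2\n4\n6\n",
    'c': "value\n3\n3\n3\n",
}

l1 = ['a', 'b', 'c']
dfs = []
for name in l1:
    # With real files this would be pd.read_csv(<path built from name>).
    data = pd.read_csv(io.StringIO(fake_files[name]))
    # One-row frame holding the summary statistics for this file.
    row = pd.DataFrame({
        'col1': [name],
        'Mean': [data['value'].mean()],
        'Median': [data['value'].median()],
        'Deviation': [data['value'].std()],
    })
    dfs.append(row)

# Stack the one-row frames into the final table with a fresh 0..n-1 index.
Sp = pd.concat(dfs, ignore_index=True)
print(Sp)
```

Each iteration produces a complete row, so nothing is written into Sp until all the files have been processed.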

Old answer:

I think you need to call the DataFrame constructor with the list:

Sp = pd.DataFrame({'col1':l1})
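As for why the original loop leaves col1 empty: as far as I understand it, assigning a scalar to a column fills the existing rows, and a freshly created DataFrame has none. The small sketch below contrasts the two cases; the variable name `empty` is only for illustration.

```python
import pandas as pd

l1 = ['a', 'b', 'c']

# Scalar assignment on a frame with no rows: the column is created,
# but there are no index labels to fill, so it stays empty.
empty = pd.DataFrame()
for i in l1:
    empty['col1'] = i
print(len(empty))  # number of rows in the frame

# Passing the whole list at once creates one row per element.
Sp = pd.DataFrame({'col1': l1})
print(Sp)
```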

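If you really need to assign inside the loop, it is possible, but it is the slowest solution:

6) Updating an empty frame one row at a time. I have seen this method used way too much. It is by far the slowest. It is probably commonplace (and reasonably fast for some Python structures), but a DataFrame does a fair number of checks on indexing, so updating a row at a time will always be very slow. It is much better to create new structures and concat.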

Sp=pd.DataFrame()
l1=['a', 'b', 'c']
for j, i in enumerate(l1):
    Sp.loc[j, 'col1'] = i

print (Sp)
  col1
0    a
1    b
2    c
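If the loop is needed only to compute the values, one common alternative to row-by-row `.loc` assignment is to collect plain Python records and build the frame once at the end. This is a minimal sketch of that idea, not something from the original answer; the name `rows` is just illustrative.

```python
import pandas as pd

l1 = ['a', 'b', 'c']

# Collect one dict per row in an ordinary list; appending to a list is cheap.
rows = []
for i in l1:
    rows.append({'col1': i})

# Build the DataFrame in a single step from the collected records.
Sp = pd.DataFrame(rows)
print(Sp)
```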
