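python - Adding a column in pandas
Problem description
After adding a column to the DataFrame, I am not getting the correct output file. This is my input file: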
Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46
Expected output
20,1,20MICRONS,EQ,53466,27284,51.03,01022018
20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
20,3,3MINDIA,EQ,2613,1826,69.88,01022018
20,4,5PAISA,EQ,8463,5230,61.80,01022018
20,5,63MOONS,EQ,324922,131478,40.46,01022018
My code
import pandas as pd
df = pd.read_csv('C:/Working/dalal/MTO_11052018.DAT', sep='\t',skiprows=1)
df=df.iloc[1]
l1=list(str(df).split(","))
l2=l1[2]
df2=pd.read_csv('C:/Working/dalal/MTO_11052018.DAT',sep='\t',skiprows=3)
df2['Trans_dt']=df2.apply(lambda row:[l2],axis=1)
df2.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT',sep=',')
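I am not getting the expected output. Please help.
Solution
I think you need header=1 to use the second line of the file as the column names, nrows=0 to read no data rows, and usecols=[2] to read only the third column: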
import pandas as pd
temp=u"""Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
20,4,5PAISA,EQ,8463,5230,61.80
20,5,63MOONS,EQ,324922,131478,40.46"""
# pd.compat.StringIO was removed in pandas 0.25; io.StringIO is the replacement
from io import StringIO
# after testing, replace StringIO(temp) with 'C:/Working/dalal/MTO_11052018.DAT'
a = pd.read_csv(StringIO(temp), nrows=0, header=1, usecols=[2]).columns
print (a)
Index(['01022018'], dtype='object')
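As an alternative to a second pandas pass, the trade date could also be picked out of the control line with the standard csv module. This is only a sketch, not the method used in the answer: the helper name read_trade_date is hypothetical, and it assumes the date is always the third field of the second line, as in the sample file.

```python
import csv
from io import StringIO

# First three lines of the sample file from the question.
sample = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
"""

def read_trade_date(handle):
    """Return the third field of the second line (hypothetical helper)."""
    reader = csv.reader(handle)
    next(reader)            # skip the title line
    control = next(reader)  # e.g. ['10', 'MTO', '01022018', ...]
    return control[2]       # kept as text, so the leading zero survives

trade_date = read_trade_date(StringIO(sample))
print(trade_date)  # 01022018
```

With a real file, open('C:/Working/dalal/MTO_11052018.DAT', newline='') would take the place of StringIO(sample).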
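Then read all the necessary data and use assign to add the new column:
from io import StringIO
# after testing, replace StringIO(temp) with 'C:/Working/dalal/MTO_11052018.DAT'
# the header line has 6 names but each data row has 7 fields, so the first field (20) becomes the index
df = pd.read_csv(StringIO(temp), skiprows=3).assign(Trans_dt=a[0])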
print (df)
    Record Type  ...  Trans_dt
20            1  ...  01022018
20            2  ...  01022018
20            3  ...  01022018
20            4  ...  01022018
20            5  ...  01022018

[5 rows x 7 columns]
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT')
# if the column names should be left out of the file
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', header=False)
Or, if a default RangeIndex is needed, something like:
from io import StringIO
# after testing, replace StringIO(temp) with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=3).rename_axis('val').reset_index().assign(Trans_dt=a[0])
print (df)
   val  ...  Trans_dt
0   20  ...  01022018
1   20  ...  01022018
2   20  ...  01022018
3   20  ...  01022018
4   20  ...  01022018

[5 rows x 8 columns]
If the column names are not important:
from io import StringIO
# after testing, replace StringIO(temp) with 'C:/Working/dalal/MTO_11052018.DAT'
df = pd.read_csv(StringIO(temp), skiprows=4, header=None).assign(Trans_dt=a[0])
print (df)
    0  1           2   3        4        5      6  Trans_dt
0  20  1   20MICRONS  EQ    53466    27284  51.03  01022018
1  20  2  3IINFOTECH  EQ  7116046  3351489  47.10  01022018
2  20  3     3MINDIA  EQ     2613     1826  69.88  01022018
3  20  4      5PAISA  EQ     8463     5230  61.80  01022018
4  20  5     63MOONS  EQ   324922   131478  40.46  01022018
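If readable column names are wanted instead of 0 to 6, one option is to pass names= explicitly, since the header line in the file has only six names for seven fields. The sketch below uses a shortened two-row sample. The names in the list are hypothetical labels chosen for illustration (in particular, calling the EQ field "series" is an assumption), and trade_date stands in for a[0] from the earlier step.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the sample file: 4 preamble lines, then 2 data rows.
sample = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
"""

# Hypothetical names, one per field in a data row (7 in total).
names = ["record_type", "sr_no", "security", "series",
         "qty_traded", "qty_deliverable", "pct_deliverable"]

trade_date = "01022018"  # stands in for a[0] from the earlier step

# skiprows=4 drops the preamble and the 6-name header line;
# header=None plus names= labels the 7 fields explicitly.
df = pd.read_csv(StringIO(sample), skiprows=4, header=None, names=names)
df = df.assign(Trans_dt=trade_date)
print(df.columns.tolist())
```

Because every field now has its own name, no column is silently promoted to the index.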
Finally:
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False)
# if the column names should be left out of the file
df.to_csv('C:/Working/dalal/deldata/MTO_11052018.OUT', index=False, header=False)
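Put together, the steps above can be sketched end to end on an in-memory copy of the file. One assumption goes beyond the answer: dtype=str is passed so that values such as 47.10 are written back exactly as they were read, instead of being parsed as floats and written as 47.1. The function name add_trade_date is hypothetical.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the sample file: 4 preamble lines, then 3 data rows.
sample = """Security Wise Delivery Position - Compulsory Rolling Settlement
10,MTO,01022018,592287763,0001583
Trade Date <01-FEB-2018>,Settlement Type <N>,Settlement No <2018023>,Settlement Date <05-FEB-2018>
Record Type,Sr No,Name of Security,Quantity Traded,Deliverable Quantity(gross across client level),% of Deliverable Quantity to Traded Quantity
20,1,20MICRONS,EQ,53466,27284,51.03
20,2,3IINFOTECH,EQ,7116046,3351489,47.10
20,3,3MINDIA,EQ,2613,1826,69.88
"""

def add_trade_date(text):
    """Append the trade date from line 2 to every data row (hypothetical helper)."""
    # header=1: the second line supplies the column names;
    # nrows=0: read no data rows; usecols=[2]: keep only the third field.
    trade_date = pd.read_csv(StringIO(text), header=1, nrows=0,
                             usecols=[2]).columns[0]
    # dtype=str (an assumption) keeps every field as text, so 47.10 stays 47.10.
    rows = pd.read_csv(StringIO(text), skiprows=4, header=None, dtype=str)
    return rows.assign(Trans_dt=trade_date)

# With no path argument, to_csv returns the CSV text instead of writing a file.
out = add_trade_date(sample).to_csv(index=False, header=False)
lines = out.splitlines()
print(lines[1])  # 20,2,3IINFOTECH,EQ,7116046,3351489,47.10,01022018
```

For the real file, the two StringIO(text) arguments would become the path to the .DAT file, and to_csv would be given the .OUT path.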