python - pandas DataFrame filter / deriving scalar values from a column
Problem description
My data is in a CSV, as shown below:
(m-M),err(m-M),D,Method,Refcode,Notes,SN Name,Redshift,H0,LMCModulus
28.96,0.20,6.190,SNII optical,2017ApJ...841..127M,EPM,SN 2013ej,,,
29.13,,6.700,SNII optical,2004A&A...427..453V,EPM,SN 2002ap,,,
29.29,,7.200,SNII optical,2006PASP..118..351V,,SN 2003gd,,,
29.94,0.54,9.730,SNII optical,2010ApJ...715..833O,"SCM, I",SN 2003gd,,,
29.98,0.28,9.910,SNII optical,2010ApJ...715..833O,"SCM, BVI",SN 2003gd,,,
29.98,0.55,9.910,SNII optical,2010ApJ...715..833O,"SCM, V",SN 2003gd,,,
29.99,0.42,9.950,SNII optical,2010ApJ...715..833O,"SCM, B",SN 2003gd,,,
30.01,0.07,10.000,SNII optical,2014AJ....148..107R,"V, photospheric magnitude method",SN 2013ej,,,
26.72,0.69,2.210,Tully-Fisher,1984A&AS...56..381B,B,,,103.00,
29.93,0.40,9.700,Tully-Fisher,1988NBGC.C....0000T,B,,,75.00,
My code is:
import pandas as pd
from pandas import DataFrame
d = pd.read_csv('ngc0628_zid.csv')
d # Whole of the CSV prints OK
d.loc[:, 'D':'Method']
sub_d = d.loc[d['Method'] == 'SNII optical'] # Filter for 'SNII Optical' only - OK
sub_d.loc[:, 'D':'Method'] # Just report columns 'D' and 'Method' - OK
maxColumn = sub_d.max(axis=0)
maxColumn # Prints max of all values
minColumn = sub_d.min(axis=0)
minColumn # Prints min of all values
meanColumn = sub_d.mean(axis=0)
meanColumn # Prints mean of all values
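Problem: I can't find a way to select just the 'D' column when computing the mean, max, and min without getting a syntax error. In every case I only get a table of values, not the 3 scalars I need.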
Solution
IIUC,
import pandas as pd
import numpy as np
from io import StringIO
csvfile = StringIO("""(m-M),err(m-M),D,Method,Refcode,Notes,SN Name,Redshift,H0,LMCModulus
28.96,0.20,6.190,SNII optical,2017ApJ...841..127M,EPM,SN 2013ej,,,
29.13,,6.700,SNII optical,2004A&A...427..453V,EPM,SN 2002ap,,,
29.29,,7.200,SNII optical,2006PASP..118..351V,,SN 2003gd,,,
29.94,0.54,9.730,SNII optical,2010ApJ...715..833O,"SCM, I",SN 2003gd,,,
29.98,0.28,9.910,SNII optical,2010ApJ...715..833O,"SCM, BVI",SN 2003gd,,,
29.98,0.55,9.910,SNII optical,2010ApJ...715..833O,"SCM, V",SN 2003gd,,,
29.99,0.42,9.950,SNII optical,2010ApJ...715..833O,"SCM, B",SN 2003gd,,,
30.01,0.07,10.000,SNII optical,2014AJ....148..107R,"V, photospheric magnitude method",SN 2013ej,,,
26.72,0.69,2.210,Tully-Fisher,1984A&AS...56..381B,B,,,103.00,
29.93,0.40,9.700,Tully-Fisher,1988NBGC.C....0000T,B,,,75.00,""")
df = pd.read_csv(csvfile)
vmin, vmax, vmean, vmedian = df['D'].agg(['min', 'max', 'mean', 'median'])
print(vmin)
print(vmax)
print(vmean)
print(vmedian)
print(f'The min is {vmin}. The max is {vmax}. The mean is {vmean}. The median is {vmedian}.')
Output:
2.21
10.0
8.15
9.715
The min is 2.21. The max is 10.0. The mean is 8.15. The median is 9.715.
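Note that the values above are computed over every row, including the Tully-Fisher ones. If the goal is the scalars for the 'SNII optical' rows only, as in the question, one possible approach is to pass both the row filter and the column label to `.loc`, which yields a single Series whose `min()`, `max()`, and `mean()` each return a plain number. The following is a minimal sketch, assuming a trimmed copy of the sample data (only the `D` and `Method` columns); the names `csv_text`, `snii_d`, `d_min`, `d_max`, and `d_mean` are illustrative and not from the original post.

```python
from io import StringIO

import pandas as pd

# Trimmed copy of the sample data: only the two columns needed here (assumption).
csv_text = """D,Method
6.190,SNII optical
6.700,SNII optical
7.200,SNII optical
9.730,SNII optical
9.910,SNII optical
9.910,SNII optical
9.950,SNII optical
10.000,SNII optical
2.210,Tully-Fisher
9.700,Tully-Fisher
"""

df = pd.read_csv(StringIO(csv_text))

# Boolean row filter plus a single column label gives a Series, not a DataFrame.
snii_d = df.loc[df['Method'] == 'SNII optical', 'D']

# Each reduction on a Series returns one scalar.
d_min = snii_d.min()
d_max = snii_d.max()
d_mean = snii_d.mean()

print(f'min={d_min}, max={d_max}, mean={d_mean}')
```

Calling `sub_d.max(axis=0)` on the whole filtered DataFrame reduces every column at once, which is why the question's code produced a table of values; selecting `'D'` first reduces only that column.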