python - How do I get a running total of a column in a CSV file based on a unique value in a different column?
Question
import csv

def getDataFromFile(filename, dataList):
    # Read every row of the CSV file into dataList
    file = open(filename, "r")
    csvReader = csv.reader(file)
    for aList in csvReader:
        dataList.append(aList)
    file.close()

def getTotalByYear(expendDataList):
    # Sum the third column (expenditure) over all rows
    total = 0
    for row in expendDataList:
        expenCount = float(row[2])
        total += expenCount
    Rtotal = input("Enter 'every' or a particular year. ")
    if Rtotal == 'every' or Rtotal == 'Every':
        print(total)
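As you can see, if you type every or Every, I get the running total of column 2, but I don't understand how to compute a running total for column 2 when it depends on a particular value in the first column.
In this case, my CSV file contains three columns of data: a year field, an item field, and an expenditure field. How can I get the sum of the expenditure field for a given year?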
expendDataList = []
fname = "expenditures.csv"
getDataFromFile(fname, expendDataList)
getTotalByYear(expendDataList)
Answer
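Producing running totals is a good task for a generator function. This example uses the built-in filter function to drop the rows for unwanted years (a generator expression or list comprehension could be used instead). It then iterates over the selected rows and yields the total after each one.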
import csv

def running_totals(year):
    with open('year-item-expenditure.csv', newline='') as f:
        reader = csv.DictReader(f)
        # A predicate of None makes filter() keep every (non-empty) row
        predicate = None if year.lower() == 'every' else lambda row: row['Year'] == year
        total = 0
        for row in filter(predicate, reader):
            total += float(row['Expenditure'])
            yield total

totals = running_totals('2019')
for total in totals:
    print(total)
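The answer mentions that a generator expression could replace filter. A minimal sketch of that variant follows. It reads from an in-memory string so it can be run without a file; the sample data and the name running_totals_from are made up for illustration, and the Year and Expenditure headers are assumed to match the answer's code.

```python
import csv
import io

# Hypothetical sample data; column names follow the answer's code.
SAMPLE = """Year,Item,Expenditure
2018,rent,100.0
2019,food,20.5
2019,fuel,30.0
2020,rent,110.0
"""

def running_totals_from(lines, year):
    """Yield the running total of Expenditure for rows matching year."""
    reader = csv.DictReader(lines)
    want_all = year.lower() == 'every'
    # Generator expression instead of filter(): keep all rows, or only the given year
    rows = (row for row in reader if want_all or row['Year'] == year)
    total = 0.0
    for row in rows:
        total += float(row['Expenditure'])
        yield total

print(list(running_totals_from(io.StringIO(SAMPLE), '2019')))  # prints [20.5, 50.5]
```

Passing any iterable of lines (an open file works too) keeps the function easy to test.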
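Another approach is to use itertools.accumulate, although you still have to do all the filtering yourself, so there is not much benefit unless you need the performance.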
import csv
import itertools

def running_totals(year):
    with open('year-item-expenditure.csv', newline='') as f:
        reader = csv.DictReader(f)
        predicate = None if year.lower() == 'every' else lambda row: row['Year'] == year
        # Create a generator expression that yields expenditures as floats
        expenditures = (float(row['Expenditure']) for row in filter(predicate, reader))
        for total in itertools.accumulate(expenditures):
            yield total
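The question also asks for the sum of the expenditure field for a given year. If only the final figure per year is needed rather than every intermediate value, one option is to accumulate into a dictionary keyed by year. This is a sketch under the same assumptions about the Year and Expenditure headers; the sample data and the name totals_by_year are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; column names follow the answer's code.
SAMPLE = """Year,Item,Expenditure
2018,rent,100.0
2019,food,20.5
2019,fuel,30.0
2020,rent,110.0
"""

def totals_by_year(lines):
    """Return a dict mapping each year (as a string) to its total expenditure."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row['Year']] += float(row['Expenditure'])
    return dict(totals)

result = totals_by_year(io.StringIO(SAMPLE))
print(result['2019'])  # prints 50.5
```

The file is read once, and the total for any year the user enters can then be looked up in the resulting dictionary.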