Sum values depending on another column in Python without pandas

Question

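I am trying to sum these values, but I get the error: unsupported operand type(s) for +: 'int' and 'str'. After changing the assignment to Change_Point = int(float(row[6])), I get: 'int' object is not iterable. I just want to sum Change_Point per Order_ID without using pandas.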

import csv
import json

with open('sample.csv','r') as file:
    rows = csv.reader(file,delimiter='|')
    next(rows, None)
    y = []
    orders = {}
    for row in rows:
        PointRec_ID = row[0]        
        Opeartion = row[1]        
        Member_ID = row[2]                
        Order_ID = row[3]        
        Point_Valid_Date = row[4]        
        Point_Invalid_Date = row[5]        
        Change_Point = row[6]  
        Accumulative_Point = row[7]       
        if not Order_ID in orders :
            orders[Order_ID] = {
                'PointRec_ID': PointRec_ID,
                'Opeartion': Opeartion,
                'Member_ID': Member_ID,
                'Order_ID': Order_ID,
                'Point_Valid_Date': Point_Valid_Date,
                'Point_Invalid_Date': Point_Invalid_Date,
                'Change_Point': sum(Change_Point),
                'Accumulative_Point': Accumulative_Point,                
            }
        order = orders[Order_ID]
    for Order_ID in orders:        
        y.append(orders[Order_ID])
print(json.dumps(y))

sample.csv:

PointRec_ID|Opeartion|Member_ID|Order_ID|Point_Valid_Date|Point_Invalid_Date|Change_Point|Accumulative_Point
20200819000001760|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|639|934
20200819000001761|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|0|934
20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|1|935
20200819000001763|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|89|90
20200819000001764|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|699|789
20200819000001765|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|789
20200819000001766|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1|790
20200819000001767|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|790
20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1169|1959

Expected result (if I output CSV):

20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|640|935
20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1958|1959

Any help would be appreciated.

Tags: python, json, csv

Answer


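You need to add each row's Change_Point to the value already stored for that Order_ID, instead of calling sum() on a single field.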
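As a small illustration of where the two errors in the question come from, the sketch below reproduces them in isolation. The variable names (value, first_error, second_error, total) are made up for this example; the only assumption is that csv.reader returns every field as a string, which is its default behavior.

```python
# csv.reader yields every field as a string, e.g. '639'.
value = '639'

try:
    # sum() iterates over the characters '6', '3', '9' and tries 0 + '6'.
    sum(value)
except TypeError as exc:
    first_error = str(exc)

try:
    # An int is not iterable, so sum() cannot loop over it.
    sum(int(value))
except TypeError as exc:
    second_error = str(exc)

print(first_error)
print(second_error)

# Converting each field to int and adding it to a running total works.
total = 0
for field in ['639', '0', '1']:
    total += int(field)
print(total)
```

The full script therefore keeps one running total per Order_ID: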

import csv

with open('sample.csv', 'r', newline='') as file:
    rows = csv.reader(file, delimiter='|')
    # Keep the header row as the first row of the output.
    y = [next(rows, None)]

    orders = {}
    for row in rows:
        PointRec_ID = row[0]
        Opeartion = row[1]
        Member_ID = row[2]
        Order_ID = row[3]
        Point_Valid_Date = row[4]
        Point_Invalid_Date = row[5]
        Change_Point = row[6]
        Accumulative_Point = row[7]

        if Order_ID not in orders:
            # First row for this order: store it, converting Change_Point to int.
            orders[Order_ID] = {
                'PointRec_ID': PointRec_ID,
                'Opeartion': Opeartion,
                'Member_ID': Member_ID,
                'Order_ID': Order_ID,
                'Point_Valid_Date': Point_Valid_Date,
                'Point_Invalid_Date': Point_Invalid_Date,
                'Change_Point': int(Change_Point),
                'Accumulative_Point': Accumulative_Point,
            }
        else:
            # Order already seen: add this row's points to the running total.
            orders[Order_ID]['Change_Point'] += int(Change_Point)

    # Turn each aggregated order back into a list of column values.
    for Order_ID in orders:
        y.append(list(orders[Order_ID].values()))
print(y)
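Note that the code above keeps PointRec_ID and Accumulative_Point from the first row of each order, whereas the expected result in the question shows the values from the last row. The following is a minimal sketch of one way to match that output, assuming the last row per Order_ID is the one to keep. The function name sum_change_points is hypothetical, and the sample data is read from an in-memory string rather than sample.csv so the example is self-contained.

```python
import csv
import io

def sum_change_points(lines):
    """Collapse rows to one per Order_ID, summing Change_Point (column 6)."""
    reader = csv.reader(lines, delimiter='|')
    header = next(reader, None)
    merged = {}  # Order_ID -> newest row, with Change_Point as a running int total
    for row in reader:
        order_id = row[3]
        total = int(row[6])
        if order_id in merged:
            total += merged[order_id][6]
        # Overwrite with the newest row so the other columns come from the last row.
        merged[order_id] = row[:6] + [total] + row[7:]
    return header, list(merged.values())

# The sample data from the question, held in memory for this example.
sample = io.StringIO(
    "PointRec_ID|Opeartion|Member_ID|Order_ID|Point_Valid_Date|Point_Invalid_Date|Change_Point|Accumulative_Point\n"
    "20200819000001760|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|639|934\n"
    "20200819000001761|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|0|934\n"
    "20200819000001762|Point gain|00100224165|AD031SA12016866|2020-08-23 16:00:00|2021-08-23 16:00:00|1|935\n"
    "20200819000001763|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|89|90\n"
    "20200819000001764|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|699|789\n"
    "20200819000001765|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|789\n"
    "20200819000001766|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1|790\n"
    "20200819000001767|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|0|790\n"
    "20200819000001768|Point gain|00101206808|AD031SA12016867|2020-08-23 16:00:00|2021-08-23 16:00:00|1169|1959\n"
)

header, result = sum_change_points(sample)

# Write the aggregated rows back out in the same pipe-delimited format.
out = io.StringIO()
writer = csv.writer(out, delimiter='|', lineterminator='\n')
writer.writerows(result)
print(out.getvalue(), end='')
```

To read from the real file instead, pass the object returned by open('sample.csv', newline='') to sum_change_points in place of the in-memory string.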
