Showing the total of a column without repeating values

Problem description

I have a script which outputs a CSV with five columns. I've added two lines of code to sum two of those columns. This works, but the totals for these columns are repeated on every row, whereas I want the totals to be shown on one row only.

df['Unit Total'] = df['Units Sold'].sum()
df['Total Revenue'] = df['data_revenue'].sum()

This is what my script produces

8   0.013207    AR  ARS 0.105656    74012   575.2779
10  0.013207    AR  ARS 0.13207     74012   575.2779
6   0.013207    AR  ARS 0.079242    74012   575.2779
6   0.013207    AR  ARS 0.079242    74012   575.2779

What I actually want to see

8   0.013207    AR  ARS 0.105656    74012   575.2779
10  0.013207    AR  ARS 0.13207     
6   0.013207    AR  ARS 0.079242    
6   0.013207    AR  ARS 0.079242    

My Script

for filename in filelist:
    print(filename)
    df = pandas.read_csv('SYB_M_20171001_20171031.txt', header=None, encoding='utf-8', sep='\t', names=colnames,
                         skiprows=3, usecols=['Units Sold', 'Dealer Price', 'End Consumer Country', 'Currency Code']
                         )
    df['data_revenue'] = df['Units Sold'] * df['Dealer Price']
    df = df.sort_values(['End Consumer Country', 'Currency Code'])
    df['Unit Total'] = df['Units Sold'].sum()
    df['Total Revenue'] = df['data_revenue'].sum()
    df.to_csv(outfile + r"\output.csv", index=None)
    dflist.append(filename)

Tags: python, pandas

Solution


Assign the total only to the first row, looking up that row's index label by position:

df.loc[df.index[0], 'Unit Total'] = df['Units Sold'].sum()

df.loc[df.index[0], 'Total Revenue'] = df['data_revenue'].sum()
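To see why this leaves the other rows blank, here is a minimal sketch on made-up data (the values, the shuffled index and the variable names `first` and `csv_text` are assumptions for illustration, not taken from the question's file). Assigning through `df.loc` to a column that does not exist yet creates it with NaN in every other row, and `to_csv` writes NaN as an empty field by default:

```python
import pandas as pd

# Made-up sample data; the shuffled index imitates a frame after sort_values.
df = pd.DataFrame(
    {
        "Units Sold": [8, 10, 6, 6],
        "Dealer Price": [0.5, 0.5, 0.25, 0.25],
    },
    index=[3, 1, 2, 0],
)
df["data_revenue"] = df["Units Sold"] * df["Dealer Price"]

# df.index[0] is the label of the row that is physically first.
first = df.index[0]

# Writing to a not-yet-existing column fills all other rows with NaN.
df.loc[first, "Unit Total"] = df["Units Sold"].sum()
df.loc[first, "Total Revenue"] = df["data_revenue"].sum()

# NaN cells are written as empty fields (the default na_rep is '').
csv_text = df.to_csv(index=False)
print(csv_text)
```

If a different placeholder is wanted in the empty cells, the `na_rep` argument of `to_csv` should allow that.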

Another solution is to create a default index with reset_index(drop=True) after sorting, so that the first row can be addressed with label 0:

df = df.sort_values(['End Consumer Country', 'Currency Code']).reset_index(drop=True)

df.loc[0, 'Unit Total'] = df['Units Sold'].sum()
df.loc[0, 'Total Revenue'] = df['data_revenue'].sum()
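The reset matters because sort_values keeps the original index labels, so label 0 is not necessarily the first row any more. A small sketch with invented data (the country values and the names `sorted_df`, `labels_after_sort` and `clean_df` are hypothetical) illustrates the difference:

```python
import pandas as pd

# Invented sample data, not from the question's file.
df = pd.DataFrame(
    {
        "End Consumer Country": ["US", "AR", "BR"],
        "Units Sold": [5, 8, 3],
    }
)

sorted_df = df.sort_values("End Consumer Country")

# Sorting reorders the rows but keeps the old labels, so label 0 (US)
# now sits in the last row.
labels_after_sort = list(sorted_df.index)

# reset_index(drop=True) renumbers the rows 0..n-1 and discards the old labels.
clean_df = sorted_df.reset_index(drop=True)

# Label 0 is now guaranteed to be the first row of the sorted frame.
clean_df.loc[0, "Unit Total"] = clean_df["Units Sold"].sum()
print(clean_df)
```

Without the reset, `sorted_df.loc[0, ...]` would in this example write the total into the US row at the bottom rather than the first row.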
