python - Classifying and printing information from a vector

Question

I'm a beginner at programming and don't know Python well. I have code that reads information from a CSV file:

23;1;42.8
21;1;....

I figured the information would be easier to manage in a vector, so I wrote the following code:

import csv

with open("city_traffic.csv") as file_csv:
    csv_reader = csv.reader(file_csv, delimiter=';')

    cities=[]
    for line in csv_reader: 
        new_list=[]
        i=int(line[0])
        k=int(line[1])
        j=float(line[2])
        new_list.append(k)
        new_list.append(i)
        new_list.append(j)      

        cities.append(new_list)

    for s in cities:
        print("City: "+str(s[0])+ ". Total Amount of Traffic: "+str(s[2])+ ". Rush Hour: "+str(s[1]))

The output is:

City: 1. Total Amount of Traffic: 42.8. Rush Hour: 23
City: 1. Total Amount of Traffic: 89.1. Rush Hour: 21
City: 4. Total Amount of Traffic: 60.5. Rush Hour: 2
City: 4. Total Amount of Traffic: 50.6. Rush Hour: 10
City: 3. Total Amount of Traffic: 44.2. Rush Hour: 10

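My question is: is there a way to classify or sort the information so that I can add up the total traffic for each city and then show the hour with the most traffic? For example: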

City: 1 Total Amount of Traffic: 131.9 Rush Hour: 21
City: 4 Total Amount of Traffic: 111.1 Rush Hour: 2
City: 3 Total Amount of Traffic: 44.2 Rush Hour: 10

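As you can see, I don't have any code for this last part. I have been struggling with it, and I would appreciate any suggestions on how to do it or how to improve my code. Thank you.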

Tags: python, sorting, vector

Answer

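You should use pandas for this. It has many useful functions, and most of them don't require a for-loop.

To start, you can read the file and assign column names in a single line of code (not counting the import):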

import pandas as pd

df = pd.read_csv('city_traffic.csv', sep=';', names=['Rush', 'City', 'Traffic'])

You can display it:

print(df)

Result:

   Rush  City  Traffic
0    23     1     42.8
1    21     1     89.1
2     2     4     60.5
3    10     4     50.6
4    10     3     44.2

It also lets you display only certain columns or rows:

print(df[ df['City'] == 1 ])

Result:

   Rush  City  Traffic
0    23     1     42.8
1    21     1     89.1
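
The row filter above works through a boolean mask: the comparison produces one True/False value per row, and indexing with it keeps only the True rows. A minimal sketch of that idea, reusing the sample data; the second condition (`Traffic > 55`) is only an illustration and is not part of the question:

```python
import io
import pandas as pd

text = """23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2"""

df = pd.read_csv(io.StringIO(text), sep=';', names=['Rush', 'City', 'Traffic'])

# The comparison yields one boolean per row.
mask = df['City'] == 1
print(mask.tolist())

# Conditions can be combined with & (and) or | (or); each one needs parentheses.
busy_in_4 = df[(df['City'] == 4) & (df['Traffic'] > 55)]
print(busy_in_4)
```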

Or, if you need to use a for-loop:

for index, row in df.iterrows():
    print(f"City: {row['City']}. Total Amount of Traffic: {row['Traffic']}. Rush Hour: {row['Rush']}")

Result:

City: 1.0. Total Amount of Traffic: 42.8. Rush Hour: 23.0
City: 1.0. Total Amount of Traffic: 89.1. Rush Hour: 21.0
City: 4.0. Total Amount of Traffic: 60.5. Rush Hour: 2.0
City: 4.0. Total Amount of Traffic: 50.6. Rush Hour: 10.0
City: 3.0. Total Amount of Traffic: 44.2. Rush Hour: 10.0
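
The values print as `1.0` and `23.0` because `iterrows()` returns each row as a single Series, so the integer columns are upcast to float to share one dtype with `Traffic`. If the integers should stay integers, `itertuples()` is one possible alternative, since it keeps each column's own type. A small sketch using the same sample data:

```python
import io
import pandas as pd

text = """23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2"""

df = pd.read_csv(io.StringIO(text), sep=';', names=['Rush', 'City', 'Traffic'])

lines = []
# itertuples() yields one namedtuple per row; columns are read as attributes.
for row in df.itertuples(index=False):
    lines.append(
        f"City: {row.City}. Total Amount of Traffic: {row.Traffic}. Rush Hour: {row.Rush}"
    )

print("\n".join(lines))
```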

With pandas you can group by City and sum Traffic:

groups = df.groupby('City')

print(groups['Traffic'].sum())

Result:

City
1    131.9
3     44.2
4    111.1
Name: Traffic, dtype: float64
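
By default `groupby` sorts the groups by key, which is why the cities appear as 1, 3, 4 rather than in the order shown in the question (1, 4, 3). Two possible ways to get that order, sketched on the same sample data: sort the totals in descending order, or pass `sort=False` to keep the order in which each city first appears. Which of the two the question intends is an assumption, since both give the same result here.

```python
import io
import pandas as pd

text = """23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2"""

df = pd.read_csv(io.StringIO(text), sep=';', names=['Rush', 'City', 'Traffic'])

# Option 1: largest total first.
by_total = df.groupby('City')['Traffic'].sum().sort_values(ascending=False)
print(by_total)

# Option 2: keep the order in which each city first appears in the file.
by_appearance = df.groupby('City', sort=False)['Traffic'].sum()
print(by_appearance)
```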

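On a grouped object you can run a different function on each column: sum for Traffic and min for Rush. Note that min picks the smallest hour in each group; it matches the hour with the most traffic only by coincidence in this sample data.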

new_df = groups.agg({'Traffic': 'sum', 'Rush': 'min'})
new_df = new_df.reset_index()

print(new_df)

Result:

   City  Traffic  Rush
0     1    131.9    21
1     3     44.2    10
2     4    111.1     2
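
If the rush hour should be the hour of the row with the largest Traffic in each city, as the question describes, one possible approach is `idxmax()`, which returns the index label of the maximum value per group. The following is a sketch under that reading of the question, not the method used above; the variable names `busiest`, `totals`, and `result` are made up for the example.

```python
import io
import pandas as pd

text = """23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2"""

df = pd.read_csv(io.StringIO(text), sep=';', names=['Rush', 'City', 'Traffic'])

# Index label of the busiest row in each city, then look up those rows.
busiest_rows = df.groupby('City')['Traffic'].idxmax()
busiest = df.loc[busiest_rows, ['City', 'Rush']]

# Total traffic per city, with City kept as a regular column.
totals = df.groupby('City', as_index=False)['Traffic'].sum()

# Combine the totals with the hour of the busiest row.
result = totals.merge(busiest, on='City')
print(result)
```

On this sample the result has the same Rush values as the `min` version, but it would still pick the right hour if the busiest hour were not the smallest one.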

Minimal working code.

I only use io.StringIO in read_csv() to simulate a file in memory; you should use read_csv('city_traffic.csv', ...).

text ='''23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2'''

import pandas as pd
import io

#df = pd.read_csv('city_traffic.csv', sep=';', names=['Rush', 'City', 'Traffic'])
df = pd.read_csv(io.StringIO(text), sep=';', names=['Rush', 'City', 'Traffic'])

print(df)
print('---')

print(df[ df['City'] == 1 ])
print('---')

for index, row in df.iterrows():
    print(f"City: {row['City']}. Total Amount of Traffic: {row['Traffic']}. Rush Hour: {row['Rush']}")
print('---')


groups = df.groupby('City')

print(groups['Traffic'].sum())
print('---')


new_df = groups.agg({'Traffic': 'sum', 'Rush': 'min'})
new_df = new_df.reset_index()
print(new_df)
print('---')

# Groups are sorted by City (1, 3, 4), so the names below are assigned in that order.
#new_df['City'] = new_df['City'].replace({1:'Berlin', 3:'Paris', 4:'Roma'})
new_df['City'] = ['Berlin', 'Paris', 'Roma']
print(new_df)
print('---')

for index, row in new_df.iterrows():
    print(f"City: {row['City']:6} | Total Amount of Traffic: {row['Traffic']:6.2f} | Rush Hour: {row['Rush']:2}")
print('---')

Result:

   Rush  City  Traffic
0    23     1     42.8
1    21     1     89.1
2     2     4     60.5
3    10     4     50.6
4    10     3     44.2
---
   Rush  City  Traffic
0    23     1     42.8
1    21     1     89.1
---
City: 1.0. Total Amount of Traffic: 42.8. Rush Hour: 23.0
City: 1.0. Total Amount of Traffic: 89.1. Rush Hour: 21.0
City: 4.0. Total Amount of Traffic: 60.5. Rush Hour: 2.0
City: 4.0. Total Amount of Traffic: 50.6. Rush Hour: 10.0
City: 3.0. Total Amount of Traffic: 44.2. Rush Hour: 10.0
---
City
1    131.9
3     44.2
4    111.1
Name: Traffic, dtype: float64
---
   City  Traffic  Rush
0     1    131.9    21
1     3     44.2    10
2     4    111.1     2
---
     City  Traffic  Rush
0  Berlin    131.9    21
1   Paris     44.2    10
2    Roma    111.1     2
---
City: Berlin | Total Amount of Traffic: 131.90 | Rush Hour: 21
City: Paris  | Total Amount of Traffic:  44.20 | Rush Hour: 10
City: Roma   | Total Amount of Traffic: 111.10 | Rush Hour:  2
---
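
For comparison, the same summary can be built without pandas, staying close to the csv-based code in the question. This is only a sketch: the function name `summarize` and the dictionary keys are made up for the example, and the rush hour is taken to be the hour of the row with the largest traffic value for each city.

```python
import csv
import io

text = """23;1;42.8
21;1;89.1
2;4;60.5
10;4;50.6
10;3;44.2"""

def summarize(rows):
    """Return {city: {'total': ..., 'rush': ..., 'peak': ...}} from (hour, city, traffic) rows."""
    summary = {}
    for hour_s, city_s, traffic_s in rows:
        hour, city, traffic = int(hour_s), int(city_s), float(traffic_s)
        if city not in summary:
            # First row seen for this city: it is the busiest one so far.
            summary[city] = {'total': 0.0, 'rush': hour, 'peak': traffic}
        entry = summary[city]
        entry['total'] += traffic
        if traffic > entry['peak']:
            entry['rush'], entry['peak'] = hour, traffic
    return summary

# With a real file, replace io.StringIO(text) with open("city_traffic.csv").
reader = csv.reader(io.StringIO(text), delimiter=';')
summary = summarize(reader)

for city, entry in summary.items():
    print(f"City: {city} Total Amount of Traffic: {entry['total']:.1f} Rush Hour: {entry['rush']}")
```

Because dictionaries keep insertion order, the cities print in the order they first appear in the file (1, 4, 3).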
