python - Organizing data from a CSV file (structured data)
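Problem description
I have been trying to extract data from this CSV file and organize it so that I can view it more clearly. The goal is to create two dictionaries: one holding the data for the regions listed in the CSV, and another holding the data for the countries. I am having trouble looping over the data. The CSV file starts by listing all the regions; the countries only begin once the "ID" column reaches 4. This is what I have so far, but I still need help organizing it by region and by country. The link to the CSV file is: https://docs.google.com/document/d/1v68_QQX7Tn96l-b0LMO9YZ4ZAn_KWDMUJboa6LEyPr8/edit?usp=sharing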
import csv
f = open('dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv')
reader = csv.DictReader(f)
data_by_region = {}
data_by_country = {}
answers = []
for line in reader:
    # Collects all the region names
    regions = line['Region/Country/Area']
    # Gets All the Years
    years = line['Year']
    # print(regions)
    if regions not in data_by_region:
        data_by_region[regions] = {}
Solution
Maybe this will help:
import csv
f = open('dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv', encoding='utf-8-sig')
reader = csv.DictReader(f)
data_by_region = {}
data_by_country = {}
answers = []
for line in reader:
    # Collects all the region names
    regions = line['Region/Country/Area']
    # Gets All the Years
    years = line['Year']
    # print(regions)
    if regions not in data_by_region:
        data_by_region[regions] = [line]
    else:
        data_by_region[regions].append(line)
# print data count group by regions.
for region, data_list in data_by_region.items():
    print('{:>30s}: {} rows.'.format(region, len(data_list)))
Output:
 Total, all countries or areas: 21 rows.
                        Africa: 18 rows.
               Northern Africa: 21 rows.
            Sub-Saharan Africa: 21 rows.
                Eastern Africa: 18 rows.
                 Middle Africa: 18 rows.
               Southern Africa: 18 rows.
                Western Africa: 18 rows.
              Northern America: 18 rows.
...
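The code above groups every row by name but does not yet separate regions from countries. Going by the question's description (regions come first, and the countries start once the "ID" column reaches 4), one possible way to fill both dictionaries is sketched below. The column names "ID", "Series" and "Value", the helper name split_regions_and_countries, and the sample rows with their placeholder values are assumptions made for illustration; only 'Region/Country/Area' and 'Year' are confirmed by the code above, so adjust the names to the real header.

```python
import csv
import io

# Tiny stand-in for the real file. The "ID", "Series" and "Value" column
# names are assumptions, and the values are made-up placeholders.
SAMPLE = """ID,Region/Country/Area,Year,Series,Value
1,"Total, all countries or areas",2010,demo,1
2,Africa,2010,demo,2
15,Northern Africa,2010,demo,3
4,Afghanistan,2010,demo,4
8,Albania,2010,demo,5
8,Albania,2015,demo,6
"""


def split_regions_and_countries(rows, first_country_id="4"):
    """Put rows seen before the first `first_country_id` into the region
    dict and every row from that point on into the country dict."""
    data_by_region = {}
    data_by_country = {}
    in_countries = False
    for row in rows:
        # Assumption: the country section starts at the first row whose
        # ID equals first_country_id, as described in the question.
        if row["ID"].strip() == first_country_id:
            in_countries = True
        target = data_by_country if in_countries else data_by_region
        target.setdefault(row["Region/Country/Area"], []).append(row)
    return data_by_region, data_by_country


reader = csv.DictReader(io.StringIO(SAMPLE))
data_by_region, data_by_country = split_regions_and_countries(reader)

for name, rows in data_by_region.items():
    print("region ", name, len(rows))
for name, rows in data_by_country.items():
    print("country", name, len(rows))
```

With the real file, the same function could be called on the csv.DictReader created in the answer. This relies on the file order being regions first and countries afterwards; if the IDs in the file are zero-padded or the order differs, the boundary test would need to change.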