python - calculate the median age for each region from frequency table with python
Problem description
I have a dataframe that is similar to:
I would like to calculate the median age for each city, but because this is a frequency table I'm finding it somewhat tricky. Is there a function in pandas or another library that would help me achieve this?
Solution
For each row, find the total number of people, i.e. the sum of all the counts in that row. Divide that total by 2, then walk through the age groups in order, keeping a running sum of the counts. The first age at which the running sum exceeds the halfway number is the median age.
For example, for the row 'alabama', you would add 34 + 67 + ... + 23 = 5463. Dividing by 2 gives 2731.5, which rounds down to 2731. Then, checking each age group in turn, determine where the 2731st person falls.
- At age 1, since 2731 > 34, check the next.
- At age 2, since 2731 > 34 + 67, check the next.
- At age 3, since 2731 > 34 + 67 + 89, check the next.
- At age 4, since 2731 > 34 + 67 + 89 + 89, check the next.
- At age 5, since 2731 > 34 + 67 + 89 + 89 + 67, check the next.
- At age 6, since 2731 > 34 + 67 + 89 + 89 + 67 + 545, check the next.
- At age 7, since 2731 < 34 + 67 + 89 + 89 + 67 + 545 + 4546, the median has to be in this age group.
Do this repeatedly for each city/state, and you should get the median for each one.
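The steps above can be sketched in pandas with a cumulative sum. This is only a minimal sketch under assumptions, since the original table is not shown here: it assumes one row per city/state, one column per age with the column labels in ascending order, and counts as the values. The function name `median_age_per_row` and the sample data are made up for illustration.

```python
import pandas as pd


def median_age_per_row(freq: pd.DataFrame) -> pd.Series:
    """Median age per row of a frequency table.

    Assumes (hypothetically) that each column label is an age, that the
    columns are sorted in ascending age order, and that values are counts.
    """
    # Running total of people up to and including each age.
    cumulative = freq.cumsum(axis=1)
    # Halfway point, rounded down (e.g. 5463 // 2 == 2731).
    half = freq.sum(axis=1) // 2
    # True where the running total has passed the halfway point.
    passed = cumulative.gt(half, axis=0)
    # idxmax returns the first column label that is True in each row.
    return passed.idxmax(axis=1)


# Made-up illustrative data, not the table from the question.
freq = pd.DataFrame(
    {1: [4, 1], 2: [3, 1], 3: [2, 5]},
    index=["city_a", "city_b"],
)
medians = median_age_per_row(freq)
print(medians)
```

When a row's total is even, this picks the upper of the two middle people rather than averaging them, which matches the "round down, then find the first group that exceeds it" rule described above.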