python - How to sample a dataframe so that a column has the same distribution as in another dataframe
Problem description
Using Pandas:
I have a dataframe that has people in it like this:
          member_id  on_service  start_date    end_date  days_in_study  dod  \
12345678   12345678       False  2019-11-03  2020-05-31            210  NaT
23456789   23456789        True  2019-12-27  2020-05-31            156  NaT

         last_enrollment_date       RAF   Expense       Age  admits_in_range  \
12345678           2020-05-31  0.144511  0.042008  0.716981                0
23456789           2020-05-31  0.145709  0.033580  0.547170                0
I am comparing the on_service group with the group that is not on service.
I would like to sample the not-on-service group so that it has the same Age distribution as the on_service group.
I have tried
weights = on_service_members["Age"]
df = no_on_service_members.sample(weights = weights)
But I am getting the error "Invalid weights: weights sum to zero".
I think it is because it is not using the Age column to look up the weight? Or perhaps I am completely on the wrong track.
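A likely explanation for the error, shown as a minimal sketch: when `weights` is a Series, `sample` aligns it with the sampled frame by index label, not by the values of `Age`. The two groups contain different members, so no labels match, every weight becomes missing and is treated as zero. The two tiny frames and their index values below are made up for illustration.

```python
import pandas as pd

# Made-up stand-ins for the two groups; the index plays the role of member_id.
on_service = pd.DataFrame({"Age": [0.7, 0.5]}, index=[1, 2])
not_on_service = pd.DataFrame({"Age": [0.6, 0.4, 0.3]}, index=[10, 11, 12])

weights = on_service["Age"]

# This mirrors what sample() does with a Series of weights: align on the index.
# No label is shared between the groups, so every aligned weight is NaN.
aligned = weights.reindex(not_on_service.index)

try:
    not_on_service.sample(weights=weights)
    error_message = None
except ValueError as exc:
    # NaN weights are replaced by 0, so the weights sum to zero.
    error_message = str(exc)

print(aligned)
print(error_message)
```

So the weights have to be computed for the rows of the frame being sampled, for example from the bin each row's Age falls into, rather than taken from the other frame.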
Solution
I believe I have found a solution, but this seems like something that should be part of a standard library that I am missing.
import pandas as pd

def sample_with_distribution(source_of_distribution, source_to_sample, column_name):
    # Draw as many rows in total as the reference frame has, bin by bin.
    size_to_sample = len(source_of_distribution)
    # Share of reference rows that fall into each of 8 equal-width bins.
    bins = source_of_distribution[column_name].value_counts(bins=8, normalize=True)
    samples = []
    # Series.iteritems() was removed in pandas 2.0; items() works in old and new versions.
    for iv, bin_size in bins.items():
        m = source_to_sample[(source_to_sample[column_name] > iv.left) & (source_to_sample[column_name] <= iv.right)]
        # round() guards against float error such as 0.29 * 100 == 28.999...
        how_many = int(round(bin_size * size_to_sample))
        if how_many > len(m):
            print("ISSUE: How many we want ", how_many, " How big is it ", len(m))
            how_many = len(m)
        samples.append(m.sample(n=how_many, random_state=100))
    # DataFrame.append() was removed in pandas 2.0; pd.concat() replaces it.
    return pd.concat(samples)
It does seem to work. When I run a t-test and plot the KDEs, it looks like I got what I wanted.
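For comparison, the same binning idea can be expressed through the `weights` argument that the question tried first. This is only a sketch under assumptions, not the method used above: `sample_by_bin_weights` is a hypothetical helper name, the column is assumed to have no missing values, and the demo data is randomly generated. Each candidate row gets the weight (share of reference rows in its bin) / (number of candidate rows in that bin), so each bin is drawn roughly in proportion to the reference distribution. Because sampling is without replacement, the match is approximate.

```python
import numpy as np
import pandas as pd

def sample_by_bin_weights(target, pool, column, n, bins=8, random_state=100):
    """Hypothetical helper: draw n rows from pool so that column roughly follows target."""
    # Bin edges come from the reference (target) column and are shared by both frames.
    edges = np.histogram_bin_edges(target[column].to_numpy(), bins=bins)
    target_counts, _ = np.histogram(target[column].to_numpy(), bins=edges)
    values = pool[column].to_numpy()
    pool_counts, _ = np.histogram(values, bins=edges)

    # Weight per row in a bin = target share of the bin / number of pool rows in it.
    target_share = target_counts / target_counts.sum()
    per_row = np.divide(target_share, pool_counts,
                        out=np.zeros(len(target_share)), where=pool_counts > 0)

    # Bin index of each pool row; rows outside the target's range get weight 0.
    idx = np.digitize(values, edges[1:-1])
    in_range = (values >= edges[0]) & (values <= edges[-1])
    weights = pd.Series(np.where(in_range, per_row[idx], 0.0), index=pool.index)

    # sample() raises an error if fewer than n rows have a non-zero weight.
    return pool.sample(n=n, weights=weights, random_state=random_state)

# Made-up demo data: the reference group is older than the pool on average.
rng = np.random.default_rng(0)
on_service = pd.DataFrame({"Age": rng.uniform(0.5, 1.0, size=200)})
not_on_service = pd.DataFrame({"Age": rng.uniform(0.0, 1.0, size=5000)})

matched = sample_by_bin_weights(on_service, not_on_service, "Age", n=len(on_service))
print(matched["Age"].describe())
```

Unlike the per-bin loop, this version always returns exactly `n` rows when enough candidates have a non-zero weight, at the cost of bin counts that vary slightly from draw to draw.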