pandas - Count and assign categories based on majority voting
Problem description
I have a pandas dataframe in the below format:
Class Category
XYZ ABC
XYZ ABC
XYZ DEF
XYZ1 ABC
XYZ1 ABC
XYZ1 ABC
XYZ1 HLR
XYZ2 ABC
For every unique class that has duplicate rows, I want to assign a category by "majority voting". For example, for "XYZ" the category should be "ABC". For "XYZ1" the category should also be "ABC", since "HLR" appears only once. If there are no discrepancies, it is straightforward (for "XYZ2" it would be "ABC").
I am wondering whether there is a way to achieve this without storing the value counts in a table and then looping over it to group by class and assign categories by majority vote.
Any leads would be appreciated.
Solution
Try using mode:
from statistics import mode
# Select the Category column so transform returns a Series aligned with df
df['New_Categroy'] = df.groupby('Class')['Category'].transform(mode)
Output:
Class Category New_Categroy
0 XYZ ABC ABC
1 XYZ ABC ABC
2 XYZ DEF ABC
3 XYZ1 ABC ABC
4 XYZ1 ABC ABC
5 XYZ1 ABC ABC
6 XYZ1 HLR ABC
7 XYZ2 ABC ABC
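For reference, the same result can be sketched with pandas alone, which also makes the tie behaviour explicit. This is a minimal sketch and not part of the original answer: the helper name majority and the variable per_class are hypothetical, and it assumes that picking the first value returned by Series.mode() (which sorts the tied values) is an acceptable tie-break. Note that statistics.mode raises StatisticsError on ties before Python 3.8 and returns the first mode encountered from 3.8 onward.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Class": ["XYZ", "XYZ", "XYZ", "XYZ1", "XYZ1", "XYZ1", "XYZ1", "XYZ2"],
    "Category": ["ABC", "ABC", "DEF", "ABC", "ABC", "ABC", "HLR", "ABC"],
})

def majority(s: pd.Series):
    # Hypothetical helper: Series.mode() returns every most-frequent value,
    # sorted; taking the first one is an assumed tie-breaking rule.
    return s.mode().iat[0]

# Broadcast the majority category back onto every row of its class
df["New_Categroy"] = df.groupby("Class")["Category"].transform(majority)

# Alternatively, reduce to one row per class
per_class = df.groupby("Class")["Category"].agg(majority)

print(df)
print(per_class)
```

Use transform when each original row should carry the majority label, and agg when a single Class-to-Category lookup is enough.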