Pandas equivalent of R's ntile()

Problem description

I am learning pandas and R at the same time and would like to know whether there is a way to do the following in pandas.

library(dplyr)  # ntile() comes from dplyr
y = c(3,2,2,NA,30,4)
ntile(y, n=2) # 1  1  1 NA  2  2

Pandas
import numpy as np
import pandas as pd
y = pd.Series((3,2,2,np.nan,30,4))
# ??

Explanation:
From: (3,2,2,NA,30,4)
To:   1  1  1 np.nan  2  2
Logic: the three smallest values (3, 2, 2) fall in the lower half and are assigned rank 1;
       the two largest values (30, 4) fall in the upper half and are assigned rank 2.

**Required Output**
array([ 1.,  1.,  1., nan,  2.,  2.])

Tags: python, r, pandas, numpy

Solution


Try:

pd.qcut(y, q=2)

0    (1.999, 3.0]
1    (1.999, 3.0]
2    (1.999, 3.0]
3             NaN
4     (3.0, 30.0]
5     (3.0, 30.0]
dtype: category
Categories (2, interval[float64]): [(1.999, 3.0] < (3.0, 30.0]]

If you want numeric labels:

cuts = 2
pd.qcut(y,q=cuts, labels=range(1, cuts+1))

0    1.0
1    1.0
2    1.0
3    NaN
4    2.0
5    2.0
dtype: category
Categories (2, int64): [1 < 2]
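
The result above is a categorical Series, while the question asks for a plain array containing nan. The sketch below shows two ways one might get there. It also covers a point the answer does not address: pd.qcut bins by value, so tied values always land in the same bin and heavily tied data can produce duplicate bin edges, whereas ntile() buckets by position in the sorted order. Ranking with method="first" before binning is one way to approximate that behaviour. This is an assumption about how to mimic ntile(), not a verified one-to-one port, and bucket sizes may differ from dplyr for some inputs. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

y = pd.Series((3, 2, 2, np.nan, 30, 4))
cuts = 2

# Option 1: labels=False returns 0-based bin codes as floats (NaN is kept),
# so adding 1 gives 1-based bucket numbers.
by_value = (pd.qcut(y, q=cuts, labels=False) + 1).to_numpy()

# Option 2: rank first so ties get distinct positions (NaN stays NaN),
# then bin the ranks instead of the raw values.
ranks = y.rank(method="first")
buckets = pd.qcut(ranks, q=cuts, labels=range(1, cuts + 1))

# Convert the categorical result to a plain float array; NaN is preserved.
by_rank = buckets.astype(float).to_numpy()

print(by_value)
print(by_rank)
```

For this particular input both options give 1, 1, 1, nan, 2, 2. They can diverge when many values are tied around a quantile boundary.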
