How to return elements from pandas.value_counts()

Question

Suppose I have the following code:

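import pandas as pd

# Float values, matching the float index (3.0, 4.0, ...) in the output below
y = pd.DataFrame([3.0, 1.0, 2.0, 3.0, 4.0], columns=['TARGET'])
y['TARGET'].value_counts()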

Output:

3.0    2
4.0    1
2.0    1
1.0    1
Name: TARGET, dtype: int64

How can I return the individual elements of the output above (that is, the counts 2, 1, 1, 1)?

When I try the code below:

y['TARGET'].value_counts()[0]

I get the following error message:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

KeyError: 0.0

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-59-63137bfef4a6> in <module>
----> 1 index['TARGET'].value_counts()[0]

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    869         key = com.apply_if_callable(key, self)
    870         try:
--> 871             result = self.index.get_value(self, key)
    872 
    873             if not is_scalar(result):

~\Anaconda3\lib\site-packages\pandas\core\indexes\numeric.py in get_value(self, series, key)
    447 
    448         k = com.values_from_object(key)
--> 449         loc = self.get_loc(k)
    450         new_values = com.values_from_object(series)[loc]
    451 

~\Anaconda3\lib\site-packages\pandas\core\indexes\numeric.py in get_loc(self, key, method, tolerance)
    506         except (TypeError, NotImplementedError):
    507             pass
--> 508         return super().get_loc(key, method=method, tolerance=tolerance)
    509 
    510     @cache_readonly

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

KeyError: 0.0

Why does this happen?

When I try:

y['TARGET'].value_counts()[1]

or

y['TARGET'].value_counts()[2]

and so on, a value is returned, but the order of the elements looks mixed up. Does anyone know why?

Tags: python, pandas, series

Answer


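value_counts() returns a Series whose index holds the distinct values of the column (here 3.0, 4.0, 2.0, 1.0), not the positions 0, 1, 2, 3. With a numeric index, s[0] is treated as a lookup of the label 0, which does not exist, hence the KeyError: 0.0. Likewise s[1] and s[2] return the counts for the values 1 and 2, not the second and third rows, which is why the order looks mixed up.

To select from the Series by position, use Series.iat or Series.iloc: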

s = y['TARGET'].value_counts()
print(s.iat[0])
2
print(s.iloc[0])
2
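
Since the question asks for all of the counts (2, 1, 1, 1), not only the first one, here is a minimal sketch of how they could be pulled out at once. It assumes the column holds floats, as the output in the question suggests; the variable names counts, first_two and values are only illustrative.

```python
import pandas as pd

# Assumed data: floats, matching the 3.0 / 4.0 / ... labels in the question
y = pd.DataFrame([3.0, 1.0, 2.0, 3.0, 4.0], columns=["TARGET"])
s = y["TARGET"].value_counts()

# All counts in display order (most frequent first), as plain Python ints
counts = s.tolist()

# A positional slice: the counts in the first two rows
first_two = s.iloc[:2].tolist()

# The distinct values the counts belong to (the index of the Series)
values = s.index.tolist()

print(counts)      # [2, 1, 1, 1]
print(first_two)   # [2, 1]
print(values[0])   # 3.0, the most frequent value
```

The order of the three values that each occur once may differ between pandas versions, so only the first entry of values is relied on here.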

To select by label (here 3 is the label of the first value), use Series.at or Series.loc:

print(s.at[3])
2

print(s.loc[3])
2

Plain indexing with [] works the same way, by label:

print(s[3])
2
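
To make the label-versus-position distinction concrete, the following sketch reproduces the situation from the question. It again assumes float data, and the names lookup_failed, count_of_one and first_row are hypothetical helpers for the demonstration, not part of pandas.

```python
import pandas as pd

# Assumed data: floats, matching the KeyError: 0.0 in the traceback
y = pd.DataFrame([3.0, 1.0, 2.0, 3.0, 4.0], columns=["TARGET"])
s = y["TARGET"].value_counts()

# The index contains the distinct values, so 0 is not a valid label
print(0 in s.index)   # False

# s[0] is a label lookup on this float index and raises KeyError
try:
    s[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# s[1] succeeds, but it answers "how often does the value 1 occur?"
count_of_one = s[1]

# iloc is purely positional: the first row is the most frequent value
first_row = s.iloc[0]

print(lookup_failed)  # True
print(count_of_one)   # 1
print(first_row)      # 2
```

Using .loc or .at for labels and .iloc or .iat for positions keeps the intent explicit and avoids depending on how [] interprets an integer key.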
