python - 将一系列值设置为条形图的变量 - python
问题描述
在我的数据中,我有一列显示以下选项之一:'NOT_TESTED'
、'NOT_COMPLETED'
、'TOO_LOW'
或介于 5 步之间的值150
(190
如 150、155、160 等)。
我正在尝试绘制一个条形图,它显示每个出现在列中的时间量,包括每个单独的数字。
所以条形图应该在 x 轴上有变量:'NOT_TESTED'
, 'NOT_COMPLETED'
, 'TOO_LOW'
, 150
, 155
,160
等等。
棒的高度应该是它在列中出现的次数。
这是我尝试过的代码,它让我最接近我的目标,但是,所有数字(150-190)都输出 1 作为条形图的值,所以所有的棒都在相同的高度。
这不符合数据,我不知道如何前进。
我是新手,任何指导将不胜感激!
num_range = list(range(150,191, 5))
OUTCOMES = ['NOT_TESTED', 'NOT_COMPLETED', 'TOO_LOW']
OUTCOMES.extend(num_range)
df = df.append(pd.DataFrame(num_range,
columns=['PT1']),
ignore_index = True)
df["outcomes_col"] = df["PT1"].astype ("category")
df["outcomes_col"].cat.set_categories(OUTCOMES , inplace = True )
sns.countplot(x= "outcomes_col", data=df, palette='Magma')
plt.xticks(rotation = 90)
plt.ylabel('Amount')
plt.xlabel('Outcomes')
plt.title("Outcomes per Testing")
plt.show()
pd.DataFrame({'ID': {0: 'GF342', 1: 'IF874', 2: 'FH386', 3: 'KJ190', 4: 'TY748', 5: 'YT947', 6: 'DF063', 7: 'ET512', 8: 'GC714', 9: 'SD978', 10: 'EF472', 11: 'PL489', 12: 'AZ315', 13: 'OL821', 14: 'HN765', 15: 'ED589'}, 'Location': {0: 'Q1', 1: 'Q3', 2: 'Q1', 3: 'Q3', 4: 'Q3', 5: 'Q4', 6: 'Q3', 7: 'Q1', 8: 'Q2', 9: 'Q3', 10: 'Q1', 11: 'Q2', 12: 'Q1', 13: 'Q1', 14: 'Q3', 15: 'Q1'}, 'NEW': {0: 'YES', 1: 'NO', 2: 'NO', 3: 'YES', 4: 'YES', 5: 'NO', 6: 'NO', 7: 'YES', 8: 'NO', 9: 'NO', 10: 'NO', 11: 'YES', 12: 'NO', 13: 'YES', 14: 'YES', 15: 'YES'}, 'YEAR': {0: 2021, 1: 2018, 2: 2019, 3: 2021, 4: 2021, 5: 2019, 6: 2019, 7: 2021, 8: 2018, 9: 2019, 10: 2018, 11: 2021, 12: 2018, 13: 2021, 14: 2021, 15: 2021}, 'PT1': {0: '', 1: 'NOT_TESTED', 2: '', 3: 'NOT_FINISHED', 4: '165', 5: '', 6: '180', 7: '145', 8: '155', 9: '', 10: '', 11: '', 12: 'TOO_LOW', 13: '150', 14: '155', 15: ''}, 'PT2': {0: '', 1: '', 2: '', 3: '', 4: '', 5: 'TOO_LOW', 6: '', 7: '', 8: '160', 9: 'TOO_LOW', 10: '', 11: '', 12: '', 13: '', 14: '', 15: ''}, 'PT3': {0: '', 1: 'TOO_LOW', 2: '', 3: 'TOO_LOW', 4: '', 5: '', 6: '', 7: '', 8: '', 9: '', 10: '', 11: 'NOT_FINISHED', 12: '', 13: '185', 14: '', 15: '165'}, 'PT4': {0: '', 1: '', 2: '', 3: '', 4: '', 5: 165.0, 6: '', 7: '', 8: '', 9: '', 10: '', 11: '', 12: 180.0, 13: '', 14: '', 15: ''}})
这不是整个数据集,只是其中的一部分。
解决方案
从此数据框开始:(
我替换NOT_FINISHED
为NOT_COMPLETED
, 符合您问题中的代码,让我知道此替换是否正确)
ID Location NEW YEAR PT1 PT2 PT3 PT4
0 GF342 Q1 YES 2021
1 IF874 Q3 NO 2018 NOT_TESTED TOO_LOW
2 FH386 Q1 NO 2019
3 KJ190 Q3 YES 2021 NOT_COMPLETED TOO_LOW
4 TY748 Q3 YES 2021 165
5 YT947 Q4 NO 2019 TOO_LOW 165
6 DF063 Q3 NO 2019 180
7 ET512 Q1 YES 2021 145
8 GC714 Q2 NO 2018 155 160
9 SD978 Q3 NO 2019 TOO_LOW
10 EF472 Q1 NO 2018
11 PL489 Q2 YES 2021 NOT_COMPLETED
12 AZ315 Q1 NO 2018 TOO_LOW 180
13 OL821 Q1 YES 2021 150 185
14 HN765 Q3 YES 2021 155
15 ED589 Q1 YES 2021 165
如果您对'PT1'
列的计数图感兴趣,首先您必须定义要绘制的类别。您可以使用pandas.CategoricalDtype
,因此您可以对这些类别进行排序。
因此,您定义了一个新'outcomes_col'
列:
num_range = list(range(150,191, 5))
OUTCOMES = ['NOT_TESTED', 'NOT_COMPLETED', 'TOO_LOW']
OUTCOMES.extend([str(num) for num in num_range])
OUTCOMES = CategoricalDtype(OUTCOMES, ordered = True)
df["outcomes_col"] = df["PT1"].astype(OUTCOMES)
然后您可以继续绘制此列:
sns.countplot(x= "outcomes_col", data=df, palette='Magma')
plt.xticks(rotation = 90)
plt.ylabel('Amount')
plt.xlabel('Outcomes')
plt.title("Outcomes per Testing")
plt.show()
完整代码
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pandas.api.types import CategoricalDtype
df = pd.DataFrame({'ID': {0: 'GF342', 1: 'IF874', 2: 'FH386', 3: 'KJ190', 4: 'TY748', 5: 'YT947', 6: 'DF063', 7: 'ET512', 8: 'GC714', 9: 'SD978', 10: 'EF472', 11: 'PL489', 12: 'AZ315', 13: 'OL821', 14: 'HN765', 15: 'ED589'}, 'Location': {0: 'Q1', 1: 'Q3', 2: 'Q1', 3: 'Q3', 4: 'Q3', 5: 'Q4', 6: 'Q3', 7: 'Q1', 8: 'Q2', 9: 'Q3', 10: 'Q1', 11: 'Q2', 12: 'Q1', 13: 'Q1', 14: 'Q3', 15: 'Q1'}, 'NEW': {0: 'YES', 1: 'NO', 2: 'NO', 3: 'YES', 4: 'YES', 5: 'NO', 6: 'NO', 7: 'YES', 8: 'NO', 9: 'NO', 10: 'NO', 11: 'YES', 12: 'NO', 13: 'YES', 14: 'YES', 15: 'YES'}, 'YEAR': {0: 2021, 1: 2018, 2: 2019, 3: 2021, 4: 2021, 5: 2019, 6: 2019, 7: 2021, 8: 2018, 9: 2019, 10: 2018, 11: 2021, 12: 2018, 13: 2021, 14: 2021, 15: 2021}, 'PT1': {0: '', 1: 'NOT_TESTED', 2: '', 3: 'NOT_COMPLETED', 4: '165', 5: '', 6: '180', 7: '145', 8: '155', 9: '', 10: '', 11: '', 12: 'TOO_LOW', 13: '150', 14: '155', 15: ''}, 'PT2': {0: '', 1: '', 2: '', 3: '', 4: '', 5: 'TOO_LOW', 6: '', 7: '', 8: '160', 9: 'TOO_LOW', 10: '', 11: '', 12: '', 13: '', 14: '', 15: ''}, 'PT3': {0: '', 1: 'TOO_LOW', 2: '', 3: 'TOO_LOW', 4: '', 5: '', 6: '', 7: '', 8: '', 9: '', 10: '', 11: 'NOT_COMPLETED', 12: '', 13: '185', 14: '', 15: '165'}, 'PT4': {0: '', 1: '', 2: '', 3: '', 4: '', 5: 165.0, 6: '', 7: '', 8: '', 9: '', 10: '', 11: '', 12: 180.0, 13: '', 14: '', 15: ''}})
num_range = list(range(150,191, 5))
OUTCOMES = ['NOT_TESTED', 'NOT_COMPLETED', 'TOO_LOW']
OUTCOMES.extend([str(num) for num in num_range])
OUTCOMES = CategoricalDtype(OUTCOMES, ordered = True)
df["outcomes_col"] = df["PT1"].astype(OUTCOMES)
sns.countplot(x= "outcomes_col", data=df, palette='Magma')
plt.xticks(rotation = 90)
plt.ylabel('Amount')
plt.xlabel('Outcomes')
plt.title("Outcomes per Testing")
plt.show()
推荐阅读
- python - 在哪里可以找到 PyQt5 驱动程序?
- wear-os - WearOS - 分发 - 如何准备我的应用程序以选择加入 wearOS?
- plotly-python - 同一 x 轴上的多个图:make_subplots - 跟踪错误
- rest - REST API 资源是否应该根据查询参数进行授权?
- excel - 结果未计入excel vba电子表格
- tomcat - Jython Tomcat IntitialContext 无法查找上下文 JDBC JNDI 连接
- jq - 使用方法 | 添加和保留具有不同值的重复键,而是将它们添加到数组中
- javascript - 如何运行多个 Promise [键入针对 es5 的脚本]
- api - 如何通过 API Woocommerce 订阅更新获取?
- mongodb - 如何在满足给定条件的mongodb中创建帐户集合