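python - Summing a pandas column by date and condition gives mismatched results
Problem description
I have a pandas DataFrame with two columns: date and sentiment. I need to group it by day and count the rows of each sentiment type (positive, neutral, negative).
Original DataFrame:
After running my code, the daily total does not match the sum of the individual sentiment counts: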
df_diario = df_com_sentiment.groupby( df_com_sentiment.date.dt.floor('d')).size().reset_index(name='n_tweets')
df_diario['TB_POSITIVE'] = df_com_sentiment.groupby( df_com_sentiment[df_com_sentiment['TextBlob_sentiment_type']=='POSITIVE'].date.dt.floor('d')).size().reset_index(name='TB_POSITIVE').TB_POSITIVE.astype(int)
df_diario['TB_NEGATIVE'] = df_com_sentiment.groupby( df_com_sentiment[df_com_sentiment['TextBlob_sentiment_type']=='NEGATIVE'].date.dt.floor('d')).size().reset_index(name='TB_NEGATIVE').TB_NEGATIVE.astype(int)
df_diario['TB_NEUTRAL'] = df_com_sentiment.groupby( df_com_sentiment[df_com_sentiment['TextBlob_sentiment_type']=='NEUTRAL'].date.dt.floor('d')).size().reset_index(name='TB_NEUTRAL').TB_NEUTRAL.astype(int)
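Number of rows of each sentiment type per day:
If you look at the date 2020-02-15, the total is 12, but positive + negative + neutral == 14.
Solution
Are you looking for something like this: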
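import pandas as pd
import numpy as np

# Build 104 sample rows, one every 3 hours starting 2021-01-01, so the number
# of dates always matches the number of random sentiment labels.
df = pd.DataFrame({'date': pd.date_range(start='2021-01-01', periods=104, freq='3h'),
                   'sentiment': np.random.choice(['POSITIVE', 'NEGATIVE', 'NEUTRAL'], 104)})
# Count rows per (day, sentiment) pair, then pivot the sentiment level into columns.
df1 = df.groupby([df.date.dt.date, df.sentiment])['sentiment'].count()
df1 = df1.unstack()
print(df1)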
The output will look like this (the sentiment labels are random, so the counts vary between runs):
sentiment NEGATIVE NEUTRAL POSITIVE
date
2021-01-01 4.0 3.0 1.0
2021-01-02 2.0 2.0 4.0
2021-01-03 4.0 3.0 1.0
2021-01-04 3.0 2.0 3.0
2021-01-05 4.0 1.0 3.0
2021-01-06 1.0 3.0 4.0
2021-01-07 3.0 3.0 2.0
2021-01-08 4.0 3.0 1.0
2021-01-09 4.0 1.0 3.0
2021-01-10 2.0 2.0 4.0
2021-01-11 5.0 3.0 NaN
2021-01-12 3.0 2.0 3.0
2021-01-13 1.0 3.0 4.0
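The NaN for POSITIVE on 2021-01-11 means that day had no POSITIVE rows, and that missing value is also why the counts are shown as floats. The following is a minimal sketch, using a small hand-made frame (hypothetical data, not the asker's), of how passing fill_value=0 to unstack keeps the counts as integers and how a row sum can serve as the daily total, so the total agrees with the per-sentiment columns by construction. The column name n_tweets is borrowed from the question.

```python
import pandas as pd

# Hypothetical toy data: 2021-01-02 has no POSITIVE rows.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-01-01 00:00", "2021-01-01 03:00", "2021-01-01 06:00",
        "2021-01-02 00:00", "2021-01-02 03:00",
    ]),
    "sentiment": ["POSITIVE", "NEGATIVE", "NEUTRAL", "NEGATIVE", "NEGATIVE"],
})

# Count rows per (day, sentiment); missing combinations become 0 instead of NaN.
counts = (
    df.groupby([df["date"].dt.floor("d"), df["sentiment"]])
      .size()
      .unstack(fill_value=0)
)

# The daily total is the sum of the sentiment columns.
counts["n_tweets"] = counts.sum(axis=1)
print(counts)
```

Because the total is derived from the same table, it cannot drift away from the individual counts.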
The input DataFrame that produced the 2021-01-01 to 2021-01-13 table was:
date sentiment
0 2021-01-01 00:00:00 NEUTRAL
1 2021-01-01 03:00:00 NEGATIVE
2 2021-01-01 06:00:00 NEGATIVE
3 2021-01-01 09:00:00 NEGATIVE
4 2021-01-01 12:00:00 NEGATIVE
5 2021-01-01 15:00:00 NEUTRAL
6 2021-01-01 18:00:00 NEUTRAL
7 2021-01-01 21:00:00 POSITIVE
8 2021-01-02 00:00:00 POSITIVE
9 2021-01-02 03:00:00 POSITIVE
10 2021-01-02 06:00:00 POSITIVE
11 2021-01-02 09:00:00 NEUTRAL
12 2021-01-02 12:00:00 NEGATIVE
13 2021-01-02 15:00:00 POSITIVE
14 2021-01-02 18:00:00 NEGATIVE
15 2021-01-02 21:00:00 NEUTRAL
16 2021-01-03 00:00:00 NEUTRAL
17 2021-01-03 03:00:00 NEGATIVE
18 2021-01-03 06:00:00 NEGATIVE
19 2021-01-03 09:00:00 NEUTRAL
20 2021-01-03 12:00:00 POSITIVE
21 2021-01-03 15:00:00 NEGATIVE
22 2021-01-03 18:00:00 NEGATIVE
23 2021-01-03 21:00:00 NEUTRAL
24 2021-01-04 00:00:00 NEGATIVE
25 2021-01-04 03:00:00 POSITIVE
26 2021-01-04 06:00:00 NEGATIVE
27 2021-01-04 09:00:00 POSITIVE
28 2021-01-04 12:00:00 NEUTRAL
29 2021-01-04 15:00:00 NEUTRAL
30 2021-01-04 18:00:00 NEGATIVE
31 2021-01-04 21:00:00 POSITIVE
32 2021-01-05 00:00:00 NEGATIVE
33 2021-01-05 03:00:00 NEGATIVE
34 2021-01-05 06:00:00 NEGATIVE
35 2021-01-05 09:00:00 NEUTRAL
36 2021-01-05 12:00:00 POSITIVE
37 2021-01-05 15:00:00 POSITIVE
38 2021-01-05 18:00:00 NEGATIVE
39 2021-01-05 21:00:00 POSITIVE
40 2021-01-06 00:00:00 POSITIVE
41 2021-01-06 03:00:00 POSITIVE
42 2021-01-06 06:00:00 NEUTRAL
43 2021-01-06 09:00:00 POSITIVE
44 2021-01-06 12:00:00 NEUTRAL
45 2021-01-06 15:00:00 NEUTRAL
46 2021-01-06 18:00:00 NEGATIVE
47 2021-01-06 21:00:00 POSITIVE
48 2021-01-07 00:00:00 POSITIVE
49 2021-01-07 03:00:00 NEUTRAL
50 2021-01-07 06:00:00 NEGATIVE
51 2021-01-07 09:00:00 NEGATIVE
52 2021-01-07 12:00:00 NEGATIVE
53 2021-01-07 15:00:00 NEUTRAL
54 2021-01-07 18:00:00 POSITIVE
55 2021-01-07 21:00:00 NEUTRAL
56 2021-01-08 00:00:00 NEGATIVE
57 2021-01-08 03:00:00 NEGATIVE
58 2021-01-08 06:00:00 NEUTRAL
59 2021-01-08 09:00:00 NEUTRAL
60 2021-01-08 12:00:00 POSITIVE
61 2021-01-08 15:00:00 NEGATIVE
62 2021-01-08 18:00:00 NEUTRAL
63 2021-01-08 21:00:00 NEGATIVE
64 2021-01-09 00:00:00 NEGATIVE
65 2021-01-09 03:00:00 POSITIVE
66 2021-01-09 06:00:00 NEGATIVE
67 2021-01-09 09:00:00 POSITIVE
68 2021-01-09 12:00:00 NEGATIVE
69 2021-01-09 15:00:00 NEGATIVE
70 2021-01-09 18:00:00 NEUTRAL
71 2021-01-09 21:00:00 POSITIVE
72 2021-01-10 00:00:00 NEUTRAL
73 2021-01-10 03:00:00 POSITIVE
74 2021-01-10 06:00:00 POSITIVE
75 2021-01-10 09:00:00 NEGATIVE
76 2021-01-10 12:00:00 POSITIVE
77 2021-01-10 15:00:00 NEGATIVE
78 2021-01-10 18:00:00 NEUTRAL
79 2021-01-10 21:00:00 POSITIVE
80 2021-01-11 00:00:00 NEGATIVE
81 2021-01-11 03:00:00 NEGATIVE
82 2021-01-11 06:00:00 NEUTRAL
83 2021-01-11 09:00:00 NEUTRAL
84 2021-01-11 12:00:00 NEUTRAL
85 2021-01-11 15:00:00 NEGATIVE
86 2021-01-11 18:00:00 NEGATIVE
87 2021-01-11 21:00:00 NEGATIVE
88 2021-01-12 00:00:00 NEGATIVE
89 2021-01-12 03:00:00 NEUTRAL
90 2021-01-12 06:00:00 NEGATIVE
91 2021-01-12 09:00:00 POSITIVE
92 2021-01-12 12:00:00 POSITIVE
93 2021-01-12 15:00:00 NEGATIVE
94 2021-01-12 18:00:00 POSITIVE
95 2021-01-12 21:00:00 NEUTRAL
96 2021-01-13 00:00:00 NEUTRAL
97 2021-01-13 03:00:00 POSITIVE
98 2021-01-13 06:00:00 POSITIVE
99 2021-01-13 09:00:00 NEGATIVE
100 2021-01-13 12:00:00 NEUTRAL
101 2021-01-13 15:00:00 POSITIVE
102 2021-01-13 18:00:00 NEUTRAL
103 2021-01-13 21:00:00 POSITIVE
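As for why the question's code produced mismatched totals, one plausible explanation (an assumption, since the original data is not shown) is that each per-sentiment count is computed separately, its index is reset to 0, 1, 2, ..., and the column is then assigned to df_diario by that position. If any day has no rows of a given sentiment, that day is missing from the per-sentiment result and every later count shifts up by one row. The sketch below reproduces the effect on a small hypothetical frame and contrasts it with a lookup keyed on the date.

```python
import pandas as pd

# Hypothetical toy data: 2021-01-01 has no POSITIVE rows.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03"]),
    "sentiment": ["NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE"],
})

# Daily totals, one row per day with a fresh 0..n-1 index.
daily = df.groupby(df["date"].dt.floor("d")).size().reset_index(name="n_tweets")

# Pattern similar to the question: count POSITIVE rows, reset the index,
# then assign. Assignment aligns on the 0..k-1 index, not on the date.
pos = df[df["sentiment"] == "POSITIVE"]
pos_per_day = pos.groupby(pos["date"].dt.floor("d")).size()
daily["POSITIVE_misaligned"] = pos_per_day.reset_index(name="POSITIVE")["POSITIVE"]

# Looking the count up by date keeps each value on the correct day.
daily["POSITIVE"] = daily["date"].map(pos_per_day).fillna(0).astype(int)
print(daily)
```

In this toy frame the misaligned column credits 2021-01-01 with a POSITIVE row it does not have and leaves the last day empty, while the date-keyed column gives 0, 1, 1.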