How to find the frequency of authors and plot it with Python?

Problem description

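Here ABC News appears 5 times, but the Times column shows 1 for every row. The expected output lists ABC News once, in a single row, with a Times total of 5, because ABC News published 5 headlines in all.

The plot should then have Author on the X axis and the associated Times published on the Y axis.

The code that builds the following dataframe needs to be changed as described above: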

a=df1.groupby(['author','title'])['title'].count().reset_index(name="Time")
a.head()



    author                    title                               Time
0   ABC News    WATCH: How to get the most bang for your buck ...   1
1   ABC News    WATCH: Man who confessed to killing wife, chil...   1
2   ABC News    WATCH: Nearly 1,000 still missing 11 days afte...   1
3   ABC News    WATCH: Teen hockey player skates after brain i...   1
4   ABC News    WATCH: Trump: Will not do in-person interview ...   1
5   Ali Dukakis and Mike Levine     Mueller  'has no eff...         1

Tags: python, pandas, visualization

Solution


The following updates your Times column with the appropriate counts. You can optionally wrap the loop in a function for later reuse.

import pandas as pd

df = pd.DataFrame(
    data=[
        ['ABC News', 'WATCH: How to get the most bang for your buck...', '1'],
        ['ABC News', 'WATCH: Man who confessed to killing wife, chil...', '1'],
        ['ABC News', 'WATCH: Nearly 1,000 still missing 11 days afte...', '1'],
        ['ABC News', 'WATCH: Teen hockey player skates after brain i...', '1'],
        ['ABC News', 'WATCH: Trump: Will not do in-person interview ...', '1'],
        ['Ali Dukakis and Mike Levine', "Mueller  'has no eff...", '1'],
    ],
    columns=['author', 'title', 'Times'],
)

word_count = dict(df['author'].value_counts())
for i, v in df["author"].items():  # Series.iteritems() was removed in pandas 2.0
    if v in word_count.keys():
        df.loc[i, "Times"] = word_count[v]

print(df)

This gives the counts you want: every ABC News row now has Times equal to 5.
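For reuse, the loop above could also be written as a small function without explicit iteration. This is only a sketch: the function name `add_author_counts`, its parameters, and the placeholder titles are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def add_author_counts(frame, author_col="author", out_col="Times"):
    """Return a copy of frame where out_col holds the number of rows per author."""
    result = frame.copy()
    # value_counts() gives one count per author; map() looks that count up for every row
    result[out_col] = result[author_col].map(result[author_col].value_counts())
    return result


# Hypothetical sample data shaped like the question's dataframe
sample = pd.DataFrame({
    "author": ["ABC News"] * 5 + ["Ali Dukakis and Mike Levine"],
    "title": [f"headline {n}" for n in range(6)],  # placeholder titles
})

counted = add_author_counts(sample)
print(counted)
```

The result is the same as the loop (one count repeated on every row of that author), and the input dataframe is left unmodified.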

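I believe plotting author against Times should not be a problem now. If this meets your requirements, please accept the answer; otherwise let me know what does not work for you.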
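If one row per author is what you are after, as the question describes, a possible sketch is to group by author alone and then draw a bar chart. The dataframe name `df_demo`, the placeholder titles, and the axis labels below are assumptions for illustration; substitute your own dataframe.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data shaped like the question's dataframe
df_demo = pd.DataFrame({
    "author": ["ABC News"] * 5 + ["Ali Dukakis and Mike Levine"],
    "title": [f"headline {n}" for n in range(6)],  # placeholder titles
})

# Grouping by author only (not author and title) yields one row per author;
# size() counts the rows, i.e. the headlines, in each group
per_author = df_demo.groupby("author").size().reset_index(name="Times")
print(per_author)

# Author on the X axis, number of published headlines on the Y axis
ax = per_author.plot.bar(x="author", y="Times", legend=False)
ax.set_xlabel("Author")
ax.set_ylabel("Times published")
plt.tight_layout()
# Call plt.show() or ax.get_figure().savefig(...) to display or save the chart
```

Grouping by both author and title, as in the question's code, makes every group a single headline, which is why each row showed a count of 1.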

