How do I drop DataFrame rows where the sentence column has fewer than 4 characters?

Problem description

Suppose I have tokenized sentences in my DataFrame like this:

+-----------------------------------------+-----------+
|                sentence                 | sentiment |
+-----------------------------------------+-----------+
| [i, like, this, app, it, s, awesome]    | positive  |
| [way, to, many, ads, pop, up, hate, it] | negative  |
| [ye]                                    | negative  |
| [p]                                     | positive  |
| [niceeeee]                              | positive  |
| [i, do, not, like, the, design]         | negative  |
| [very, useful, recommended]             | positive  |
| [ugly]                                  | negative  |
| [xxx]                                   | negative  |
| [yes]                                   | positive  |
+-----------------------------------------+-----------+

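I want to clean unnecessary data out of the DataFrame by dropping every row whose sentence column contains fewer than 4 characters in total, so that the final result looks like this: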

+-----------------------------------------+-----------+
|                sentence                 | sentiment |
+-----------------------------------------+-----------+
| [i, like, this, app, it, s, awesome]    | positive  |
| [way, to, many, ads, pop, up, hate, it] | negative  |
| [niceeeee]                              | positive  |
| [i, do, not, like, the, design]         | negative  |
| [very, useful, recommended]             | positive  |
| [ugly]                                  | negative  |
+-----------------------------------------+-----------+

Can anyone provide code that solves this? I would greatly appreciate the help, as it will support my thesis work. Thank you for your attention.

Tags: python, machine-learning, sentiment-analysis

Solution


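You can use the apply function for this: join each token list into a single string, measure its length, and keep only the rows that reach the limit.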

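char_limit = 4  # minimum total number of characters a row needs in order to be kept
# Join each row's tokens into one string and keep rows whose length meets the limit
df[df['sentence'].apply(lambda x: len("".join(x)) >= char_limit)]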
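
To check the filter against the sample data from the question, here is a minimal self-contained sketch. It assumes the sentence column holds Python lists of strings. The names total_chars, total_chars_vectorized, and filtered are illustrative and do not appear in the original answer, and the reset_index call is an optional extra step that only renumbers the remaining rows.

```python
import pandas as pd

# Sample data copied from the question: one token list per row.
df = pd.DataFrame({
    "sentence": [
        ["i", "like", "this", "app", "it", "s", "awesome"],
        ["way", "to", "many", "ads", "pop", "up", "hate", "it"],
        ["ye"],
        ["p"],
        ["niceeeee"],
        ["i", "do", "not", "like", "the", "design"],
        ["very", "useful", "recommended"],
        ["ugly"],
        ["xxx"],
        ["yes"],
    ],
    "sentiment": [
        "positive", "negative", "negative", "positive", "positive",
        "negative", "positive", "negative", "negative", "positive",
    ],
})

char_limit = 4  # rows with fewer total characters than this are dropped

# Total characters per row: join the tokens, then measure the string.
total_chars = df["sentence"].apply(lambda tokens: len("".join(tokens)))

# Alternative without a lambda: the .str accessor can join list elements.
total_chars_vectorized = df["sentence"].str.join("").str.len()

# Keep rows at or above the limit; reset_index renumbers the surviving rows.
filtered = df[total_chars >= char_limit].reset_index(drop=True)
print(filtered)
```

Note that the count is over all tokens in a row combined, so a row such as [ugly] (exactly 4 characters) is kept while [xxx] and [yes] are dropped, which matches the expected result in the question.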
