excel - Count how many keywords from a list match the text in each cell of an Excel file
Problem description
I have two sheets in an Excel file. One is a dictionary sheet, and the other contains a column of text. I want to match each column of dictionary keywords against the text column, one column at a time, and count how many keyword matches occur in each cell of the text column.
I tried the formula below, where B2 is the first cell of the text column and Sheet1!A:A is the dictionary column on the other sheet, but it returns zero:
=(LEN(B2)-LEN(SUBSTITUTE(B2,Sheet1!A:A,"")))/LEN(Sheet1!A:A)
The result should look like this:
 | Text                           | number_of_keyword_match | number_of_keyword_match | ...
 |                                | using DIC col 1         | using DIC col 2         |
-+--------------------------------+-------------------------+-------------------------+
1| any text or sentence/sentences | e.g. 3                  |                         |
2| ...                            | 7                       |                         |
3| ...                            | 0                       |                         |
4| ...                            | 15                      |                         |
(and so on, up to 2815 rows)
Solution
Suppose your text input looks like this:
 |          A           |
-+----------------------+
1|apple apple beat beat |
2|apple beat beat carrot|
3|carrot apple apple    |
and your dictionary looks like this:
 |   A   |   B   |
-+-------+-------+
1|apple  |beat   |
2|beat   |carrot |
3|       |       |
This formula gives you the number of times a single dictionary word occurs in a single text cell:
=(LEN(text!A1)-LEN(SUBSTITUTE(text!A1,dictionary!A1,"")))/LEN(dictionary!A1)
(2 in this example)
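To make the arithmetic behind that formula easier to follow, here is a minimal Python sketch of the same length-difference idea. Python is not part of the original answer, and the function name `count_occurrences` is only an illustrative assumption.

```python
def count_occurrences(text, keyword):
    """Count non-overlapping occurrences of keyword in text.

    Mirrors (LEN(text) - LEN(SUBSTITUTE(text, keyword, ""))) / LEN(keyword):
    removing every occurrence shortens the text by count * len(keyword).
    """
    if not keyword:
        # An empty keyword would mean dividing by zero, so count it as 0.
        return 0
    shortened = text.replace(keyword, "")
    return (len(text) - len(shortened)) // len(keyword)


# text!A1 and dictionary!A1 from the sample tables above
print(count_occurrences("apple apple beat beat", "apple"))  # prints 2
```

Like SUBSTITUTE, `str.replace` is case-sensitive and matches substrings, so `beat` would also be counted inside a longer word such as `beats`.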
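If I understand correctly, your expected output is a set of extra columns in the text sheet, where each cell contains the sum of the counts of every word in the corresponding dictionary column, right? For example: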
 |          A           | B | C |
-+----------------------+---+---+
1|apple apple beat beat | 4 | 2 |
2|apple beat beat carrot| 3 | 3 |
3|carrot apple apple    | 2 | 1 |
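You can do this with an array formula, starting with this one in cell B1 (written here with comma separators to match the formula above; some regional settings use semicolons instead):
=SUM(IFERROR((LEN(text!$A1)-LEN(SUBSTITUTE(text!$A1,dictionary!A:A,"")))/LEN(dictionary!A:A),0))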
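Instead of pressing Enter after pasting it, press Ctrl+Shift+Enter to enter it as an array formula. Then drag the formula down and to the right to get all the counts you want.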
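As a way to check the expected totals outside Excel, the following Python sketch reproduces what the array formula computes for the sample data. It is only an illustration under stated assumptions: the names `count_occurrences` and `keyword_totals` are hypothetical, and the sample lists are copied by hand from the tables above.

```python
def count_occurrences(text, keyword):
    # Same length-difference idea as the single-cell formula. An empty
    # dictionary cell contributes 0, the role IFERROR(..., 0) plays in Excel.
    if not keyword:
        return 0
    return (len(text) - len(text.replace(keyword, ""))) // len(keyword)


def keyword_totals(texts, dictionary_columns):
    # One output row per text cell, one total per dictionary column.
    return [
        [sum(count_occurrences(text, word) for word in column)
         for column in dictionary_columns]
        for text in texts
    ]


# Sample data copied from the tables above (the empty string is the blank row 3).
texts = [
    "apple apple beat beat",
    "apple beat beat carrot",
    "carrot apple apple",
]
dictionary_columns = [
    ["apple", "beat", ""],   # dictionary column A
    ["beat", "carrot", ""],  # dictionary column B
]

for row in keyword_totals(texts, dictionary_columns):
    print(row)
```

For the sample data this prints `[4, 2]`, `[3, 3]` and `[2, 1]`, matching columns B and C of the expected output table.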