stata - Find words in a sentence and create an indicator variable
Question
I have a variable containing various sentences:
Cats are good pets, for they are clean and are not noisy.
Abstraction is often one floor above you.
She wrote a long letter to Charlie, but he didn't read it.
Where do random thoughts come from?
Mary plays the piano.
I want more detailed information.
I'd rather be a bird than a fish.
When I was little I had a car door slammed shut on my hand. I still remember it quite vividly.
Malls are great places to shop; John can find everything he needs under one roof.
My Mum tries to be cool by saying that she likes all the same things that I do.
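How can I create a variable with name == 1 if a name is found?
I would also like the variable to have name == 2 if any word in the sentence matches a word of my choice (for example, letter).
I tried the following:
gen name = regexm(sentence, "letter* & (Charlie | Mary | John)*")
However, this does not work. I only get name == 0 for all observations.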
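Answer
Regular expressions are great, but the catch-22 is that you have to work hard to learn the language; only once you are fluent do you see the benefits.
I will leave it to other answers to supply smart regular-expression solutions. The aim here is to underline that other string functions can be used too. Here I exploit the fact that strpos() returns a positive result, equivalent to true, if it finds one string inside another. Also, Stata will parse a string into words, so even (for example) finding a string if and only if it occurs as a word is not too difficult from first principles.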
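As a quick illustration of the two facts used here, the following sketch evaluates the functions on literal strings taken from the example sentences. It is only a demonstration of the function behaviour, not part of the solution itself.

```stata
* strpos() gives the position of the first match, or 0 if there is none;
* any nonzero result counts as true in a logical expression
display strpos("She wrote a long letter to Charlie, but he didn't read it.", "Charlie")
display strpos("Mary plays the piano.", "John")

* word() and wordcount() split on spaces only,
* so punctuation stays attached to the neighbouring word
display wordcount("Mary plays the piano.")
display word("Mary plays the piano.", 4)
```

The first call returns a positive position and the second returns 0. The last call returns piano. with its full stop attached, which matters whenever the word you are looking for can be followed by punctuation.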
clear
input strL whatever
"Cats are good pets, for they are clean and are not noisy."
"Abstraction is often one floor above you."
"She wrote a long letter to Charlie, but he didn't read it."
"Where do random thoughts come from?"
"Mary plays the piano."
"I want more detailed information."
"I'd rather be a bird than a fish."
"When I was little I had a car door slammed shut on my hand. I still remember it quite vividly."
"Malls are great places to shop; John can find everything he needs under one roof."
"My Mum tries to be cool by saying that she likes all the same things that I do."
end
* 1 if any of the names occurs anywhere in the sentence, 0 otherwise
gen wanted1 = strpos(whatever, "Charlie") | strpos(whatever, "Mary") | strpos(whatever, "John")

* cat or cats as a word
gen wanted2 = 0
* the longest sentence determines how many word positions to check
gen wordcount = wordcount(whatever)
su wordcount, meanonly
local J = r(max)
quietly foreach w in cat cats {
    forval j = 1/`J' {
        replace wanted2 = 1 if word(lower(whatever), `j') == "`w'"
    }
}

gen what = substr(whatever, 1, 40)
list wanted? what, sep(0)
     +--------------------------------------------------------------+
     | wanted1   wanted2                                       what |
     |--------------------------------------------------------------|
  1. |       0         1   Cats are good pets, for they are clean a |
  2. |       0         0   Abstraction is often one floor above you |
  3. |       1         0   She wrote a long letter to Charlie, but  |
  4. |       0         0        Where do random thoughts come from? |
  5. |       1         0                      Mary plays the piano. |
  6. |       0         0          I want more detailed information. |
  7. |       0         0          I'd rather be a bird than a fish. |
  8. |       0         0   When I was little I had a car door slamm |
  9. |       1         0   Malls are great places to shop; John can |
 10. |       0         0   My Mum tries to be cool by saying that s |
     +--------------------------------------------------------------+
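One limitation of matching with word() is that a word followed by punctuation, such as Charlie, in the third sentence, is stored with the comma attached and so would not equal charlie. A possible variation, sketched below, blanks out everything that is not a letter or a space and then searches for the target padded with spaces. The variable names padded, has_name and has_letter are made up for this illustration, and the sketch assumes Stata 14 or later for ustrregexra().

```stata
clear
input str100 whatever
"She wrote a long letter to Charlie, but he didn't read it."
"Mary plays the piano."
"I want more detailed information."
end

* Assumption: Stata 14 or later, for ustrregexra().
* Turn every character that is not a letter or a space into a space,
* lower-case the result, and pad both ends with a space.
gen padded = " " + lower(ustrregexra(whatever, "[^A-Za-z ]", " ")) + " "

* A target padded with spaces can only match a whole word.
gen byte has_name = 0
quietly foreach w in charlie mary john {
    replace has_name = 1 if strpos(padded, " `w' ") > 0
}
gen byte has_letter = strpos(padded, " letter ") > 0

list has_name has_letter, sep(0)
```

With these three sentences, has_name is 1 for the first two observations and has_letter is 1 only for the first. How to combine the two indicators into a single variable coded 1 and 2 depends on which code should win when a sentence contains both a name and the chosen word, which the question does not specify.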