python - How to extract a certain sentence in a paragraph? Python
Problem description
I want to extract certain sentences from a paragraph by looking for a certain set of words, for example "Object C Statement:". The paragraph is as follows:
Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat.
I want to extract the text that follows "Object C Statement:", as follows:
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
All the examples I have come across split on one specific word.
I tried this, but it doesn't work for me:
import re
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
a, b, c, d, e = re.split(r"\B\s(?=[^\s:]+:)", word)
regex = re.compile(r"""Object A Statement\s(.*?)Object B Statement\s(.*?)Object C Statement\s(.*?)Object D Statement\s(.*?)Object E Statement\s(.*)""", re.S|re.X)
a, b, c, d, e = regex.match(word).groups()
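Solution
You can split the string on the pattern "\s*Object . Statement:\s*". The "." matches any single character, so the pattern matches every label ("Object A Statement:", "Object B Statement:", and so on) together with the whitespace around it.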
import re
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
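# Split on every "Object <any character> Statement:" label, including surrounding whitespace.
result = re.split(r"\s*Object . Statement:\s*", word)
# The string starts with a label, so the first element is an empty string; drop it.
result = [r for r in result if r]
print("\n".join(result))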
I get the following result.
There was a cat with a bag full of meat. It was a red cat with a blue hat.
There was a dog with a bag full of toys. It was a blue dog with a green hat.
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
There was a zebra with a bag full of grass. It was a white zebra with a blue hat.
There was a bear with a bag full of wood. It was a brown bear with a black hat.
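The split above returns every statement in order, so the text for Object C is result[2]. If you would rather look a statement up by its letter, one possible variation (a sketch, not part of the original answer) is to capture the letter together with its text and build a dictionary. The names pattern and statements are only illustrative, and the sketch assumes every label has the exact form "Object <letter> Statement:".

```python
import re

word = (
    "Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. "
    "Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. "
    "Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. "
    "Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. "
    "Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
)

# Group 1 captures the label letter; group 2 lazily captures the text
# up to (but not including) the next label or the end of the string.
# Assumption: labels always look like "Object <letter> Statement:".
pattern = re.compile(
    r"Object (\w) Statement:\s*(.*?)\s*(?=Object \w Statement:|$)",
    re.S,
)

# findall returns (letter, text) tuples, which map directly into a dict.
statements = {label: text for label, text in pattern.findall(word)}

print(statements["C"])
```

With the paragraph from the question, statements["C"] holds the two dolphin sentences, and the other letters can be looked up the same way.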