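regex - Capturing a Shakespeare character's dialogue with a regular expression
Problem description
I am trying to capture Shakespeare dialogue with a regular expression, as practice in regex text matching. For example, I want to capture all the text spoken by the character CALIBAN in this particular scene: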
PROSPERO. Thou most lying slave,
Whom stripes may move, not kindness! I have us'd thee,
Filth as thou art, with human care, and lodg'd thee
In mine own cell, till thou didst seek to violate
The honour of my child.
CALIBAN. O ho, O ho! Would't had been done.
Thou didst prevent me. I had peopl'd else
This isle with Calibans.
I want to capture
O ho, O ho! Would't had been done.
Thou didst prevent me. I had peopl'd else
This isle with Calibans.
How would I do this with a regex? I tried this particular regex:
(?<=\n CALIBAN\. )[A-Za-z ',\.\n\!-]+(?=\n PROSPERO\. |$)
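Note: in the actual text there are always 2 space characters before the name of the next speaker, and every line ends with a carriage return. My regex looks for CALIBAN. at the start, then matches some text, and requires that the match be followed by PROSPERO. (or by the end of the text). However, when I plug it into regexp.com, my entire text is matched.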
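Solution
Your character class also matches the letters, spaces, and punctuation in PROSPERO's lines, so the greedy + keeps consuming text until the end of the input, where the $ alternative of the lookahead succeeds. Use the same regex with a lazy quantifier (+?) instead, so that the match stops at the first position where the lookahead succeeds: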
(?<=\n CALIBAN\. )[A-Za-z\s',.!-]+?(?=\n PROSPERO\. |$)
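To make the difference between the greedy and the lazy quantifier concrete, here is a minimal sketch in Python (the language choice and the shortened sample text are assumptions made for illustration; the patterns are the ones shown above, with a single space after each newline exactly as they are written here).

```python
import re

# Shortened sample scene. Each line starts with "\n " so that it lines up
# with the "\n CALIBAN\. " lookbehind as written in the patterns above.
text = (
    "\n PROSPERO. Thou most lying slave,"
    "\n The honour of my child."
    "\n CALIBAN. O ho, O ho! Would't had been done."
    "\n This isle with Calibans."
    "\n PROSPERO. Thou most lying slave,"
    "\n The honour of my child."
    "\n CALIBAN. O ho, O ho! Would't had been done."
    "\n This isle with Calibans."
)

# Greedy "+": consumes as much as the character class allows.
greedy = re.compile(r"(?<=\n CALIBAN\. )[A-Za-z\s',.!-]+(?=\n PROSPERO\. |$)")
# Lazy "+?": stops at the first position where the lookahead succeeds.
lazy = re.compile(r"(?<=\n CALIBAN\. )[A-Za-z\s',.!-]+?(?=\n PROSPERO\. |$)")

greedy_matches = greedy.findall(text)
lazy_matches = lazy.findall(text)

print(len(greedy_matches), "greedy match(es)")
print(len(lazy_matches), "lazy match(es)")
for speech in lazy_matches:
    print(repr(speech))
```

The greedy version produces one long match that runs through PROSPERO's lines to the end of the text, while the lazy version produces one match per CALIBAN speech.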
Usage in PHP:
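// $str is assumed to already hold the scene text.
$re = '/(?<=\n CALIBAN\. )[A-Za-z\s\',.!-]+?(?=\n PROSPERO\. |$)/';
// With the default PREG_PATTERN_ORDER, $matches[0] lists every full match.
preg_match_all($re, $str, $matches);
// Print all of CALIBAN's speeches
print_r($matches[0]);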
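The question notes that the real text indents each speaker name by 2 spaces and ends every line with a carriage return. A possible variation that tolerates both, and that stops at whichever speaker comes next rather than only at PROSPERO, could look like the sketch below. It is written in Python as an illustration only; the helper name speeches, the line-ending normalisation, and the assumption that every speaker heading is a single upper-case word followed by a period are not part of the original answer.

```python
import re


def speeches(text: str, speaker: str) -> list[str]:
    """Return every speech by `speaker`, one string per speech.

    Hypothetical helper: assumes a speaker heading is one upper-case word
    followed by a period at the start of a (possibly indented) line.
    """
    heading = r"^[ \t]*" + re.escape(speaker) + r"\.[ \t]+"
    body = r"(.*?)"  # lazy, so it stops at the first lookahead success
    stop = r"(?=^[ \t]*[A-Z]+\.[ \t]|\Z)"  # next heading or end of text
    pattern = re.compile(heading + body + stop, re.MULTILINE | re.DOTALL)

    # Normalise CRLF / CR line endings to LF before matching.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    return [
        # Drop the indentation of continuation lines.
        "\n".join(line.strip() for line in m.group(1).strip().splitlines())
        for m in pattern.finditer(text)
    ]


sample = (
    "  PROSPERO. Thou most lying slave,\r\n"
    "  The honour of my child.\r\n"
    "  CALIBAN. O ho, O ho! Would't had been done.\r\n"
    "  Thou didst prevent me. I had peopl'd else\r\n"
    "  This isle with Calibans.\r\n"
)

caliban = speeches(sample * 2, "CALIBAN")
for speech in caliban:
    print(speech)
    print("---")
```

Because the body is matched with a dot under DOTALL instead of an explicit character class, punctuation such as ; or ? in a speech does not cut the match short. The trade-off is that a speech line which itself begins with an upper-case word and a period would be mistaken for a heading, and multi-word speaker names would need a wider heading pattern.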