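python - Repeating capture groups
Problem description
I am trying to write a regular expression that captures the following:
- A one-line question (starting with "Q:")
- An indeterminate number of paragraphs after the initial capture, stopping before the next "Q:"
This is what I have so far, though I should stress that neither attempt works: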
(Q:.*?\n){1}(?!Q:)(.+)*
(Q:.*?\n){1}(?!Q:)(.+\n+)
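What I have so far works for the first two, but when I add new lines it does not capture the subsequent paragraphs.
What am I missing? Here is the sample text: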
Q: What are the service limits associated with Amazon Athena?
Please click here to learn more about service limits.
Q: What is the underlying technology behind Amazon Athena?
Amazon Athena uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet. Athena can handle complex analysis, including large joins, window functions, and arrays. Because Amazon Athena uses Amazon S3 as the underlying data store, it is highly available and durable with data redundantly stored across multiple facilities and multiple devices in each facility. Learn more about Presto here.
Q: How does Amazon Athena store table definitions and schema?
Amazon Athena uses a managed Data Catalog to store information and schemas about the databases and tables that you create for your data stored in Amazon S3. In regions where AWS Glue is available, you can upgrade to using the AWS Glue Data Catalog with Amazon Athena. In regions where AWS Glue is not available, Athena uses an internal Catalog.
You can modify the catalog using DDL statements or via the AWS Management Console. Any schemas you define are automatically saved unless you explicitly delete them. Athena uses schema-on-read technology, which means that your table definitions applied to your data in S3 when queries are being executed. There’s no data loading or transformation required. You can delete table definitions and schema without impacting the underlying data stored on Amazon S3.
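Solution
You can use the following pattern (with the multiline flag enabled, so that ^ matches at the start of every line):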
^(Q:.*?\n)(?!Q:)([\s\S]+?(?=^Q:|\Z))
Breakdown:
^(Q:.*?\n)      # Matches "Q:" at the beginning of a line, followed by
                # any text up to and including the line feed.
(?!Q:)          # Not immediately followed by another "Q:".
(               # Start of the second capturing group.
  [\s\S]+?      # Matches one or more characters (including line breaks), non-greedy.
  (?=^Q:|\Z)    # Stops before a "Q:" at the start of a line, or at the end of the string.
)               # End of the second capturing group.
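As a minimal sketch of how the pattern could be applied in Python, the snippet below compiles it with re.MULTILINE and collects question/answer pairs. The helper name parse_faq and the shortened sample text are assumptions made for illustration; they do not come from the original answer.

```python
import re

# Pattern from the answer. re.MULTILINE makes ^ match at the start of every
# line, which both the leading anchor and the lookahead rely on.
PATTERN = re.compile(r"^(Q:.*?\n)(?!Q:)([\s\S]+?(?=^Q:|\Z))", re.MULTILINE)

# Shortened, hypothetical sample in the same shape as the FAQ text above.
text = (
    "Q: First question?\n"
    "Single paragraph answer.\n"
    "Q: Second question?\n"
    "First paragraph.\n"
    "Second paragraph.\n"
)


def parse_faq(source):
    """Return (question, answer) tuples with surrounding whitespace stripped."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in PATTERN.finditer(source)
    ]


pairs = parse_faq(text)
for question, answer in pairs:
    print(question)
    print(answer)
    print("---")
```

Because `.` does not match a line break by default, the second group uses `[\s\S]` to span several paragraphs. Note also that the `(?!Q:)` check means a question followed directly by another question, with no answer in between, is skipped rather than paired with an empty answer.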