Extracting a specific pattern from strings in Python

Problem description

I have the following data in one column of a DataFrame (about 100 rows).

For each row of the DataFrame, I need to extract the CK string (for example, CK-36799-1523333).

Data:

{"currency":"US","Cost":129,"receipt_id":"CK-36799-1523333","af_customer_user_id":"33738413"}

{"currency":"INR","Cost":429,"receipt_id":"CK-33711-15293046","af_customer_user_id":"33738414"}

{"currency":"US","Cost":229,"receipt_id":"CK-36798-1523333","af_customer_user_id":"33738423"}

{"currency":"INR","Cost":829,"receipt_id":"CK-33716-152930456","af_customer_user_id":"33738214"}

  {"currency":"INR","Cost":829,"order_id":"CK-33716-152930456","af_customer_user_id":"33738214"}

  {"currency":"INR","Cost":829,"suborder_id":"CK-33716-152930456","af_customer_user_id":"33738214"}

Expected result

CK-36799-1523333
CK-33711-15293046
CK-36798-1523333
CK-33716-152930456

I tried the str.find('CK-') function, but did not get the expected result. Any suggestions would be appreciated.

Tags: python, pandas, dataframe, extract, startswith

Solution


Try using a regular expression:

import re

...
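# data is assumed to be an iterable of the raw strings (e.g. the DataFrame column)
for line in data:
    # Match "CK-", one or more digits, "-", one or more digits
    res = re.findall(r"CK-[0-9]+-[0-9]+", line)
    if res:
        print(res[0])  # first match found in this line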
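
Since the data lives in a DataFrame column, the same regular expression can also be applied directly with pandas. The following is only a minimal sketch: the column names "payload" and "ck_id" are hypothetical (the question does not name the column), and only three of the sample rows are reproduced.

```python
import pandas as pd

# "payload" is a hypothetical column name; the question does not give the real one.
df = pd.DataFrame({
    "payload": [
        '{"currency":"US","Cost":129,"receipt_id":"CK-36799-1523333","af_customer_user_id":"33738413"}',
        '{"currency":"INR","Cost":429,"receipt_id":"CK-33711-15293046","af_customer_user_id":"33738414"}',
        '{"currency":"INR","Cost":829,"order_id":"CK-33716-152930456","af_customer_user_id":"33738214"}',
    ]
})

# With a single capture group and expand=False, str.extract returns a Series
# holding the first match per row; rows without a match become NaN.
df["ck_id"] = df["payload"].str.extract(r"(CK-\d+-\d+)", expand=False)

print(df["ck_id"].tolist())
```

This matches the CK string regardless of which key holds it (receipt_id, order_id, or suborder_id), because the pattern looks only at the value. If only one particular key should be considered, parsing each row as JSON first would be the more precise route.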
