Extract number from string in pandas dataframe column

Problem description

I have a dataframe in the format below and am trying to use the extract function, but I keep getting the following error:

ValueError: If using all scalar values, you must pass an index

column1    column2
1         abc2150/abc2152/abc2154/abc215601/U215602


(df.column2.str
    .split('/', expand=True)
    .apply(lambda row: row.str.extract(r'(\d+)', expand=True))
    .apply(lambda x: '/'.join(x.dropna().astype(str)), axis=1))

I need the output in the below format.

column1    column2
1         2150/2152/2154/215601/215602

Please let me know how to fix it.

Thanks

Tags: python, pandas

Solution


You can instead use str.replace with a positive lookahead to remove all the letters that come before each numeric part:

df.column2.str.replace(r'[a-zA-Z]+(?=\d+)', '', regex=True)

 0    2150/2152/2154/215601/215602
Name: column2, dtype: object
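
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the one-row dataframe from the question and applies the lookahead replacement. It passes regex=True explicitly, since pandas 2.0 changed the default for str.replace to literal matching. The second approach, str.findall followed by str.join, is not part of the answer above; it is an alternative shown only for comparison. The variable names stripped and joined are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "column1": [1],
    "column2": ["abc2150/abc2152/abc2154/abc215601/U215602"],
})

# Approach 1: delete any run of letters that is immediately followed by a digit.
# regex=True is explicit because the pandas 2.0 default is literal matching.
stripped = df["column2"].str.replace(r"[a-zA-Z]+(?=\d)", "", regex=True)

# Approach 2 (an alternative, not from the answer above):
# collect every run of digits in the string, then join the runs with '/'.
joined = df["column2"].str.findall(r"\d+").str.join("/")

print(stripped.iloc[0])
print(joined.iloc[0])
```

The two approaches agree on this sample, but they are not equivalent in general: the replacement only removes letters that sit directly before a digit and leaves everything else alone, while findall keeps nothing except the digit runs, so a segment with no digits would disappear from the result.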
