regex - Regex on a pandas column to create new columns
Problem description
I have a pandas column:
df['Reviewer']
0 more, 25-34 Male on Treatment for 10 years or more
1 Idapida, 25-34 Female on Treatment for 2 to less than 5 years
2 Anna, 13-18 Female on Treatment for 5 to less than 10 years
3 Kepons, 55-64 on Treatment for 1 to 6 months
4 sammymaguire, 45-54 Female on Treatment for 1 to less than 2 years
I would like to use the following regex patterns
import re

ageRegex = re.compile('13-18|19-24|25-34|35-44|45-54|55-64|65-74|75 or over')
timeRegex = re.compile('less than 1 month|1 to 6 months|6 months to less than 1 year|1 to less than 2 years|2 to less than 5 years|5 to less than 10 years|10 years or more')
genderRegex = re.compile('Male|Female')
to extract the age, time, and gender into new columns that look like this:
0 25-34 10 years or more Male
1 25-34 Treatment for 2 to less than 5 years Female
2 13-18 Treatment for 5 to less than 10 years Female
3 55-64 Treatment for 1 to 6 months na
4 45-54 Treatment for 1 to less than 2 years Female
I tried something like this:
df['age'] = ageRegex.findall(df['Reviewer'])
but I get the error:
expected string or bytes-like object
Solution
df["age"] = df["Reviewer"].str.extract('(13-18|19-24|25-34|35-44|45-54|55-64|65-74|75 or over)')
df["Time"] = df["Reviewer"].str.extract('(less than 1 month|1 to 6 months|6 months to less than 1 year|1 to less than 2 years|2 to less than 5 years|5 to less than 10 years|10 years or more)')
df["Gender"] = df["Reviewer"].str.extract('(Male|Female)')
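The error in the question most likely comes from passing a whole Series to `findall`, which expects a single string. The `.str.extract` accessor instead applies the pattern to each row. Below is a minimal, self-contained sketch of that approach using the sample rows from the question. The variable names `age_pattern`, `time_pattern`, and `gender_pattern` are illustrative, and `expand=False` is an optional choice made here so that each call returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Reviewer": [
        "more, 25-34 Male on Treatment for 10 years or more",
        "Idapida, 25-34 Female on Treatment for 2 to less than 5 years",
        "Anna, 13-18 Female on Treatment for 5 to less than 10 years",
        "Kepons, 55-64 on Treatment for 1 to 6 months",
        "sammymaguire, 45-54 Female on Treatment for 1 to less than 2 years",
    ]
})

# Each pattern has exactly one capture group, as str.extract requires.
age_pattern = r"(13-18|19-24|25-34|35-44|45-54|55-64|65-74|75 or over)"
time_pattern = (
    r"(less than 1 month|1 to 6 months|6 months to less than 1 year|"
    r"1 to less than 2 years|2 to less than 5 years|"
    r"5 to less than 10 years|10 years or more)"
)
gender_pattern = r"(Male|Female)"

# str.extract runs the regex on every row; rows without a match get NaN.
df["age"] = df["Reviewer"].str.extract(age_pattern, expand=False)
df["Time"] = df["Reviewer"].str.extract(time_pattern, expand=False)
df["Gender"] = df["Reviewer"].str.extract(gender_pattern, expand=False)

print(df[["age", "Time", "Gender"]])
```

Rows where a pattern does not match, such as the row with no gender, end up with a missing value (NaN) in that column. It can be replaced afterwards with `fillna` if a placeholder like `na` is preferred.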