python - Using regex to find (and replace) phone number extensions (Python)
Problem description
I'm trying to find phone number extensions in a pandas Series, for example 'Ext: 123'. The extension can appear in the cell either on its own (as above) or after a phone number, e.g. 123 456 789 / Ext: 4502.
The extensions can also come in varying formats, such as Ex.430 (missing the letter t, no space after the punctuation mark). I therefore want to find all sequences in the Series that consist of 1-3 letters, followed by zero or more symbols, zero or more spaces, and then 2 to 6 digits.
Ideally, I would also replace these with the correct format, which is Ext: 32 (the number can be up to 6 digits long).
Here is my regex so far:
({'\D{1,3}\W*\s*\d{2,6}]'
I have also used other variations, but those didn't work either.
I would appreciate any help, thanks.
Solution
You can split the column on runs of letters (plus the colon).
df['phones'].str.split(r'[A-Za-z:]+\.?', expand=True)
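If the goal is to extract the extension and rewrite it as Ext: <digits>, rather than split the cell, a capture group with a substitution is one option. The following is a minimal sketch using only the standard re module. The pattern, the helper names find_extension and normalize_extension, and the sample strings are assumptions for illustration, not part of the original answer. It uses [A-Za-z] instead of the \D from the question, because \D also matches spaces and punctuation.

```python
import re

# Assumed pattern: 1-3 letters, then any symbols/spaces (\W* covers both),
# then 2-6 digits captured as group 1. \b keeps the match on word boundaries.
EXT_PATTERN = re.compile(r"\b[A-Za-z]{1,3}\W*(\d{2,6})\b")

def find_extension(text):
    """Return the extension digits as a string, or None if none is found."""
    match = EXT_PATTERN.search(text)
    return match.group(1) if match else None

def normalize_extension(text):
    """Rewrite every matched extension as 'Ext: <digits>'."""
    return EXT_PATTERN.sub(r"Ext: \1", text)

# Hypothetical sample values modelled on the examples in the question.
samples = ["Ext: 123", "123 456 789 / Ext: 4502", "Ex.430", "123 456 789"]

extensions = [find_extension(s) for s in samples]
normalized = [normalize_extension(s) for s in samples]

print(extensions)  # ['123', '4502', '430', None]
print(normalized)  # ['Ext: 123', '123 456 789 / Ext: 4502', 'Ext: 430', '123 456 789']
```

On a pandas column, the same pattern should work with df['phones'].str.replace(EXT_PATTERN, r'Ext: \1', regex=True) for the replacement and df['phones'].str.extract(EXT_PATTERN, expand=False) for the digits. Note that this pattern accepts any 1-3 letter word before the digits (for example "No 12"). If that is too loose for your data, a stricter prefix such as [Ee]xt? may be preferable.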