Formatting phone numbers

Problem description

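I have a pandas DataFrame and I am trying to format the phone numbers so that dashes separate the digit groups, for example 306-877-9993. I also want to remove the string "I don't have one" and the dummy phone number 999999999999. How can I do this? Thanks.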

   Last Name   First Name  Phone number
0  Dupont      Marie       3068779993
1  Trey        Tom         16669858121
2  Johnson     Lily        (407)6579091
3  Parmentier  John        I don't have one
4  Predi       Pamela      999999999999

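Edit: this is an Excel file containing many manually entered phone numbers. I am trying to see whether there is a way to format the phone numbers and clean up the file. I tried to remove the parentheses with df['Phone_number'] = df.Phone_number.str.strip('(') but I got a bunch of NaN for some of the phone numbers.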

Tags: python, python-3.x

Solution


You can use the clean_phone() function from DataPrep. Install it with pip install dataprep

>>> import pandas as pd
>>> from dataprep.clean import clean_phone
>>> df = pd.DataFrame({"Phone number": [3068779993, "16669858121",
...          "(407)6579091", "I don't have one", 999999999999]})
>>> clean_phone(df, "Phone number")
Phone Number Cleaning Report:                                                                                                 
        3 values cleaned (60.0%)
        2 values unable to be parsed (40.0%), set to NaN
Result contains 3 (60.0%) values in the correct format and 2 null values (40.0%)
       Phone number Phone number_clean
0        3068779993       306-877-9993
1       16669858121       666-985-8121
2      (407)6579091       407-657-9091
3  I don't have one                NaN
4      999999999999                NaN
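
If you would rather not add a dependency, the same cleanup can be approximated with the standard library. The NaN values in the question most likely come from the column holding a mix of integers and strings: pandas .str methods return NaN for elements that are not strings, and str.strip('(') only removes characters at the ends of a string anyway. Below is a minimal sketch, not DataPrep's implementation. The helper name format_phone is hypothetical, and it assumes 10-digit North American numbers with an optional leading country code 1.

```python
import re

def format_phone(value):
    """Return 'XXX-XXX-XXXX' for a 10-digit number, or None if it cannot be parsed.

    format_phone is a hypothetical helper; it assumes North American numbers.
    """
    # str() lets integers from Excel be handled the same way as strings;
    # \D removes everything that is not a digit (parentheses, spaces, text).
    digits = re.sub(r"\D", "", str(value))
    # Drop a leading country code 1 from 11-digit numbers.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    # Reject anything that is not exactly 10 digits, and placeholders
    # made of a single repeated digit such as 9999999999.
    if len(digits) != 10 or len(set(digits)) == 1:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

raw = [3068779993, "16669858121", "(407)6579091", "I don't have one", 999999999999]
cleaned = [format_phone(v) for v in raw]
print(cleaned)
```

To apply it to the DataFrame, something like df["Phone number_clean"] = df["Phone number"].map(format_phone) should work; rows that cannot be parsed come back as missing values. Note that numbers read in as floats (for example 3068779993.0) would need to be converted to integers first, since the trailing 0 would otherwise be counted as a digit.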
