python - Splitting a string by a prefix in Python
Problem description
I have a DataFrame column full of text and prices:
0 £75 BT Reward Card
1 £125 BT Reward Card
2 £50 Retail Voucher
3 £100 BT Reward Card
4 £150 BT Reward Card
5 £50 Cashback
6 Fibre Connection Fee (£50 Credit
7 £75 BT Reward Card
8 £125 BT Reward Card
9 £50 Cashback
10 £0 Fibre Connection Fee (£50 Credit
I only want to return the number that directly follows the £ sign.
So far I have this, but it breaks for indices 6 and 10:
df['col'] = df['col'].apply(lambda x: x.split(' ')[0])
I also tried this:
df['col'] = df['col'].apply(lambda x: x.split('£')[1])
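Solution
If you only need the first value, use str.extract and convert to integers if necessary:
# raw string so that \d is passed to the regex engine unchanged
df['new'] = df['col'].str.extract(r'£(\d+)').astype(int)
print(df)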
col new
0 £75 BT Reward Card 75
1 £125 BT Reward Card 125
2 £50 Retail Voucher 50
3 £100 BT Reward Card 100
4 £150 BT Reward Card 150
5 £50 Cashback 50
6 Fibre Connection Fee (£50 Credit 50
7 £75 BT Reward Card 75
8 £125 BT Reward Card 125
9 £50 Cashback 50
10 £0 Fibre Connection Fee (£50 Credit 0
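One caveat with the snippet above: `astype(int)` raises if any row has no `£` amount, because the missing match is stored as NaN. The following is a minimal, self-contained sketch of one way to handle that, assuming the nullable `Int64` dtype is acceptable for the result. The row "No price here" is a made-up example and is not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({"col": [
    "£75 BT Reward Card",
    "Fibre Connection Fee (£50 Credit",
    "£0 Fibre Connection Fee (£50 Credit",
    "No price here",  # hypothetical row without any £ amount
]})

# expand=False returns a Series instead of a one-column DataFrame
first = df["col"].str.extract(r"£(\d+)", expand=False)

# to_numeric turns the digit strings into numbers and keeps missing values;
# the nullable Int64 dtype stores integers alongside <NA>
df["new"] = pd.to_numeric(first).astype("Int64")
print(df)
```

Rows without a match end up as `<NA>` in the new column rather than causing an error.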
If you need all of the values in a list, use str.findall:
# values are strings
df['new'] = df['col'].str.findall(r'£(\d+)')
# values are integers
#df['new'] = df['col'].str.findall(r'£(\d+)').apply(lambda x: [int(y) for y in x])
print(df)
col new
0 £75 BT Reward Card [75]
1 £125 BT Reward Card [125]
2 £50 Retail Voucher [50]
3 £100 BT Reward Card [100]
4 £150 BT Reward Card [150]
5 £50 Cashback [50]
6 Fibre Connection Fee (£50 Credit [50]
7 £75 BT Reward Card [75]
8 £125 BT Reward Card [125]
9 £50 Cashback [50]
10 £0 Fibre Connection Fee (£50 Credit [0, 50]
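As a small, self-contained illustration of the list-based approach, the sketch below converts each match to an integer and counts how many amounts were found per row. The variable names `amounts` and `counts` are chosen here for illustration only.

```python
import pandas as pd

s = pd.Series([
    "£75 BT Reward Card",
    "£0 Fibre Connection Fee (£50 Credit",
])

# findall returns a list of captured digit strings for each row
amounts = s.str.findall(r"£(\d+)").apply(lambda xs: [int(x) for x in xs])

# .str.len() also works on list-valued rows and gives the number of matches
counts = amounts.str.len()
print(amounts)
print(counts)
```

The count makes it easy to spot rows that contain more than one amount before deciding which one to keep.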
If you need them in new columns, use extractall with unstack, add_prefix and join:
df = df.join(df['col'].str.extractall(r'£(\d+)')[0].unstack().astype(float).add_prefix('new'))
print(df)
col new0 new1
0 £75 BT Reward Card 75.0 NaN
1 £125 BT Reward Card 125.0 NaN
2 £50 Retail Voucher 50.0 NaN
3 £100 BT Reward Card 100.0 NaN
4 £150 BT Reward Card 150.0 NaN
5 £50 Cashback 50.0 NaN
6 Fibre Connection Fee (£50 Credit 50.0 NaN
7 £75 BT Reward Card 75.0 NaN
8 £125 BT Reward Card 125.0 NaN
9 £50 Cashback 50.0 NaN
10 £0 Fibre Connection Fee (£50 Credit 0.0 50.0
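The one-liner above can be easier to follow when split into steps. This is a sketch on a three-row subset of the data. The extra cast to the nullable `Int64` dtype is an optional variation, not part of the original answer, which keeps the values as integers instead of floats.

```python
import pandas as pd

df = pd.DataFrame({"col": [
    "£75 BT Reward Card",
    "Fibre Connection Fee (£50 Credit",
    "£0 Fibre Connection Fee (£50 Credit",
]})

# extractall returns one row per match, indexed by (original row, match number)
matches = df["col"].str.extractall(r"£(\d+)")[0]

# unstack moves the match number into the columns: 0, 1, ...
wide = matches.unstack()

# convert to numbers; going through float first handles the missing matches
wide = wide.astype(float).astype("Int64").add_prefix("new")

# join aligns on the original row index
df = df.join(wide)
print(df)
```

Rows with a single amount get `<NA>` in `new1`, and only the last row has values in both columns.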