Pandas: slicing strings using str.find results as the start and stop positions

Question

I have a DataFrame that looks like this:

commands
*client interface       : Eth-Trunk45.2903 is up
*client interface       : Eth-Trunk46.2620 is up
*client interface       : Eth-Trunk46.2988 is up
*client interface       : Eth-Trunk55.1703 is up
*client interface       : Eth-Trunk55.1704 is up
*client interface       : GigabitEthernet4/1/12.102 is up

How can I slice the strings to get the following output?

commands
Eth-Trunk45.2903
Eth-Trunk46.2620
Eth-Trunk46.2988
Eth-Trunk55.1703
Eth-Trunk55.1704
GigabitEthernet4/1/12.102

I tried:

df['commands'] = df['commands'].str.slice(start=df['commands'].str.find(':'), stop=df['commands'].str.find(' is'))

But this returns only NaN values.

Please help.

Tags: python, pandas

Answer

Use Series.str.extract to capture the text between the two delimiters:

df['commands'] = df['commands'].str.extract(r":\s*(.+) is", expand=False)
print (df)
                     commands
0            Eth-Trunk45.2903
1            Eth-Trunk46.2620
2            Eth-Trunk46.2988
3            Eth-Trunk55.1703
4            Eth-Trunk55.1704
5   GigabitEthernet4/1/12.102
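
A minimal, self-contained sketch of the extraction on two rows copied from the question. The pattern used here is a slightly stricter variant chosen for illustration (an assumption, not necessarily the regex the answer intended): it skips whitespace after the colon and stops at the first " is" rather than the last.

```python
import pandas as pd

# Two sample rows copied from the question.
df = pd.DataFrame({
    "commands": [
        "*client interface       : Eth-Trunk45.2903 is up",
        "*client interface       : GigabitEthernet4/1/12.102 is up",
    ]
})

# ":\s*"   matches the colon and any whitespace after it
# "(.+?)"  lazily captures the interface name
# "\s+is\b" stops the capture at the first standalone " is"
pattern = r":\s*(.+?)\s+is\b"
names = df["commands"].str.extract(pattern, expand=False)

print(names.tolist())
# ['Eth-Trunk45.2903', 'GigabitEthernet4/1/12.102']
```

Rows that do not match the pattern come back as NaN, so it is worth checking names.isna() before overwriting the original column.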

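Your approach can be made to work with Series.apply, because Series.str.slice accepts only a single integer start and stop for the whole column, not per-row positions: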

df['commands'] = df['commands'].apply(lambda x: x[x.find(': ') + 2: x.find(' is ')])
print (df)
                     commands
0            Eth-Trunk45.2903
1            Eth-Trunk46.2620
2            Eth-Trunk46.2988
3            Eth-Trunk55.1703
4            Eth-Trunk55.1704
5   GigabitEthernet4/1/12.102
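
The idea from the question, computing the positions with str.find and then slicing, can also be kept close to its original form. The sketch below is one possible way to do it, assuming every row contains both ": " and " is "; the variable names start, stop, and result are illustrative only.

```python
import pandas as pd

s = pd.Series([
    "*client interface       : Eth-Trunk45.2903 is up",
    "*client interface       : GigabitEthernet4/1/12.102 is up",
])

# Vectorised search: one integer position per row (-1 where not found).
start = s.str.find(": ") + 2   # skip the colon and the space after it
stop = s.str.find(" is ")

# Series.str.slice takes one start/stop for the whole column, so the
# per-row slicing is done in plain Python instead.
result = pd.Series(
    [text[a:b] for text, a, b in zip(s, start, stop)],
    index=s.index,
)

print(result.tolist())
# ['Eth-Trunk45.2903', 'GigabitEthernet4/1/12.102']
```

Note that str.find returns -1 when the substring is missing, which would silently produce a wrong slice rather than NaN, so rows without both delimiters should be filtered out or handled first.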

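With fixed integer positions, Series.str.slice works directly on the original column, but every row is cut at the same offsets, so the longer last value is truncated:

print (df['commands'].str.slice(26, 42))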
0    Eth-Trunk45.2903
1    Eth-Trunk46.2620
2    Eth-Trunk46.2988
3    Eth-Trunk55.1703
4    Eth-Trunk55.1704
5    GigabitEthernet4
Name: commands, dtype: object
