Get an element from a column and, if it equals a given value, put it in another column in Python

Problem description

Say I have a DataFrame like this:

        full_path                     
0   C:\Users\User\Desktop\Test1\1.txt 
1   C:\Users\User\Desktop\ABC\1.txt 
2   C:\Users\User\Desktop\Test2\1.txt 
3   C:\Users\User\Desktop\Test1\1.txt 
4   C:\Users\User\Desktop\ABCD\1.txt 
5   C:\Users\User\Desktop\Test2\1.txt 

I want to check whether the 5th element of the path is equal to Test1 or Test2, and create a column like the one below:

        full_path                             folder 
0   C:\Users\User\Desktop\Test1\1.txt         Test1
1   C:\Users\User\Desktop\ABC\1.txt          
2   C:\Users\User\Desktop\Test2\1.txt         Test2
3   C:\Users\User\Desktop\Test1\1.txt         Test1
4   C:\Users\User\Desktop\ABCD\1.txt          
5   C:\Users\User\Desktop\Test2\1.txt         Test2

I tried the command df['folder']=df["full_path"].str.rsplit("\\").str[4], but it gave me this output:

        full_path                             folder 
0   C:\Users\User\Desktop\Test1\1.txt         Test1
1   C:\Users\User\Desktop\ABC\1.txt           ABC
2   C:\Users\User\Desktop\Test2\1.txt         Test2
3   C:\Users\User\Desktop\Test1\1.txt         Test1
4   C:\Users\User\Desktop\ABCD\1.txt          ABCD 
5   C:\Users\User\Desktop\Test2\1.txt         Test2

I don't want folders other than Test1 and Test2 to appear in the folder column.

Tags: python, python-3.x, pandas, list, dataframe

Solution


You can use NumPy's np.where:

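import numpy as np

# The 5th path component (index 4) is the folder name.
folder = df['full_path'].str.split('\\').str[4]

# Keep it only when it is exactly Test1 or Test2; otherwise use NaN.
# (A plain str.contains('Test') would also match 'Test' anywhere else in the path.)
df['folder'] = np.where(folder.isin(['Test1', 'Test2']), folder, np.nan)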

Output:

                            full_path    folder
0   C:\Users\User\Desktop\Test1\1.txt     Test1
1     C:\Users\User\Desktop\ABC\1.txt       NaN
2   C:\Users\User\Desktop\Test2\1.txt     Test2
3   C:\Users\User\Desktop\Test1\1.txt     Test1
4    C:\Users\User\Desktop\ABCD\1.txt       NaN
5   C:\Users\User\Desktop\Test2\1.txt     Test2
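
The per-row logic can also be sketched in plain Python with the standard library's pathlib, which may help when checking the rule outside pandas. This is a minimal sketch, not part of the original answer: the helper name folder_if_allowed, the ALLOWED set, and the default index of 4 are illustrative assumptions based on the sample paths in the question.

```python
from pathlib import PureWindowsPath

# Folder names to keep, taken from the question (assumed to be exact matches).
ALLOWED = {"Test1", "Test2"}


def folder_if_allowed(path, index=4, allowed=ALLOWED):
    """Return the path component at `index` if it is in `allowed`, else None."""
    # For 'C:\\Users\\User\\Desktop\\Test1\\1.txt' the parts are
    # ('C:\\', 'Users', 'User', 'Desktop', 'Test1', '1.txt').
    parts = PureWindowsPath(path).parts
    if len(parts) > index and parts[index] in allowed:
        return parts[index]
    return None


paths = [
    r"C:\Users\User\Desktop\Test1\1.txt",
    r"C:\Users\User\Desktop\ABC\1.txt",
    r"C:\Users\User\Desktop\Test2\1.txt",
]
folders = [folder_if_allowed(p) for p in paths]
print(folders)  # ['Test1', None, 'Test2']
```

With pandas, a helper like this could be applied as df['folder'] = df['full_path'].map(folder_if_allowed), which leaves None rather than NaN in the rows that do not match. The vectorized np.where version is likely to be faster on large frames.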
