Extract elements from a pandas Series of lists and store them as a separate Series

Problem description

I have this df (including the desired result for the sample):

import pandas as pd

dfn = pd.DataFrame({"country_code": ["USA, UK, FRA", "RUS, ZHC, JAP", "IN, BRA, ES"],
                    "all_but_american_desired": [["United Kingdom", "France"], ["Russia", "China", "Japan"], ["India", "Spain"]]})

So far, I "translate" the code strings into their new meanings and store them as a list of elements:

masked = {"USA":"United States", "UK":"United Kingdom", "FRA":"France", 
          "RUS":"Russia", "ZHC":"China", "JAP":"Japan", 
          "IN":"India", "BRA":"Brazil", "ES":"Spain"}

dfn["country_name"] = dfn["country_code"].apply(lambda x: [", ".join({masked[i] for i in x.split(", ")})])

Then I want to filter the translated country_name series against an external list, american, and put the remaining names in a separate series (all_but_american):

american = ["United States", "Brazil"]

The result should be the same as the all_but_american_desired series. So far I have tried:

dfn["all_but_american1"] = dfn["country_name"].apply(lambda x: [i for i in x if i not in american])

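I have used exactly the same approach as attempt 1 before and it worked, but this time it has no effect and I can't find the reason. (I also tried other approaches this time, but since I'm not familiar with them I'll avoid posting them.) Could someone check this and, if possible, explain what I did wrong?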

Tags: python, pandas, list, extract, expand

Solution


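For country_name, build a list with one element per country instead of a one-element list holding the joined values. In the question, ", ".join(...) merges the names into a single string and the outer brackets wrap that string in a list, so each row holds one string (in arbitrary order, because a set is used). That string never equals an entry in american, so the filter removes nothing. Create the list directly: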

dfn["country_name"] = dfn["country_code"].apply(lambda x: [masked[i] for i in x.split(", ")])
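
To make the difference concrete, here is a minimal sketch that applies both constructions to a single row in plain Python. The names row, joined, separate, filtered_joined and filtered_separate are illustrative and not part of the original post.

```python
masked = {"USA": "United States", "UK": "United Kingdom", "FRA": "France"}
american = ["United States", "Brazil"]
row = "USA, UK, FRA"

# Construction from the question: a one-element list holding one joined string
joined = [", ".join({masked[i] for i in row.split(", ")})]
# Corrected construction: one list element per country
separate = [masked[i] for i in row.split(", ")]

print(len(joined))    # 1
print(len(separate))  # 3

# The same filter applied to both versions
filtered_joined = [i for i in joined if i not in american]
filtered_separate = [i for i in separate if i not in american]

print(filtered_joined == joined)  # True: nothing was removed
print(filtered_separate)          # ['United Kingdom', 'France']
```

The filter compares whole list elements, so it can only remove "United States" when that name is an element of its own.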

Your filter from the question then works as expected:

american = ["United States", "Brazil"]

dfn["all_but_american1"] = dfn["country_name"].apply(lambda x: [i for i in x if i not in american])
print(dfn)
    country_code  all_but_american_desired  \
0   USA, UK, FRA  [United Kingdom, France]   
1  RUS, ZHC, JAP    [Russia, China, Japan]   
2    IN, BRA, ES            [India, Spain]   

                              country_name         all_but_american1  
0  [United States, United Kingdom, France]  [United Kingdom, France]  
1                   [Russia, China, Japan]    [Russia, China, Japan]  
2                   [India, Brazil, Spain]            [India, Spain]  
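
As a quick end-to-end check, the following sketch rebuilds the sample frame, applies the corrected steps, and compares the result with the desired column. It assumes pandas is installed; the variable matches is an illustrative name that does not appear in the original post.

```python
import pandas as pd

masked = {"USA": "United States", "UK": "United Kingdom", "FRA": "France",
          "RUS": "Russia", "ZHC": "China", "JAP": "Japan",
          "IN": "India", "BRA": "Brazil", "ES": "Spain"}
american = ["United States", "Brazil"]

dfn = pd.DataFrame({
    "country_code": ["USA, UK, FRA", "RUS, ZHC, JAP", "IN, BRA, ES"],
    "all_but_american_desired": [["United Kingdom", "France"],
                                 ["Russia", "China", "Japan"],
                                 ["India", "Spain"]],
})

# One list element per country code
dfn["country_name"] = dfn["country_code"].apply(
    lambda x: [masked[i] for i in x.split(", ")])
# Drop every name that appears in american
dfn["all_but_american1"] = dfn["country_name"].apply(
    lambda x: [i for i in x if i not in american])

# Compare the two columns as plain Python lists of lists
matches = dfn["all_but_american1"].tolist() == dfn["all_but_american_desired"].tolist()
print(matches)  # True
```

Converting both columns with tolist() sidesteps element-wise comparison of list-valued cells, which is awkward to express directly on a Series.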
