pandas.isin broken for all-uppercase values?

Question

I found pandas' isin function, but it looks like values written in all capitals are not showing up?

import pandas as pd
df = pd.read_json('{"Technology Group":{"0":"Cloud","1":"Cloud","2":"Cloud","3":"Collaboration","4":"Collaboration","5":"Collaboration","6":"Collaboration","7":"Collaboration","8":"Collaboration","9":"Core", "10": "Software"},"Technology":{"0":"AMP","1":"EWS","2":"Webex","3":"Telepresence","4":"Call Manager","5":"Contact Center","6":"MS Voice","7":"Apps","8":"PRIME  ","9":"Wirelees", "10": "Prime Infrastructure"}}')

+------------------+----------------------+
| Technology Group | Technology           |
+------------------+----------------------+
| Cloud            | AMP                  |
+------------------+----------------------+
| Cloud            | EWS                  |
+------------------+----------------------+
| Cloud            | Webex                |
+------------------+----------------------+
| Collaboration    | Telepresence         |
+------------------+----------------------+
| Collaboration    | Call Manager         |
+------------------+----------------------+
| Collaboration    | Contact Center       |
+------------------+----------------------+
| Collaboration    | MS Voice             |
+------------------+----------------------+
| Collaboration    | Apps                 |
+------------------+----------------------+
| Collaboration    | PRIME                |
+------------------+----------------------+
| Core             | Wirelees             |
+------------------+----------------------+
| Software         | Prime Infrastructure |
+------------------+----------------------+

tech_input2 = ['AMP', 'Call Manager', 'PRIME']
df = df[df['Technology'].isin(tech_input2)]

It displays the following table:

+------------------+--------------+
| Technology Group | Technology   |
+------------------+--------------+
| Cloud            | AMP          |
+------------------+--------------+
| Collaboration    | Call Manager |
+------------------+--------------+

... instead of:

+------------------+--------------+
| Technology Group | Technology   |
+------------------+--------------+
| Cloud            | AMP          |
+------------------+--------------+
| Collaboration    | Call Manager |
+------------------+--------------+
| Collaboration    | PRIME        |
+------------------+--------------+

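Is this a bug, or am I doing something wrong? Technically, the missing row is not a duplicate of the last row in the original table, but I am not sure how else to interpret this. It seems to behave more like contains than isin ...

Tags: python, pandas, dataframe

Solution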


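This is probably caused by whitespace: the value "PRIME  " in the data has trailing spaces, so it is not equal to 'PRIME', and isin only matches exact values. strip() removes characters from both the left and right ends of a string based on its argument (a string specifying the set of characters to remove); with no argument it removes whitespace.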
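
As a small illustration of that strip() behavior, both in plain Python and through the pandas .str accessor, here is a minimal sketch. The sample values are made up for the demonstration and are not taken from the asker's frame.

```python
import pandas as pd

# With no argument, strip() removes leading and trailing whitespace.
assert "PRIME  ".strip() == "PRIME"

# With an argument, it removes any of the listed characters from both ends.
assert "--PRIME--".strip("-") == "PRIME"

# Whitespace inside the string is left alone.
assert " Call Manager ".strip() == "Call Manager"

# The pandas .str accessor applies the same rule element-wise.
s = pd.Series(["PRIME  ", " AMP", "Call Manager"])
stripped = s.str.strip()
print(stripped.tolist())
```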

import io
import pandas as pd

# Valid JSON: key "5" restored, and "PRIME  " (two trailing spaces) kept on one line.
# StringIO is used because newer pandas versions deprecate passing a literal JSON string.
data = '''{"Technology Group": {"0":"Cloud","1":"Cloud",
"2":"Cloud","3":"Collaboration","4":"Collaboration","5":"Collaboration",
"6":"Collaboration","7":"Collaboration","8":"Collaboration","9":"Core",
"10":"Software"},"Technology":{"0":"AMP","1":"EWS","2":"Webex","3":"Telepresence",
"4":"Call Manager","5":"Contact Center","6":"MS Voice","7":"Apps","8":"PRIME  ",
"9":"Wirelees","10":"Prime Infrastructure"}}'''
df = pd.read_json(io.StringIO(data))

df['Technology'] = df['Technology'].str.strip()
tech_input2 = ['AMP', 'Call Manager', 'PRIME']
df = df[df['Technology'].isin(tech_input2)]
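
To check whether whitespace really is the culprit before modifying the column, one option is to print the raw values with repr() and compare the isin mask before and after stripping. This is only a sketch on a reduced, hypothetical subset of the data above; the names raw_mask and clean_mask are introduced here for illustration.

```python
import pandas as pd

# Reduced sample; 'PRIME  ' carries two trailing spaces, as in the question.
df = pd.DataFrame({
    "Technology Group": ["Cloud", "Collaboration", "Collaboration", "Software"],
    "Technology": ["AMP", "Call Manager", "PRIME  ", "Prime Infrastructure"],
})
tech_input2 = ["AMP", "Call Manager", "PRIME"]

# repr() makes the otherwise invisible trailing spaces visible.
print([repr(v) for v in df["Technology"]])

# isin compares whole values exactly, so 'PRIME  ' does not match 'PRIME'.
raw_mask = df["Technology"].isin(tech_input2)

# Stripping only for the comparison leaves the stored column unchanged.
clean_mask = df["Technology"].str.strip().isin(tech_input2)

print(raw_mask.tolist())
print(clean_mask.tolist())
```

In this sample raw_mask selects two rows and clean_mask selects three. "Prime Infrastructure" is excluded in both cases, which shows that isin matches whole values rather than substrings, unlike str.contains.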
