Summing and subtracting 2 numbers in 1 column in Pandas

Problem description

How can I add and subtract the two numbers in each cell of a single column?

  bedrooms
0 1 + 1
1 2 - 1

If I use this code:

df['bedrooms'] = pd.eval(df['bedrooms'])

I get this error message:

Traceback (most recent call last):

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-11-d8172a031240>", line 1, in <module>
    df['bedrooms'] = pd.eval(df['bedrooms'])

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/eval.py", line 322, in eval
    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env, truediv=truediv)

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 830, in __init__
    self.terms = self.parse()

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 847, in parse
    return self._visitor.visit(self.expr)

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 437, in visit
    raise e

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/computation/expr.py", line 431, in visit
    node = ast.fix_missing_locations(ast.parse(clean))

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    [4 ,4 +1 ,2 ,0 ,5 ,4 +1 ,2 -3 ,5 ,3 ,7 ,4 ,3 ,3 ,5 +1 ,4 ,2 ,1 +1 ,5 +1 ,3 ,5 ,4 +1 ,3 ,2 ,4 ,4 ,4 ,3 +1 ,1 -2 ,2 ,1 ,6 ,4 ,1 ,6 +1 ,3 -4 ,6 +1 ,2 +1 ,3 ,0 -4 ,2 +2 ,3 +1 ,4 +1 ,6 ,4 ,3 ,3 +1 ,4 ,4 +1 ,3 +1 ,4 ,4 +1 ,1 -3 ,3 ,3 ,3 -4 ,3 ,3 ,2 ,5 ,4 +1 ,3 ,4 ,3 -5 ,4 +1 ,4 +1 ,1 ,4 ,4 ,4 ,4 ,4 +1 ,4 +1 ,4 ,4 ,6 +,1 -5 ,5 ,5 ,4 -5 ,6 +1 ,3 ,4 ,3 ,5 +1 ,6 ,5 +1 ,5 ,5 +1 ,5 +1 ,4 ,4 +1 ,3 ,3 ,4 ,3 +1 ,5 +1 ,4 ,4 +1 ,4 ,3 -5 ,...]
                                                                                                                                                                                                                                                                                                                             ^
SyntaxError: invalid syntax

I just found out that these are the values that can't be parsed:

74      6+
441     7+
459     4+
518     5+
558     5+
610     3+
990     5+
1585    7+
Name: bedrooms, dtype: object

Tags: python, pandas

Solution


I believe you need pandas.eval:

df['new'] = pd.eval(df['bedrooms'])
print (df)
  bedrooms  new
0    1 + 1    2
1    2 - 1    1

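Edit: The problem in the data is values such as 6 +. One possible way to parse such a value as 6 is to strip the trailing operator with Series.str.rstrip: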

df = pd.DataFrame({'bedrooms': "4 ,4 +,5 +1, 5+, 6+ ".split(',') * 200})

df['bedrooms'] = pd.eval(df['bedrooms'].str.rstrip('+- '))

Or:

df['bedrooms'] = df['bedrooms'].str.rstrip('+- ').apply(pd.eval)
print (df)
     bedrooms
0           4
1           4
2           6
3           5
4           6
..        ...
995         4
996         4
997         6
998         5
999         6

[1000 rows x 1 columns]
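
If you would rather not evaluate strings at all, the same totals can be computed by splitting each cell into signed integers and summing them. This is only a sketch of an alternative, not part of the pd.eval approach above; the sample data, the total column, and the helper sum_signed_tokens are made up for illustration, and it assumes every cell contains only integers joined by + or -.

```python
import pandas as pd

# Hypothetical sample data, including a value with a dangling operator
df = pd.DataFrame({'bedrooms': ['1 + 1', '2 - 1', '6 +', '4', '3 -4']})

def sum_signed_tokens(tokens):
    # int() accepts a leading sign, so '+1' -> 1 and '-4' -> -4
    return sum(int(t) for t in tokens)

tokens = (
    df['bedrooms']
    .str.replace(r'\s+', '', regex=True)  # '1 + 1' -> '1+1'
    .str.findall(r'[+-]?\d+')             # '1+1' -> ['1', '+1']; '6+' -> ['6']
)
df['total'] = tokens.apply(sum_signed_tokens)
print(df)
```

A trailing operator with no number after it, as in 6 +, simply produces no token, so no separate cleanup step is needed here.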

Edit 1:

You can find the problematic values:

import numpy as np

def f(x):
    # Return NaN for any string that pd.eval cannot parse
    try:
        return pd.eval(x)
    except Exception:
        return np.nan

df['bedrooms1'] = df['bedrooms'].apply(f)

a = df.loc[df['bedrooms1'].isna(), 'bedrooms']
print (a)
74    6 +
Name: bedrooms, dtype: object
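
The malformed values can also be located without evaluating anything, by checking each string against a pattern for a well-formed expression. The sketch below assumes the column holds only non-negative integers combined with + and -; the sample data and the names pattern, is_valid, and bad are illustrative.

```python
import pandas as pd

# Hypothetical sample data; '6 +' has an operator with no right-hand operand
df = pd.DataFrame({'bedrooms': ['4', '4 +1', '6 +', '1 -5']})

# One integer, followed by zero or more "<operator> <integer>" pairs
pattern = r'^\s*\d+(?:\s*[+-]\s*\d+)*\s*$'

is_valid = df['bedrooms'].str.match(pattern)
bad = df.loc[~is_valid, 'bedrooms']
print(bad)
```

Unlike the try/except version, this flags only strings that break the expected shape, so it will not hide unrelated errors raised during evaluation.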
