NameError: DataFrame not defined when importing a class

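Problem description

I recently learned about the np.select operation and decided to create a class to experiment with it and to learn more about OOP.

Here is the definition of my class, followed by an example (the class uses the translate function defined at the top):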

def translate(text, conversion_dict, before = None):
    if not text: return text
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

class Conditions:
    def __init__(self, base_conditions, composed_conditions, groups, default_group):
        self.base_conditions = base_conditions
        self.composed_conditions = composed_conditions
        self.groups = groups
        self.default_group = default_group
        self.readable_conditions = [translate(c, self.base_conditions) for c in self.composed_conditions]
        self.ok_conditions = []  

    def run_condition(self, condition, df_name):
        return eval(condition.replace("(","("+str(df_name)+"."))

    def run_conditions(self, df_name):
        return [self.run_condition(c, df_name) for c in  self.readable_conditions]

Example

First, we create a simple DataFrame to play with:

import pandas as pd
import numpy as np

example = {"lev1" : [-1, -1, -1, 1, 0 , -1, 0 , 3],
           "lev2" : [-1, 0 , 1 , 5 , 0 , 7 , 8 , 6]}

ex_df = pd.DataFrame.from_dict(example)
print(ex_df)

   lev1  lev2
0    -1    -1
1    -1     0
2    -1     1
3     1     5
4     0     0
5    -1     7
6     0     8
7     3     6

Next, we create a new instance of the class, passing in the conditions and groups:

mycond = Conditions({"(m1)" : "(lev1 < 0)",
                     "(m2)" : "(lev2 > 2)", 
                     "(m3)" : "(lev1 == 0)"},
                    ["(m1)", "(m2) & (m3)", "(m2)"],
                    ['A', 'B', 'C'],
                    999)

Finally, we use the np.select operation on the DataFrame ex_df and print the result:

ex_df['MATCH'] = np.select(condlist = mycond.run_conditions("ex_df"), 
                           choicelist = mycond.groups, 
                           default = mycond.default_group) 
print(ex_df)

   lev1  lev2 MATCH
0    -1    -1     A
1    -1     0     A
2    -1     1     A
3     1     5     C
4     0     0   999
5    -1     7     A
6     0     8     B
7     3     6     C

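As you can see, everything works well, with one exception.

When I try to import my class from a separate file (conditions.py, which also contains the translate function), it no longer works. Here is how my folders and files are organized: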

├── classes
│   ├── __init__.py
│   └── conditions.py
└── test-notebook.ipynb

In my test-notebook.ipynb, I import my class in the usual way (this works):

from classes.conditions import *

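Then, after creating my DataFrame, I create a new instance of my class (this also works). Finally, running the np.select operation raises the following error: NameError: name 'ex_df' is not defined.

I don't know why this raises an error or how to fix it. I'm looking for an answer that explains both the why and the how. In case it is needed, here is the traceback: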

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-1d4b3ba4a3c0> in <module>
----> 1 ex_df['MATCH'] = np.select(condlist = mycond.run_conditions("ex_df"), 
      2                            choicelist = mycond.groups,
      3                            default = mycond.default_group) 
      4 print(ex_df)

~/Projects/test/notebooks/classes/conditions.py in run_conditions(self, df_name)
     20 
     21     def run_conditions(self, df_name):
---> 22         return [self.run_condition(c, df_name) for c in  self.readable_conditions]

~/Projects/test/notebooks/classes/conditions.py in <listcomp>(.0)
     20 
     21     def run_conditions(self, df_name):
---> 22         return [self.run_condition(c, df_name) for c in  self.readable_conditions]

~/Projects/test/notebooks/classes/conditions.py in run_condition(self, condition, df_name)
     17 
     18     def run_condition(self, condition, df_name):
---> 19         return eval(condition.replace("(","("+str(df_name)+"."))
     20 
     21     def run_conditions(self, df_name):

~/Projects/test/notebooks/classes/conditions.py in <module>

NameError: name 'ex_df' is not defined

Tags: python, class

Solution

I think this will solve the problem.

First file: Stackoverflow2.py

import pandas as pd
import numpy as np


def translate(text, conversion_dict, before = None):
    if not text: return text
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

class Conditions:
    def __init__(self, base_conditions, composed_conditions, groups, default_group):
        self.base_conditions = base_conditions
        self.composed_conditions = composed_conditions
        self.groups = groups
        self.default_group = default_group
        self.readable_conditions = [translate(c, self.base_conditions) for c in self.composed_conditions]
        self.ok_conditions = []  

    def run_condition(self, condition, df_name):
        return eval(condition.replace("(","("+str(df_name)+"."))

    def run_conditions(self, df_name):
        return [self.run_condition(c, df_name) for c in  self.readable_conditions]

class DataFrame(Conditions):

    def __init__(self):
        pass
    def makeDataFrame(self):

        example = {"lev1" : [-1, -1, -1, 1, 0 , -1, 0 , 3],
        "lev2" : [-1, 0 , 1 , 5 , 0 , 7 , 8 , 6]}

        ex_df = pd.DataFrame.from_dict(example)

        return ex_df



obj=DataFrame()

print(obj.makeDataFrame())

# mycond = Conditions({"(m1)" : "(lev1 < 0)",
#                      "(m2)" : "(lev2 > 2)", 
#                      "(m3)" : "(lev1 == 0)"},
#                     ["(m1)", "(m2) & (m3)", "(m2)"],
#                     ['A', 'B', 'C'],
#                     999)

# ex_df['MATCH'] = np.select(condlist = mycond.run_conditions("ex_df"), 
#                            choicelist = mycond.groups, 
#                            default = mycond.default_group) 
# print(ex_df)

Second file: Stackoverflow3.py

from Stackoverflow2 import *

print(obj.makeDataFrame())

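The problem is that global variables in Python do not work the way they do in C or C++. A global name is visible only inside the module that defines it, so the eval call in conditions.py looks for ex_df in that module's namespace rather than in the notebook's. So make them instance variables instead.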
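To make the lookup rule concrete, here is a minimal sketch that reproduces the NameError without any files or pandas. It builds a stand-in module in memory; the module name fake_conditions and the function run are hypothetical and only play the role of conditions.py and run_condition.

```python
import types

# Stand-in for conditions.py: a module object with its own global namespace.
# (The names "fake_conditions" and "run" are hypothetical, for illustration.)
fake_conditions = types.ModuleType("fake_conditions")
exec(
    "def run(expr):\n"
    "    # eval() with no explicit namespaces uses this module's globals\n"
    "    return eval(expr)\n",
    fake_conditions.__dict__,
)

# This name lives in the caller's namespace, like ex_df in the notebook.
ex_df = "defined in the caller's namespace"

try:
    fake_conditions.run("ex_df")
    lookup_failed = False
except NameError as err:
    lookup_failed = True
    print(err)

# Once the name exists in the module's own globals, the same call succeeds.
fake_conditions.ex_df = ex_df
found = fake_conditions.run("ex_df")
```

The first call fails because ex_df exists only in the caller's globals, which is the situation in the notebook. Pushing the name into the other module, as the last lines do, is shown only to demonstrate the mechanism and is not a recommended fix.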

For more information, see:

https://stackoverflow.com/questions/15959534/visibility-of-global-variables-in-imported-modules
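One possible way to avoid the global lookup altogether, sketched here as an assumption rather than as the answer's own method, is to pass the DataFrame object to run_conditions instead of its variable name and let pandas resolve the column names with DataFrame.eval. The translate function below is a trimmed version of the one in the question, and the default group is written as the string "999" because some recent NumPy versions reject mixing string choices with an integer default in np.select.

```python
import numpy as np
import pandas as pd


def translate(text, conversion_dict):
    # Replace each short label such as "(m1)" with its full condition.
    for key, value in conversion_dict.items():
        text = text.replace(key, value)
    return text


class Conditions:
    def __init__(self, base_conditions, composed_conditions, groups, default_group):
        self.groups = groups
        self.default_group = default_group
        self.readable_conditions = [
            translate(c, base_conditions) for c in composed_conditions
        ]

    def run_conditions(self, df):
        # Assumption: receive the DataFrame itself, not its variable name.
        # DataFrame.eval resolves names such as lev1 against df's columns,
        # so nothing is looked up in any module's globals.
        return [df.eval(c) for c in self.readable_conditions]


ex_df = pd.DataFrame({"lev1": [-1, -1, -1, 1, 0, -1, 0, 3],
                      "lev2": [-1, 0, 1, 5, 0, 7, 8, 6]})

mycond = Conditions({"(m1)": "(lev1 < 0)",
                     "(m2)": "(lev2 > 2)",
                     "(m3)": "(lev1 == 0)"},
                    ["(m1)", "(m2) & (m3)", "(m2)"],
                    ["A", "B", "C"],
                    "999")

ex_df["MATCH"] = np.select(condlist=mycond.run_conditions(ex_df),
                           choicelist=mycond.groups,
                           default=mycond.default_group)
print(ex_df)
```

Because the class no longer depends on a name existing in a particular namespace, it behaves the same whether it is defined in the notebook or imported from classes/conditions.py.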
