Try/except not catching an error raised in a function?

Problem description

I'm getting this (legitimate) error while preprocessing some data:

 9:46:56.323 PM default_model Function execution took 6008 ms, finished with status: 'crash'
 9:46:56.322 PM default_model Traceback (most recent call last):
  File "/user_code/main.py", line 31, in default_model
    train, endog, exog, _, _, rawDf = preprocess(ledger, apps)
  File "/user_code/Wrangling.py", line 73, in preprocess
    raise InsufficientTimespanError(args=(appDf, locDf))

It happens here:

async def default_model(request):
    request_json = request.get_json()
    if not request_json:
        return '{"error": "empty body." }'
    if 'transaction_id' in request_json:
        transaction_id = request_json['transaction_id']

        apps = []  # array of apps whose predictions we want, or empty for all
        if 'apps' in request_json:
            apps = request_json['apps']

        modelUrl = None
        if 'files' in request_json:
            try:
                files = request_json['files']
                modelUrl = getModelFromFiles(files)
            except:
                return package(transaction_id, error="no model to execute")
        else:
            return package(transaction_id, error="no model to execute")

        if 'ledger' in request_json:
            ledger = request_json['ledger']

            try:
                train, endog, exog, _, _, rawDf = preprocess(ledger, apps)
            # ...
            except InsufficientTimespanError as err:
                return package(transaction_id, error=err.message, appDf=err.args[0], locDf=err.args[1])

And preprocess correctly raises my custom error:

def preprocess(ledger, apps=[]):
    """
    convert ledger from the server, which comes in as an array of csv entries.
    normalize/resample timeseries, returning dataframes
    """
    appDf, locDf = splitLedger(ledger)

    if len(appDf) < 3 or len(locDf) < 3:
        raise InsufficientDataError(args=(appDf, locDf))

    endog = appDf['app_id'].unique().tolist()
    exog = locDf['location_id'].unique().tolist()

    rawDf = normalize(appDf, locDf)
    trainDf = cutoff(rawDf.copy(), apps)
    rawDf = cutoff(rawDf.copy(), apps, trim=False)

    # TODO - uncomment when on realish data
    if len(trainDf) < 2 * WEEKS:
        raise InsufficientTimespanError(args=(appDf, locDf))

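The problem is that the call sits in a try/except block precisely because I want to catch the error and return a payload containing it, rather than crash with a 500 error. But it crashes on my custom error anyway, inside the try block, right on the line that calls preprocess.

I must be failing to write correct Python here, but I'm not sure what I'm doing wrong. The environment is Python 3.7.

Here is where the errors are defined in Wrangling.py: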

class WranglingError(Exception):
    """Base class for other exceptions"""
    pass


class InsufficientDataError(WranglingError):
    """insufficient data to make a prediction"""

    def __init__(self, message='insufficient data to make a prediction', args=None):
        super().__init__(message)
        self.message = message
        self.args = args


class InsufficientTimespanError(WranglingError):
    """insufficient timespan to make a prediction"""

    def __init__(self, message='insufficient timespan to make a prediction', args=None):
        super().__init__(message)
        self.message = message
        self.args = args

Here is how main.py imports them:

from Wrangling import preprocess, InsufficientDataError, InsufficientTimespanError, DataNotNormal, InappropriateValueToPredict

Tags: python, google-cloud-functions

Solution


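Your preprocess function is declared async. This means the code inside it does not actually run at the point where you call preprocess. Calling it only creates a coroutine object, and the body runs later, when that coroutine is awaited or handed to an event-loop runner (such as asyncio.run). Because the place where it runs is no longer inside the try block in default_model, the exception is not caught.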
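The behaviour described above can be reproduced in isolation. This is a minimal sketch, not the asker's code: preprocess_async and call_without_await are hypothetical stand-ins, and the exception class is a simplified copy of the one in the question.

```python
import asyncio


class InsufficientTimespanError(Exception):
    """Simplified stand-in for the custom error from the question."""


async def preprocess_async():
    # Hypothetical async variant of preprocess: the body only runs
    # once the coroutine is awaited or driven by an event loop.
    raise InsufficientTimespanError("insufficient timespan to make a prediction")


def call_without_await():
    """Call the coroutine function without awaiting it."""
    caught = False
    try:
        coro = preprocess_async()  # only creates a coroutine object
    except InsufficientTimespanError:
        caught = True  # not reached: nothing has executed yet
    return caught, coro


caught, coro = call_without_await()
print(caught)                     # whether the except branch ran
print(asyncio.iscoroutine(coro))  # whether we got a coroutine back

# The exception surfaces only when the coroutine is finally run,
# which happens outside the original try block.
raised_later = False
try:
    asyncio.run(coro)
except InsufficientTimespanError:
    raised_later = True
print(raised_later)
```

The try block in call_without_await exits normally, so any error raised when the coroutine eventually runs has to be handled wherever that happens.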

You can fix this in either of two ways:

  • Make preprocess not async.
  • Make default_model async as well, and then await preprocess.
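The two options above might look roughly like the sketch below. All names here (preprocess_sync, preprocess_async, handler_sync, handler_async) are hypothetical, the error class is a simplified copy of the one in the question, and the returned dict stands in for the question's package() helper. asyncio.run is used only to drive the async handler locally; whether a given hosting runtime will await an async handler is a separate assumption to check.

```python
import asyncio


class InsufficientTimespanError(Exception):
    """Simplified stand-in for the custom error from the question."""


def preprocess_sync():
    # Option 1: a plain function raises right at the call site.
    raise InsufficientTimespanError("insufficient timespan")


async def preprocess_async():
    # Option 2: an async function raises at the point where it is awaited.
    raise InsufficientTimespanError("insufficient timespan")


def handler_sync():
    try:
        preprocess_sync()
    except InsufficientTimespanError as err:
        return {"error": str(err)}
    return {"error": None}


async def handler_async():
    try:
        # The await keeps the coroutine's execution inside this try block.
        await preprocess_async()
    except InsufficientTimespanError as err:
        return {"error": str(err)}
    return {"error": None}


result_sync = handler_sync()
result_async = asyncio.run(handler_async())
print(result_sync)
print(result_async)
```

In both variants the exception is raised while control is still inside the try block, so the except clause can turn it into an error payload instead of letting it propagate.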
