StackOverflowError for input arrays of a certain size passed to my method?

Problem description

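I want to recursively split a dataset ds1 on one of its attributes, attVal (...), using the min and max of attVal in other datasets ds2, ds3, ..., that is min(df2), max(df2), min(df3), max(df3), like this: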

Split by min(df2) and max(df2):

min(df1)-----------------------------------------------max(df1)
               min(df2)----------------max(df2)

Or split by max(df2):

                     min(df1)----------------------------------max(df1)
min(df2)------------------------------------max(df2)

Or split by min(df2):

min(df1)------------------------------------max(df1)
                   min(df2)----------------------------------max(df2)

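The next split takes the parts produced by the previous step and splits them by min(df3) and max(df3) (and so on for later steps with ds4...). All resulting parts are then returned. For this I wrote the following Java method: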

public static ArrayList<Dataset<Row>> mySplittingMethod(Dataset<Row> df1, ArrayList<Dataset<Row>> allSplittingDf) {
    ArrayList<Dataset<Row>> partsOfSplits = null;

    long df1Max = df1.select(max(df1.col("attVal"))).first().getLong(0);
    long df1Min = df1.select(min(df1.col("attVal"))).first().getLong(0);
    for (Dataset<Row> currentDf : allSplittingDf) {
        df2 = currentDf;
        long df2Max = df2.select(max(df2.col("attVal"))).first().getLong(0);
        long df2Min = df2.select(min(df2.col("attVal"))).first().getLong(0);
        if (df1Min < df2Min && df2Min < df1Max && df1Max < df2Max) {
            Dataset<Row> firstDf = df1.where("attVal<= df2Min");
            mySplittingMethod(firstDf, allSplittingDf);
            Dataset<Row> secondDf = df1.where("attVal> df2Min");
            mySplittingMethod(secondDf, allSplittingDf);
            partsOfSplits.add(firstDf);
            partsOfSplits.add(secondDf);
        } else if (df1Min > df2Min && df1Min < df2Max && df2Max < df1Max) {
            Dataset<Row> firstDf = df1.where("attVal<= df2Max");
            mySplittingMethod(firstDf, allSplittingDf);
            Dataset<Row> secondDf = df1.where("attVal> df2Max");
            mySplittingMethod(secondDf, allSplittingDf);
            partsOfSplits.add(firstDf);
            partsOfSplits.add(secondDf);
        } else if (df1Min > df2Min && df2Max < df1Max) {
            Dataset<Row> firstDf = df1.where("attVal<= df2Min");
            mySplittingMethod(firstDf, allSplittingDf);
            Dataset<Row> secondDf = df1.where("attVal>= df2Min && attVal<= df2Max");
            mySplittingMethod(secondDf, allSplittingDf);
            Dataset<Row> thirdDf = df1.where("attVal> df2Max");
            mySplittingMethod(thirdDf, allSplittingDf);
            partsOfSplits.add(firstDf);
            partsOfSplits.add(secondDf);
            partsOfSplits.add(thirdDf);
        } else continue;
    }
    return partsOfSplits;
}

My question: for a larger input argument allSplittingDf, a StackOverflowError is thrown. Why? Is something algorithmically wrong with my recursive calls?

Tags: java, arrays, algorithm, apache-spark, recursion

Solution
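
Reading the method as posted, one case does not shrink its input: when df1 lies entirely above df2's range (df1Min > df2Max), the third branch still matches, because it only checks df1Min > df2Min and df2Max < df1Max. The part selected by attVal > df2Max is then all of df1 again, so, assuming the calls on the two empty parts return, the recursive call on it receives the same data and can repeat without end, which may be what exhausts the stack. Below is a minimal sketch of a non-recursive alternative, assuming the goal is only to cut ds1 at every min and max of the other datasets. It uses plain lists of long values in place of Dataset<Row>, and the names RangeSplitSketch and split are hypothetical. Each cut value closes a part, so a part covers (previous cut, cut], which differs slightly from the posted three-way split where df2Min falls into two parts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

// Plain-Java stand-in for the Spark version (assumption):
// a "dataset" here is just a list of attVal values.
class RangeSplitSketch {

    // Splits `values` at the min and max of every list in `splitters`.
    // Each cut c closes a part, so parts cover (previousCut, c].
    // Empty parts are skipped, and there is no recursion.
    static List<List<Long>> split(List<Long> values, List<List<Long>> splitters) {
        // Collect all cut points, sorted and de-duplicated.
        TreeSet<Long> cuts = new TreeSet<>();
        for (List<Long> s : splitters) {
            if (s.isEmpty()) {
                continue; // an empty splitter contributes no cuts
            }
            cuts.add(Collections.min(s));
            cuts.add(Collections.max(s));
        }

        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);

        List<List<Long>> parts = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        Iterator<Long> it = cuts.iterator();
        Long cut = it.hasNext() ? it.next() : null;

        for (long v : sorted) {
            // Move past every cut that lies below v, closing the open part once.
            while (cut != null && v > cut) {
                if (!current.isEmpty()) {
                    parts.add(current);
                    current = new ArrayList<>();
                }
                cut = it.hasNext() ? it.next() : null;
            }
            current.add(v);
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }
}
```

With the values 1 to 10 and a single splitting list [4, 5, 6], the cuts are 4 and 6 and the parts are [1, 2, 3, 4], [5, 6] and [7, 8, 9, 10]. A splitting list that lies entirely below the data, such as [1, 2] against [5, 6, 7], leaves the data as one part instead of triggering another round of splitting.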
