Resampling data into train, validation, and test sets for bootstrapping in Python

Problem description

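I want to resample my data into train, validation, and test sets for bootstrapping purposes. Any help with splitting it into train, validation, and test sets would be appreciated.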

The error is: File "", line 22, valid = numpy.array([x for x in values if x.tolist() not in train.tolist() ^ IndentationError: unindent does not match any outer indentation level.

The reproducible code is as follows:

import numpy
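import pandas as pd  # needed for pd.DataFrame below
from sklearn.utils import resample
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older release.
from sklearn.datasets import load_boston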
from sklearn.model_selection import train_test_split

# load dataset

boston_dataset = load_boston()
df = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
df['MEDV'] = boston_dataset.target
values = df.values
# bootstrap
n_iterations = 1000
n_size = int(len(df))
# run bootstrap
stats = list()
for i in range(n_iterations):
    # prepare train and test sets
    train = resample(values, n_samples=n_size)
    valid = numpy.array([x for x in values if x.tolist() not in train.tolist()])
    test = numpy.array([x for x in values if x.tolist() not in valid.tolist()])

Tags: python

Solution
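
The "unindent does not match any outer indentation level" message usually means the reported line is indented differently from the lines around it, most often because tabs and spaces are mixed inside the loop body. Re-indenting the loop with four spaces throughout is the usual fix. Separately, note that in the question's code `test` is built from the rows that are not in `valid`, which makes it essentially the set of rows that were drawn for training.

What follows is a minimal sketch of one possible way to do the split, not a definitive answer. It assumes that the rows drawn with replacement form the training set and that the remaining out-of-bag rows are divided between validation and test. The helper name `bootstrap_split`, the `valid_fraction` parameter, and the synthetic stand-in data are all hypothetical; the data replaces the Boston dataset only so the example runs on current scikit-learn releases. Working with row indices also avoids the slow `x.tolist() not in train.tolist()` comparison.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import resample


def bootstrap_split(values, valid_fraction=0.5, seed=None):
    """Return (train, valid, test) arrays for one bootstrap iteration.

    train: rows drawn with replacement, same length as `values`.
    valid, test: the out-of-bag rows, divided by `valid_fraction`.
    This is a hypothetical helper, not part of scikit-learn.
    """
    indices = np.arange(len(values))
    # Draw row indices with replacement; duplicates are expected.
    train_idx = resample(indices, n_samples=len(indices), random_state=seed)
    # Out-of-bag rows: indices that were never drawn for training.
    oob_idx = np.setdiff1d(indices, train_idx)
    # Divide the out-of-bag rows into validation and test parts.
    valid_idx, test_idx = train_test_split(
        oob_idx, train_size=valid_fraction, random_state=seed
    )
    return values[train_idx], values[valid_idx], values[test_idx]


# Assumed stand-in data: 100 rows, 4 columns of random numbers.
rng = np.random.default_rng(0)
values = rng.normal(size=(100, 4))

n_iterations = 3  # kept small here; the question uses 1000
for i in range(n_iterations):
    train, valid, test = bootstrap_split(values, seed=i)
    print(i, train.shape, valid.shape, test.shape)
```

With this layout the three sets never share a row, so a model fitted on `train` can be tuned on `valid` and scored on `test` within each iteration, and the scores can be appended to a list such as `stats` to build the bootstrap distribution.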

