Value type error when using XGBoost with dask distributed

Problem description

Here is the code that reproduces the error on my machine:

import numpy as np
import xgboost as xgb
import dask.array as da
import dask.distributed
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

X = da.from_array(np.random.randint(0,10,size=(10,10)))
Y = da.from_array(np.random.randint(0,10,size=(10,1)))

cluster = LocalCUDACluster(n_workers=4, threads_per_worker=1)
client = Client(cluster)

dtrain = xgb.dask.DaskDeviceQuantileDMatrix(client=client, data=X, label=Y)

params = {'tree_method':'gpu_hist','objective':'rank:pairwise','min_child_weight':1,'max_depth':3,'eta':0.1} 
# The traceback below comes from a session where dtrain was named trainLong.
watchlist = [(dtrain, 'train')]
reg= xgb.dask.train(client, params, dtrain, num_boost_round=10,evals=watchlist,verbose_eval=1)

Here is a summary of the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-ff1b0329f2f9> in <module>
      1 params = {'tree_method':'gpu_hist','objective':'rank:pairwise','min_child_weight':1,'max_depth':3,'eta':0.1}
      2 watchlist = [(trainLong, 'train')]
----> 3 regLong = xgb.dask.train(client, params, trainLong, num_boost_round=10,evals=watchlist,verbose_eval=1)

/usr/local/share/anaconda3/lib/python3.7/site-packages/xgboost/data.py in _device_quantile_transform()
    804         return _transform_dlpack(data), feature_names, feature_types
    805     raise TypeError('Value type is not supported for data iterator:' +
--> 806                     str(type(data)))
    807 
    808 

TypeError: Value type is not supported for data iterator:<class 'numpy.ndarray'>

Somehow the device quantile matrix is still being passed as a numpy array?

I tried using a pandas DataFrame, converting it to a dask DataFrame, and then converting that to a device quantile matrix...

Tags: python, gpu, dask, distributed, xgboost

Solution
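A likely cause, offered as a sketch rather than a confirmed fix: the traceback is raised inside `_device_quantile_transform`, which rejects `numpy.ndarray`. The dask arrays in the reproduction are built with `da.from_array` on NumPy data, so every chunk that reaches a worker is still a NumPy array in host memory, while the device quantile matrix expects GPU-resident input such as CuPy arrays or cuDF frames. The helper names `chunk_backend` and `to_gpu_chunks` below are hypothetical, and the conversion assumes CuPy is installed and a CUDA GPU is available.

```python
def chunk_backend(arr):
    """Return the top-level module of the array type backing `arr`.

    For a dask array this inspects `arr._meta`, the zero-size placeholder
    dask keeps to describe its chunks; any other object is inspected directly.
    """
    meta = getattr(arr, "_meta", arr)
    return type(meta).__module__.split(".")[0]


def to_gpu_chunks(arr):
    """Convert every chunk of a dask array to a CuPy array.

    Assumes CuPy and a CUDA GPU; the import is deferred so this helper
    can still be defined on a CPU-only machine.
    """
    import cupy

    return arr.map_blocks(cupy.asarray)


# Usage against the reproduction above (needs dask, CuPy and a CUDA GPU):
#   chunk_backend(X)          # reports 'numpy' for the original arrays
#   X = to_gpu_chunks(X)
#   Y = to_gpu_chunks(Y)
#   dtrain = xgb.dask.DaskDeviceQuantileDMatrix(client=client, data=X, label=Y)
```

If the data starts as a pandas DataFrame, the analogous step would be to move it into a GPU-backed dask collection (for example via `dask_cudf`) before building the matrix. Alternatively, `xgb.dask.DaskDMatrix` accepts host-memory input and can still be trained with `tree_method='gpu_hist'`.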

