Compatibility issues with H2O.ai Hadoop on MapR 6.0 via python API?

Problem description

I'm seeing apparent compatibility issues running H2O on MapR 6.0 with the 3.18.0.2 MapR 5.2 driver. Trying the latest driver (3.20.0.7), as recommended in another SO post, did not help.

I am able to start an H2O cluster on MapR 6.0 (via something like hadoop jar h2odriver.jar -nodes 3 -mapperXmx 6g -output hdfsOutputDirName) and seem to be able to access the H2O Flow UI, but I have problems accessing the cluster via the Python API (pip show h2o confirms that the package version matches the driver being used).
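The version check mentioned above can also be done programmatically. The following is a minimal sketch, not part of the original question: the helper names (parse_version, versions_match, installed_h2o_version) are hypothetical, and DRIVER_VERSION is simply the driver version quoted in the question. It assumes Python 3.8+ for importlib.metadata and that H2O versions are dot-separated integers.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a dotted version such as '3.18.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def versions_match(client_version, driver_version):
    """Return True when both version strings denote the same release."""
    return parse_version(client_version) == parse_version(driver_version)


def installed_h2o_version():
    """Return the installed h2o package version, or None if it is absent."""
    try:
        return version("h2o")
    except PackageNotFoundError:
        return None


# Driver version quoted in the question; adjust to the driver actually used.
DRIVER_VERSION = "3.18.0.2"

client = installed_h2o_version()
if client is None:
    print("h2o is not installed in this environment")
elif versions_match(client, DRIVER_VERSION):
    print("client and driver versions match:", client)
else:
    print("version mismatch:", client, "vs", DRIVER_VERSION)
```

A match here only rules out a client/driver mismatch; it says nothing about whether the driver itself is compatible with the Hadoop distribution.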

Is the MapR 5.2 driver (currently the latest MapR driver version offered by H2O) incompatible with MapR 6.0? I would not be asking if not for the fact that I seem to be able to use the H2O Flow UI on a cluster instance started on MapR 6.0. Is there any workaround other than the standalone driver version? I would still like to be able to leverage YARN on the Hadoop cluster.

The code and the error seen when trying to connect to the running H2O cluster using the Python API are shown below.

# connect to h2o service
h2o.init(ip=h2o_cnxn_ip)

where h2o_cnxn_ip is the IP and port generated after starting the H2O cluster on the MapR 6.0 system. This produces the following error:

Checking whether there is an H2O instance running at http://172.18.0.123:54321...
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-1728877a03a2> in <module>()
      1 # connect to h2o service
----> 2 h2o.init(ip=h2o_cnxn_ip)

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/h2o.pyc in init(url, ip, port, https, insecure, username, password, cookies, proxy, start_h2o, nthreads, ice_root, enable_assertions, max_mem_size, min_mem_size, strict_version_check, ignore_config, extra_classpath, **kwargs)
    250                                      auth=auth, proxy=proxy,cookies=cookies, verbose=True,
    251                                      _msgs=("Checking whether there is an H2O instance running at {url}",
--> 252                                             "connected.", "not found."))
    253     except H2OConnectionError:
    254         # Backward compatibility: in init() port parameter really meant "baseport" when starting a local server...

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/backend/connection.pyc in open(server, url, ip, port, https, auth, verify_ssl_certificates, proxy, cookies, verbose, _msgs)
    316             conn._stage = 1
    317             conn._timeout = 3.0
--> 318             conn._cluster = conn._test_connection(retries, messages=_msgs)
    319             # If a server is unable to respond within 1s, it should be considered a bug. However we disable this
    320             # setting for now, for no good reason other than to ignore all those bugs :(

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/backend/connection.pyc in _test_connection(self, max_retries, messages)
    558                 raise H2OServerError("Local server was unable to start")
    559             try:
--> 560                 cld = self.request("GET /3/Cloud")
    561                 if cld.consensus and cld.cloud_healthy:
    562                     self._print(" " + messages[1])

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/backend/connection.pyc in request(self, endpoint, data, json, filename, save_to)
    400                                     auth=self._auth, verify=self._verify_ssl_cert, proxies=self._proxies)
    401             self._log_end_transaction(start_time, resp)
--> 402             return self._process_response(resp, save_to)
    403 
    404         except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/backend/connection.pyc in _process_response(response, save_to)
    711         if content_type == "application/json":
    712             try:
--> 713                 data = response.json(object_pairs_hook=H2OResponse)
    714             except (JSONDecodeError, requests.exceptions.ContentDecodingError) as e:
    715                 raise H2OServerError("Malformed JSON from server (%s):\n%s" % (str(e), response.text))

/home/me/projects/myproject/lib/python2.7/site-packages/requests/models.pyc in json(self, **kwargs)
    882                 try:
    883                     return complexjson.loads(
--> 884                         self.content.decode(encoding), **kwargs
    885                     )
    886                 except UnicodeDecodeError:

/usr/lib64/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    349     if parse_constant is not None:
    350         kw['parse_constant'] = parse_constant
--> 351     return cls(encoding=encoding, **kw).decode(s)

/usr/lib64/python2.7/json/decoder.pyc in decode(self, s, _w)
    364 
    365         """
--> 366         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    367         end = _w(s, end).end()
    368         if end != len(s):

/usr/lib64/python2.7/json/decoder.pyc in raw_decode(self, s, idx)
    380         """
    381         try:
--> 382             obj, end = self.scan_once(s, idx)
    383         except StopIteration:
    384             raise ValueError("No JSON object could be decoded")

/home/me/projects/myproject/lib/python2.7/site-packages/h2o/backend/connection.pyc in __new__(cls, keyvals)
    823         for k, v in keyvals:
    824             if k == "__meta" and isinstance(v, dict):
--> 825                 schema = v["schema_name"]
    826                 break
    827             if k == "__schema" and is_type(v, str):

KeyError: u'schema_name'
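The last frame of the traceback shows the client reading v["schema_name"] from a "__meta" object in the JSON returned by GET /3/Cloud, so the KeyError means that at least one "__meta" object in the response has no "schema_name" key. A minimal diagnostic sketch, assuming Python 3 and using only the standard library, is to fetch the same endpoint with the plain json module and list the offending objects. The function names and the sample payload below are hypothetical; the sample is made up for illustration and was not captured from a real cluster.

```python
import json
from urllib.request import urlopen


def find_meta_without_schema_name(node, path="$"):
    """Return JSON paths of every "__meta" dict that lacks "schema_name"."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "__meta" and isinstance(value, dict) and "schema_name" not in value:
                hits.append(child)
            hits.extend(find_meta_without_schema_name(value, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            hits.extend(find_meta_without_schema_name(item, f"{path}[{i}]"))
    return hits


def fetch_cloud_json(base_url, timeout=3.0):
    """GET <base_url>/3/Cloud and parse the body without the H2O client."""
    with urlopen(base_url.rstrip("/") + "/3/Cloud", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical payload for illustration only (not real cluster output).
sample = {
    "__meta": {"schema_version": 3},
    "nodes": [{"__meta": {"schema_name": "NodeV3"}}],
}
problems = find_meta_without_schema_name(sample)
print(problems)
```

Against a live cluster, the call would look like find_meta_without_schema_name(fetch_cloud_json("http://172.18.0.123:54321")), using the address from the log above. A non-empty result would confirm that the server's response differs from what the Python client expects.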

Tags: h2o

Solution


H2O does not currently support MapR 6. The latest MapR version that H2O currently supports is 5.2.

See the downloads page for the supported Hadoop versions.

