H2O AI: Unsupported MOJO model 'word2vec'



I have 3 H2O models:

$ ls dataset/mojo
DeepLearning_model_python_1582176092021_2.zip
StackedEnsemble_BestOfFamily_AutoML_20200220_073620.zip
Word2Vec_model_python_1582176092021_1.zip

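The binary versions of these 3 models were generated on v3.28.0.3, but I am trying to upgrade H2O and move them to production on v3.30.0.5. To do that, I successfully converted the 3 binary models to MOJO models (as listed above).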

When I try to upload these MOJO models with h2o.upload_mojo, I get an error for Word2Vec only:


In [15]: w2v_path = 'dataset/mojo/Word2Vec_model_python_1582176092021_1.zip'
In [16]: w2v_model = h2o.upload_mojo(w2v_path)
generic Model Build progress: | (failed)                                                      |   0%
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-16-734005ed70a8> in <module>
----> 1 w2v_model = h2o.upload_mojo(w2v_path)
~/.envs/h2o-test/lib/python3.8/site-packages/h2o/h2o.py in upload_mojo(mojo_path)
2149     frame_key = response["destination_frame"]
2150     mojo_estimator = H2OGenericEstimator(model_key = get_frame(frame_key))
-> 2151     mojo_estimator.train()
2152     print(mojo_estimator)
2153     return mojo_estimator
~/.envs/h2o-test/lib/python3.8/site-packages/h2o/estimators/estimator_base.py in train(self, x, y, training_frame, offset_column, fold_column, weights_column, validation_frame, max_runtime_secs, ignored_columns, model_id, verbose)
113                                  validation_frame=validation_frame, max_runtime_secs=max_runtime_secs,
114                                  ignored_columns=ignored_columns, model_id=model_id, verbose=verbose)
--> 115         self._train(parms, verbose=verbose)
116
117     def train_segments(self, x=None, y=None, training_frame=None, offset_column=None, fold_column=None,
~/.envs/h2o-test/lib/python3.8/site-packages/h2o/estimators/estimator_base.py in _train(self, parms, verbose)
205             return
206
--> 207         job.poll(poll_updates=self._print_model_scoring_history if verbose else None)
208         model_json = h2o.api("GET /%d/Models/%s" % (rest_ver, job.dest_key))["models"][0]
209         self._resolve_model(job.dest_key, model_json)
~/.envs/h2o-test/lib/python3.8/site-packages/h2o/job.py in poll(self, poll_updates)
75         if self.status == "FAILED":
76             if (isinstance(self.job, dict)) and ("stacktrace" in list(self.job)):
---> 77                 raise EnvironmentError("Job with key {} failed with an exception: {}\nstacktrace: "
78                                        "\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
79             else:
OSError: Job with key $03010a64051932d4ffffffff$_8d0c64127137bd1eef16202889cf4fca failed with an exception: java.lang.IllegalArgumentException: Unsupported MOJO model 'word2vec'.
stacktrace:
java.lang.IllegalArgumentException: Unsupported MOJO model 'word2vec'.
at hex.generic.Generic$MojoDelegatingModelDriver.computeImpl(Generic.java:99)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:248)
at hex.generic.Generic$MojoDelegatingModelDriver.compute2(Generic.java:78)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1557)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

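The other two models upload without any problems and return a valid model_id. Any idea what the problem is here? My understanding from the documentation is that MOJO supports all three model types.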

I also tried this on a 2-pod cluster on K8s, each pod with 2Gi of memory and 1 CPU, and got the same result as above.

Word2Vec is currently not on the list of algorithms that can be imported back into H2O.

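The documentation is a bit confusing and needs to be improved. MOJO is a way to take H2O models to production. MOJOs can be used outside of H2O with H2O's genmodel. Some of these MOJOs can also be imported back into H2O and inspected, but not all of them. The first two algorithms you listed are supported. Unfortunately, Word2Vec is not.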

I created a JIRA to track this issue. We should be able to implement at least scoring.
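
Until that is resolved, it may help to check which algorithm a MOJO file contains before calling h2o.upload_mojo, so unsupported ones can be skipped or routed to genmodel instead. The following is only a minimal sketch: read_mojo_info is a hypothetical helper (not part of the h2o package), and it assumes the MOJO zip contains a model.ini file whose [info] section has key = value lines such as algo and h2o_version. Please verify that against your own MOJO files.

```python
import zipfile


def read_mojo_info(mojo_path):
    """Return the key/value pairs found in the [info] section of model.ini.

    Assumption: the MOJO archive has a top-level model.ini with an [info]
    section made of "key = value" lines. mojo_path may be a file path or a
    file-like object, as accepted by zipfile.ZipFile.
    """
    with zipfile.ZipFile(mojo_path) as archive:
        text = archive.read("model.ini").decode("utf-8")

    info = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            # Entering a new section such as [info] or [columns].
            section = line[1:-1]
        elif section == "info" and "=" in line:
            # Only lines inside [info] are collected.
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
    return info
```

With this, something like read_mojo_info(w2v_path).get("algo") could be compared against the algorithms your H2O version can import, and h2o.upload_mojo called only for the ones that are supported.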
