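r - Updating a column for a subset of rows in an h2o.frame
Problem description
I made a simple example to illustrate what I am trying to do; my real use case needs to be more flexible.
I want to subset an h2o.frame by rows, run some calculations on those rows, and then assign the result back to the same rows. In this example, I compute the relative "mpg" within each "cyl" group.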
library(h2o)
packageVersion("h2o")
[1] ‘3.32.1.3’
version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 1.1
year 2021
month 08
day 10
svn rev 80725
language R
version.string R version 4.1.1 (2021-08-10)
nickname Kick Things
h2o.init()
mtcars <- as.h2o(mtcars)
mtcars$mpg_rel <- NA
mtcars
mpg cyl disp hp drat wt qsec vs am gear carb mpg_rel
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 NaN
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 NaN
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 NaN
4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 NaN
5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 NaN
6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 NaN
[32 rows x 12 columns]
mtcars[mtcars[["cyl"]] == 4, "mpg"] / h2o.mean(mtcars[mtcars[["cyl"]] == 4, "mpg"])
mpg
1 0.8550972
2 0.9151040
3 0.8550972
4 1.2151381
5 1.1401296
6 1.2713945
[11 rows x 1 column]
# however, the assignment throws an error and corrupts `mtcars`
mtcars[mtcars[["cyl"]] == 4, "mpg_rel"] <- mtcars[mtcars[["cyl"]] == 4, "mpg"] / h2o.mean(mtcars[mtcars[["cyl"]] == 4, "mpg"])
ERROR: Unexpected HTTP Status code: 412 Precondition Failed (url = http://localhost:54321/99/Rapids)
water.exceptions.H2OIllegalArgumentException
[1] "water.exceptions.H2OIllegalArgumentException: unimplemented"
[2] " water.H2O.unimpl(H2O.java:1310)"
[3] " water.rapids.ast.prims.assign.AstRectangleAssign.apply(AstRectangleAssign.java:93)"
[4] " water.rapids.ast.prims.assign.AstRectangleAssign.apply(AstRectangleAssign.java:30)"
[5] " water.rapids.ast.AstExec.exec(AstExec.java:63)"
[6] " water.rapids.ast.prims.assign.AstTmpAssign.apply(AstTmpAssign.java:48)"
[7] " water.rapids.ast.prims.assign.AstTmpAssign.apply(AstTmpAssign.java:17)"
[8] " water.rapids.ast.AstExec.exec(AstExec.java:63)"
[9] " water.rapids.Session.exec(Session.java:85)"
[10] " water.rapids.Rapids.exec(Rapids.java:94)"
[11] " water.api.RapidsHandler.exec(RapidsHandler.java:38)"
[12] " java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"
[13] " java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)"
[14] " java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)"
[15] " java.base/java.lang.reflect.Method.invoke(Method.java:566)"
[16] " water.api.Handler.handle(Handler.java:60)"
[17] " water.api.RequestServer.serve(RequestServer.java:470)"
[18] " water.api.RequestServer.doGeneric(RequestServer.java:301)"
[19] " water.api.RequestServer.doPost(RequestServer.java:227)"
[20] " javax.servlet.http.HttpServlet.service(HttpServlet.java:707)"
[21] " javax.servlet.http.HttpServlet.service(HttpServlet.java:790)"
[22] " org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)"
[23] " org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)"
[24] " org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)"
[25] " org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)"
[26] " org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)"
[27] " org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)"
[28] " org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)"
[29] " org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)"
[30] " org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)"
[31] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)"
[32] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)"
[33] " water.webserver.jetty9.Jetty9ServerAdapter$LoginHandler.handle(Jetty9ServerAdapter.java:130)"
[34] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)"
[35] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)"
[36] " org.eclipse.jetty.server.Server.handle(Server.java:531)"
[37] " org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)"
[38] " org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)"
[39] " org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)"
[40] " org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)"
[41] " org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)"
[42] " org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)"
[43] " org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)"
[44] " org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)"
[45] " org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)"
[46] " org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)"
[47] " org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)"
[48] " org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)"
[49] " java.base/java.lang.Thread.run(Thread.java:834)"
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
unimplemented
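One workaround I found is to subset with a vector of row numbers. However, this as.vector conversion is very inefficient on large data. It would be great if an approach similar to the one above could be made to work.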
which_rows <- as.vector(h2o.which(mtcars[["cyl"]] == 4))
mtcars[which_rows, "mpg_rel"] <- mtcars[which_rows, "mpg"] / h2o.mean(mtcars[which_rows, "mpg"])
mtcars
mpg cyl disp hp drat wt qsec vs am gear carb mpg_rel
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 NaN
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 NaN
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 0.8550972
4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 NaN
5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 NaN
6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 NaN
[32 rows x 12 columns]
Solution
I am not sure this is the best way, but using h2o.ifelse should be more efficient than using as.vector (one pass over all rows to get mpg_mean, and one pass in h2o.ifelse).
mpg_mean <- h2o.mean(mtcars[mtcars[["cyl"]] == 4, "mpg"])
mtcars[["mpg_rel"]] <- h2o.ifelse(mtcars[["cyl"]] == 4, mtcars[["mpg"]] / mpg_mean, mtcars[["mpg"]])
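The question asks for something more flexible than a single group. The following is a minimal sketch, not part of the original answer, that repeats the h2o.ifelse idea once per group. It passes the existing mpg_rel column as the fallback, so rows outside the current group keep whatever they already hold (the call above puts the raw mpg there instead). The frame name cars_hf and the hard-coded group values c(4, 6, 8) are assumptions for illustration, and the code needs a running H2O cluster.

```r
library(h2o)
h2o.init()

# cars_hf is a hypothetical name, used to avoid overwriting the base R mtcars
cars_hf <- as.h2o(mtcars)

# Start from an all-NA column, as in the question
cars_hf$mpg_rel <- NA

# Assumed group values; for mtcars, cyl takes the values 4, 6 and 8
for (k in c(4, 6, 8)) {
  in_group <- cars_hf[["cyl"]] == k

  # One pass to get the group mean (returned to R as a single number)
  group_mean <- h2o.mean(cars_hf[in_group, "mpg"])

  # One pass to fill the group's rows; all other rows keep their current mpg_rel
  cars_hf[["mpg_rel"]] <- h2o.ifelse(
    in_group,
    cars_hf[["mpg"]] / group_mean,
    cars_hf[["mpg_rel"]]
  )
}

cars_hf
```

Each iteration costs one pass for the mean and one for h2o.ifelse, and no row indices are pulled into the R session. The trade-off is one loop iteration per group, so this is likely only reasonable when the number of groups is small.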