c++ - How to calculate maximum/minimum over n-channels in Halide using domains?
Problem description
I am currently trying out Halide, playing around with calculating the maximum/minimum over all channels of an image. I would like to achieve this for arbitrary images, where the number of channels is only known at runtime.
I successfully got the following solution:
#include "Halide.h"
#include "halide_image_io.h"

using namespace Halide::Tools;

int main(int argc, char **argv) {
    Halide::Buffer<uint8_t> input = load_image("rgb.png");

    Halide::Var x, y;
    Halide::Func max_channels, min_channels;
    max_channels(x, y) = input(x, y, 0);
    min_channels(x, y) = input(x, y, 0);
    for (int i = 1; i < input.channels(); ++i)
    {
        max_channels(x, y) = max(max_channels(x, y), input(x, y, i));
        min_channels(x, y) = min(min_channels(x, y), input(x, y, i));
    }

    {
        Halide::Buffer<uint8_t> output =
            max_channels.realize({input.width(), input.height(), 1});
        save_image(output, "maximum.png");
    }
    {
        Halide::Buffer<uint8_t> output =
            min_channels.realize({input.width(), input.height(), 1});
        save_image(output, "minimum.png");
    }

    printf("Success!\n");
    return 0;
}
However, I was wondering if it would be possible to achieve this without the explicit for loop. Going by the documentation, it should be possible to achieve this using the Halide::RDom class. From the examples given there, I would think that the following should work.
#include "Halide.h"
#include "halide_image_io.h"

using namespace Halide::Tools;

int main(int argc, char **argv) {
    Halide::Buffer<uint8_t> input = load_image("rgb.png");

    Halide::Var x, y;
    Halide::Func max_channels, min_channels;
    Halide::RDom r(input);
    min_channels(x, y) = uint8_t{255};
    max_channels(x, y) = uint8_t{0};
    min_channels(r.x, r.y) = minimum(input(r.x, r.y, r.z));
    max_channels(r.x, r.y) = maximum(input(r.x, r.y, r.z));

    {
        Halide::Buffer<uint8_t> output =
            max_channels.realize({input.width(), input.height(), 1});
        save_image(output, "maximum.png");
    }
    {
        Halide::Buffer<uint8_t> output =
            min_channels.realize({input.width(), input.height(), 1});
        save_image(output, "minimum.png");
    }

    printf("Success!\n");
    return 0;
}
This compiles, but unfortunately crashes with the following error message:
terminate called after throwing an instance of 'Halide::CompileError'
what(): Error: In update definition 0 of Func "f1":
Tuple element 0 of update definition has type uint8, but pure definition has type int32
Aborted (core dumped)
This message does not make any sense to me, as all the values involved are uint8_t, so I am not sure where the programme is getting its int32_t from.
Thus, my question: What is the correct way to calculate the maximum/minimum over all channels of an image using domains? Or is this something that is not possible, and a for loop has to be used instead?
Solution
Here's a way using the maximum and minimum helpers:
#include <Halide.h>
#include <halide_image_io.h>

using namespace Halide;
using namespace Halide::Tools;

int main() {
    Buffer<uint8_t> input = load_image("rgb.png");

    Var x, y;
    RDom r_chan(0, input.dim(2).extent());

    Func min_img;
    Func max_img;
    min_img(x, y) = minimum(input(x, y, r_chan));
    max_img(x, y) = maximum(input(x, y, r_chan));

    Buffer<uint8_t> min_out = min_img.realize({input.width(), input.height()});
    Buffer<uint8_t> max_out = max_img.realize({input.width(), input.height()});

    save_image(min_out, "min_out.png");
    save_image(max_out, "max_out.png");
}
You just need to create a reduction domain that spans the number of channels (given by input.dim(2).extent()). If you don't want to use the helpers, you can instead write (for example with min_img):
min_img(x, y) = cast<uint8_t>(255);
min_img(x, y) = min(input(x, y, r_chan), min_img(x, y));
The way reduction domains work (absent other scheduling directives) is that they insert an innermost loop that repeats the update rule for each point in the reduction domain. Thus the loops in this second example look like this:
for y:
  for x:
    min_img(...) = ...
for y:
  for x:
    for r_chan in [0, 2]:
      min_img(...) = ...
If you're worried about the optimizer not eliminating the first loop, where everything is initialized to 255, then you could instead write:
min_img(x, y) = undef(type_of<uint8_t>()); // skip the pure step
min_img(x, y) = min(
    input(x, y, r_chan),
    select(r_chan == 0, cast<uint8_t>(255), min_img(x, y))
);
I would expect LLVM to peel the first iteration (at least) in this case.