首页 > 解决方案 > 在没有联合查找数据结构的情况下实现 CCL(连接组件标签)算法?

问题描述

各位程序员好!

一周前,我被分配了实现连接组件算法的任务,主要是从图像中提取对象的数量。

您可以在此处阅读有关该算法的更多信息(https://en.wikipedia.org/wiki/Connected-component_labeling),我尝试实现的变体是两个通过一个。

这是我目前的尝试:

% ------------------------------------------------------------------------------
% -> Connected Component Labeling (CCL) Algorithm
% -> 4-Connectivity Version
% ------------------------------------------------------------------------------

% ------------------------------------------------------------------------------
% - [ Pre-Scan Code To Get Everything Ready ] -
% ------------------------------------------------------------------------------
% Starting With A Clean (Workspace) And (Command Window).
clear, clc;

% Instead Of Loading An Actual Image, We Are Creating A Matrix Of Zeros And Ones, Representing A Binary Image.
originalImage = [ ...
                0 1 0
                1 0 1
                0 1 0 ];

% Creating A Bigger Matrix That We Will Use To Store The Original Image In Its Middle, This Will Help Us Eliminate Border Checking In The Raster Scan.
binaryImage = zeros(size(originalImage) + 2);

% Copying The Pixels From The Original Image Into The Middle Of The Larger Matrix We Created.
binaryImage(2:size(originalImage, 1) + 1, 2:size(originalImage, 2) + 1) = originalImage;

% Getting The Number Of Rows (Height) And Number Of Columns (Width) Of The Binary Image.
[imageRows, imageColumns] = size(binaryImage);

% Creating A Matrix The Same Dimensions As The Binary Image In Which The Labeling Will Happen.
labeledImage = zeros(imageRows, imageColumns);

% Creating A Label Counter That We Will Use To Assign When We Create New Labels.
labelCounter = 1;

% ------------------------------------------------------------------------------
% - [First Scan: Assigning Labels To Indices] -
% ------------------------------------------------------------------------------
% Going Over Each Row In The Image One By One.
for r = 1:imageRows
   % Going Over Each Column In The Image One By One.
   for c = 1:imageColumns
       % If The Pixel Currently Being Scanned Is A Foreground Pixel (1).
       if (binaryImage(r, c) == 1)
           % Since We Are Working With 4-Connectivity We Only Need To Read 2 Labels, Mainly The (Left) And (Top) Labels.
           % Storing Them In Variables So Referencing Them Is Easier.
           left = labeledImage(r, c - 1);
           top = labeledImage(r - 1, c);
           % If Left == 0 And Top == 0 -> Create A New Label, And Increment The Label Counter, Also Add The Label To The Equivalency List.
           if (left == 0 && top == 0)
               labeledImage(r, c) = labelCounter;
               labelCounter = labelCounter + 1;
           % If Left == 0 And Top >= 1 -> Copy The Top Label.
           elseif (left == 0 && top >= 1)
               labeledImage(r, c) = top;
           % If Left >= 1 And Top == 0 -> Copy The Left Label.
           elseif (left >= 1 && top == 0)
               labeledImage(r, c) = left;
           % If Left >= 1 And Top >= 1 -> Find The Minimum Of The Two And Copy It, Also Add The Equivalent Labels To The Equivalency List, So We Can Fix Them In The Second Scan.
           elseif (left >= 1 && top >= 1)
               labeledImage(r, c) = min(left, top);
           end
       end
   end
end

% ------------------------------------------------------------------------------
% - [Second Scan: Fixing The Connected Pixels But Mismatched Labels] -
% ------------------------------------------------------------------------------
for r = 1:imageRows
   for c = 1:imageColumns

   end
end

第一遍没有任何问题,我已经尝试了多次测试,但是我不知道如何实现第二遍,其中我必须修复标记矩阵中的等效标签。

我确实在网上做过研究,首选的方法是使用 union-find(不相交集)数据结构来存储标签之间的等价性。

但是,由于我使用的是MATLAB,而union-find数据结构并没有实现,我必须自己实现,由于MATLAB是一种解释型语言,这很麻烦,而且需要大量的时间和精力。

因此,我对实现第二遍的想法持开放态度,而不必使用 union-find 数据结构。

提前致谢!

标签: algorithmmatlabimage-processing

解决方案


如果您不想使用 union-find,那么您真的应该使用 flood-fill-each-component 算法(维基百科中的另一个)。

然而,Union-find 既不困难也不慢,这就是我用来解决这个问题的方法。要进行简单快速的实施:

  • 使用与图像形状相同的整数矩阵来表示集合——矩阵中的每个整数都表示相应像素的集合。
  • 整数x表示如下集合:如果x < 0,则它是大小为-x的根集。如果x>=0则它是带有父集x的子集(即行 x/宽度,列 x% 宽度)。将所有集合初始化为 -1,因为它们都以大小为 1 的根开始。
  • 使用 union-by-size 和路径压缩

推荐阅读