-潼
Stereo Vision
This
example shows how to compute the depth map between
two rectified
stereo images. See the
Image Rectification ExampleImage Rectification
Example
to
learn
about
the
details
behind
rectification.
In
this
example
we use block matching, which is the
standard algorithm for high-speed
stereo vision in hardware systems [8].
We first explore basic block
matching,
and then apply dynamic programming to improve
accuracy, and
image pyramiding to
improve speed.
Stereo
vision is the process of recovering depth from
camera images by
comparing two or more
views of the same scene. Simple, binocular stereo
uses only two images, typically taken
with parallel cameras that were
separated by a horizontal distance
known as the
of the stereo computation
is a disparity map (which is translatable to
a range image) which tells how far each
point in the physical scene was
from
the camera.
Step
1.
Read
Stereo
Image
PairHere
we
read
in
the
color
stereo
image
pair
and
convert
the
images
to
gray
scale
for
the
matching
process.
Using
color
images
may
provide
some
improvement
in
accuracy,
but
it
is
more
efficient
to work with only one-channel images.
For this we use the
ImageDataTypeConverter
and
the
ColorSpaceConverter
System
objects.
Below
we
show
the
left
camera
image
and
a
color
composite
of
both
images
so
that
one can
easily see the disparity between them.
hIdtc = ataTypeConverter;
hCsc =
paceConverter('Conversion','RGB to intensity');
leftI3chan =
step(hIdtc,imread('vipstereo_'));
leftI
= step(hCsc,leftI3chan);
rightI3chan =
step(hIdtc,imread('vipstereo_'));
rightI = step(hCsc,rightI3chan);
figure(1), clf; clf;
用来清除图形的命令。一般在画图之前用。
imshow(rightI3chan), title('Right
image');
figure(2), clf;
p>
imshowpair(
rightI,leftI
,'ColorChannels','red-cyan'), axis image;
title('Color composite (right=red,
left=cyan)');
Step 2. Basic Block MatchingNext we
perform basic block matching. For
every
pixel
in
the
right
image,
we
extract
the
7-by-7-pixel
block
around
it
and
search
along
the
same
row
in
the
left
image
for
the
block
that
best
matches
it. Here we search in a range of
pixels
around
the
pixel's
location
in
the
first
image,
and
we
use
the
sum
of
absolute
differences
(SAD)
to
compare
the
image
regions.
We
need
only
search
over columns and not over rows because the images
are rectified.
We use the
TemplateMatcher System object to perform this
block matching
between each block and
the region of interest.
Dbasic =
zeros(size(leftI), 'single');
disparityRange = 15;
%
Selects (2*halfBlockSize+1)-by-(2*halfBlockSize+1)
block.
halfBlockSize = 3;
blockSize = 2*halfBlockSize+1;
% Allocate space for all template
matcher System objects.
tmats =
cell(blockSize);
%
Initialize progress bar
hWaitBar =
waitbar(0, 'Performing basic block matching...');
nRowsLeft = size(leftI,
1);//
返回的是左侧图像的行数
% Scan over all
为行
scan
扫描所有行
for
m=1:nRowsLeft
扫描所有行
% Set min/max row bounds for image
block.
图像块的行界起始
minr = max(1,m-halfBlockSize);
maxr
= min(nRowsLeft,m+halfBlockSize);
一个窗口上下
行
% Scan over all
columns.
for n=1:size(leftI,2)
minc = max(1,n-halfBlockSize);
maxc = min(size(leftI,2),n+half
BlockSize);
左右列
% Compute disparity bounds.
mind = max( -disparityRange, 1-minc );
maxd = min( disparityRange, size(leftI,2)-maxc );
% Construct
template and region of interest.
template = rightI(minr:maxr,minc:maxc);
一块,上下行、左右列
templateCenter = floor((size(template)+1)/2);
向下取整
roi =
[minc+templateCenter(2)+mind-1 ...
minr+templateCenter(1)-1 ...
maxd-mind+1 1];
% Lookup proper
TemplateMatcher object; create if
empty.//
模板匹配
if
isempty(tmats{size(template,1),size(template,2)})
tmats{size(template,1),size(template,2)} = ...
teMatcher('ROIInputPort',true);
end
thisTemplateMatcher =
tmats{size(template,1),size(template,2)};
% Run
TemplateMatcher object.
loc =
step(thisTemplateMatcher, leftI, template, roi);
Dbasic(m,n) = loc(1) - roi(1) +
mind;
end
waitbar(m/nRowsLeft,hWaitBar);
end
close(hWaitBar);
In
the
results
below,
the
basic
block
matching
does well,
as
the
correct
shape
of
the
stereo
scene
is
recovered.
However,
there
are
noisy
patches
and
bad
depth
estimates
everywhere,
especially
on
the
ceiling.
These
are
caused when no strong
image features appear inside of the 7-by-7-pixel
windows being compared. Then the
matching process is subject to noise
since each pixel chooses its disparity
independently of all the other
pixels.
For display purposes, we saturate the depth map to
have only
positive values. In general,
slight angular misalignment of the stereo
cameras used for image acquisition can
allow both positive and negative
disparities to appear validly in the
depth map. In this case, however,
the
stereo
cameras
were
near
perfectly
parallel,
so
the
true
disparities
have only one
sign. Thus this correction is valid.
figure(3), clf;
imshow(Dbasic,[]), axis image,
colormap('jet'), colorbar;
caxis([0
disparityRange]);
title('Depth map from
basic block matching');
Step 3. Sub-pixel Estimation The
disparity estimates returned by block
matching are all integer-valued, so the
above depth map exhibits
contouring
effects
where
there
are
no
smooth
transitions
between
regions
of different disparity. This can be
ameliorated by incorporating
sub-pixel
computation
into
the
matching
metric.
Previously
we
only
took
the location of the minimum cost as the
disparity, but now we take into
consideration the minimum cost and the
two neighboring cost values.
We
fit a parabola to these three values,
and analytically solve for the
minimum
to get the sub-pixel correction.
DbasicSubpixel= zeros(size(leftI),
'single');
tmats =
cell(2*halfBlockSize+1);
hWaitBar=waitbar(0,'Performing sub-
pixel estimation...');
for
m=1:nRowsLeft
% Set min/max row
bounds for image block.
minr =
max(1,m-halfBlockSize);
maxr =
min(nRowsLeft,m+halfBlockSize);
%
Scan over all columns.
for
n=1:size(leftI,2)
minc =
max(1,n-halfBlockSize);
maxc =
min(size(leftI,2),n+halfBlockSize);
% Compute disparity bounds.
mind = max( -disparityRange, 1-minc );
maxd = min( disparityRange, size(leftI,2)-maxc );
% Construct
template and region of interest.
template = rightI(minr:maxr,minc:maxc);
templateCenter =
floor((size(template)+1)/2);
roi = [minc+templateCenter(2)+mind-1 ...
minr+templateCenter(1)-1
...
maxd-mind+1 1];
% Lookup proper TemplateMatcher
object; create if empty.
if
isempty(tmats{size(template,1),size(template,2)})
tmats{size(template,1),size(template,2)} = ...
teMatcher('ROIInputPort',true,...
'BestMatchNeighborhoodOutputPort',true);
end
thisTemplateMatcher =
tmats{size(templa
te,1),size(template,2)};
% Run TemplateMatcher object.
[loc,a2] = step(thisTemplateMatcher, leftI,
template, roi);
ix =
single(loc(1) - roi(1) + mind);
% Subpixel refinement of index.
DbasicSubpixel(m,n) = ix - 0.5
* (a2(2,3) - a2(2,1)) ...
/
(a2(2,1) - 2*a2(2,2) + a2(2,3));
end
waitbar(m/nRowsLeft,hWaitBar);
end
close(hWaitBar);
Re-running
basic block matching, we achieve the result below
where the
contouring effects are mostly
removed and the disparity estimates are
correctly refined. This is especially
evident along the walls.
figure(1),
clf;
imshow(DbasicSubpixel,[]), axis
image, colormap('jet'), colorbar;
caxis([0 disparityRange]);
title('Basic block matching with sub-
pixel accuracy');
Step 4.
Dynamic Programming As mentioned above, basic
block matching
creates a noisy
disparity image. This can be improved by
introducing a
smoothness
constraint.
Basic
block
matching
chooses
the
optimal
disparity
for
each
pixel
based
on
its
own
cost
function
alone.
Now
we
want
to
allow
a
pixel
to
have
a
disparity
with
possibly
sub-optimal
cost
for
it
locally.
This extra cost must be offset by
increasing that pixel's agreement in
disparity
with
its
neighbors.
In
particular,
we
constrain
each
disparity
estimate to lie
with
values of its
neighbors' disparities, where its neighbors are
the
adjacent pixels along an image row.
The problem of finding the optimal
disparity estimates for a row of pixels
now becomes one of finding the
path
from
one
side
of
the
image
to
the
other.
To
find
this
optimal
path, we use the underlying block
matching metric as the cost function
and
constrain
the
disparities
to
only
change
by
a
certain
amount
between
adjacent pixels.
This is a problem that can be solved efficiently
using
the technique of dynamic
programming [3,4].
Ddynamic =
zeros(size(leftI), 'single');
finf =
1e3; % False infinity
disparityCost
=
finf*ones(size(leftI,2),
2*disparityRange
+
1,
'single');
n
行(按列)
31
列
disparityPenalty = 0.5; % Penalty for
disparity disagreement between
pixels
hWaitBar = waitbar(0,'Using dynamic
programming for smoothing...');
% Scan
over all rows.
for m=1:nRowsLeft
disparityCost(:) = finf;
% Set min/max row bounds for image
block.
minr =
max(1,m-halfBlockSize);
maxr =
min(nRowsLeft,m+halfBlockSize);
%
Scan over all columns.
for
n=1:size(leftI,2)
minc =
max(1,n-halfBlockSize);
maxc =
min(size(leftI,2),n+halfBlockSize);
% Compute disparity bounds.
mind = max( -disparityRange, 1-minc );
maxd = min( disparityRange, size(leftI,2)-maxc );
% Compute and save all matching
costs.
for d=mind:maxd
disparityCost(n, d +
disparityRange + 1) = ...
sum(sum(abs(leftI(minr:maxr,(minc:maxc)
+d
) ...
-
rightI(minr:maxr,minc:maxc))));
End SAD
end
% Process scan line
disparity costs with dynamic programming.
optimalIndices =
zeros(size(disparityCost), 'single');
cp = disparityCost(end,:);//
最后一行,所有列
cp
是个数组
for j=size(disparityCost,1)-1:-1:1 // n
< br>(列)
-1
步长逐次到
1
% False infinity for this level
(j+2)
cfinf =
(size(disparityCost,1) - j + 1)*finf;
(
n-j+1
)
*finf
% Construct matrix for finding
optimal move for each column
%
individually.
[v,ix] =
min([cfinf cfinf cp(1:end-4)+3*disparityPenalty;
cfinf
cp(1:end-3)+2*disparityPenalty;
cp(1:end-2)+disparityPenalty;
cp(2:end-1);
同列数,
min
求列的最小值,返回每列
p>
cp(3:en
d)+disparityPenalty;
最小值,并返回所在的
位置,注意是每列
cp(4:end)+2*disparityPenalty cfinf;
cp(5:end)+3*disparityPenalty
cfinf
cfinf],[],1);
这样
m
in
返回
[Y,U],Y
为列中的最小
值,
U
记录一列中的行号。
cp = [cfinf
disparityCost(j,2:end-1)+v cfinf];
% Record optimal routes.
optimalIndices(j,2:end-1)
=
(2:size(disparityCost,2)-1)
+
(ix
-
4);
end
%
Recover optimal route.
[~,ix] =
min(cp);
Ddynamic(m,1) = ix;
for k=1:size(Ddynamic,2)-1
Ddynamic(m,k+1) =
optimalIndices(k, ...
max(1, min(size(optimalIndices,2),
round(Ddynamic(m,k)) ) ) );
end
waitbar(m/nRowsLeft, hWaitBar);
end
close(hWaitBar);
Ddynamic = Ddynamic - disparityRange -
1;
The image below shows the stereo
result refined by applying dynamic
programming
to
each
row
individually.
Dynamic
programming
does
introduce
errors of its own
by blurring the edges around object boundaries due
to
the smoothness
constraint.
Also, it does nothing to
smooth ''between''
rows,
which is why a striation pattern now
appears on the left side
foreground
chair.
Despite
these
limitations,
the
result
is
significantly
improved, with
the noise along the walls and ceiling nearly
completely
removed, and with many of
the foreground objects being better
reconstructed.
figure(3),
clf;
imshow(Ddynamic,[]), axis image,
colormap('jet'), colorbar;
caxis([0
disparityRange]);
title('Block matching
with dynamic programming');
Step 5. Image Pyramiding While dynamic
programming can improve the
accuracy
of
the
stereo
image,
basic
block
matching
is
still
an
expensive
operation,
and
dynamic
programming
only
adds
to
the
burden.
One
solution
is to use image
pyramiding and telescopic search to guide the
block
matching
[5,7].
With
the
full-
size
image,
we
had
to
search
over
a
-pixel
range
to
properly
detect
the
disparities
in
the
image.
If
we
had
down-sized
the
image
by
a
factor
of
two,
however,
this
search
could
have
been
reduced
to
pixels
on
an
image
a
quarter
of
the
area,
meaning
this
step
would
cost
a factor of 8 less.
Then we use the disparity estimates from this
down-sized
operation
to
seed
the
search
on
the
larger
image,
and
therefore
we only need to
search over a smaller range of disparities. The
below
example performs this telescoping
stereo matching using a four-level
image pyramid. We use the Pyramid and
Geometric Scaler System objects,
and
we
have
wrapped
up
the
preceding
block
matching
code
into
the
function
vipstereo_blockmatch.
mvipstereo_blockmatch. m for simplicity. The
disparity search range is only pixels
at each level, making it over 5x
faster
to compute than basic block matching. Yet the
results compare
favorably.
%
Construct a four-level pyramid
pyramids
= cell(1,4); 1
行
4
列的单元矩阵,元素随便放什么
pyramids{1}.L
= leftI;
pyramids{1}.R = rightI;
for i=2:length(pyramids) 2 4
1
到
n
是
n
hPyr = d('PyramidLevel',1);
默认
reduce
图像
pyramids{i}.L =
single(step(hPyr,pyramids{i-1}.L));
pyramids{i}.R =
single(step(hPyr,pyramids{i-1}.R));
end
都是名称值的形式
一对对
% Declare
original search radius as +/-4 disparities for
every pixel.
smallRange = single(3);
disparityMin = repmat(-smallRange,
size(pyramids{end}.L));
disparityMax =
repmat( smallRange, size(pyramids{end}.L));
% Do telescoping search over pyramid
levels.
for i=length(pyramids):-1:1
Dpyramid =
vipstereo_blockmatch(pyramids{i}.L,pyramids{i}.R,
...
disparityMin,disparityMax,...
false,true,3,...
true,... %
Waitbar
['Performing
level-',num2str(length(pyramids)-i+1),...
' pyramid block matching...']);
%Waitbar title
if i > 1
% Scale disparity values for
next level.
hGsca =
ricScaler(...
'InterpolationMethod','Nearest neighbor',...
'SizeMethod','Number of
output rows and columns',...
'Size',size(pyramids{i-1}.L));
Dpyramid = 2*step(hGsca, Dpyramid);
% Maintain search radius of +/-smallRange.
disparityMin = Dpyramid -
smallRange;
disparityMax =
Dpyramid + smallRange;
end
end
figure(3),
clf;
imshow(Dpyramid,[]),
colormap('jet'), colorbar, axis image;
caxis([0 disparityRange]);
title('Four-level pyramid block
matching');
Step
6.
Combined
Pyramiding
and
Dynamic
ProgrammingFinally
we
merge
the
above
techniques
and
run
dynamic
programming
along
with
image
pyramiding,
where the
dynamic programming is run on the disparity
estimates output
by
every
pyramid
level.
The
results
compare
well
with
the
highest-quality
results we have obtained so far, and
are still achieved at a reduced
computational
burden
versus
basic
block
is
also
possible
to
use sub-pixel methods with dynamic
programming, and we show the results
of
all three techniques in the second image. As
before, subpixel
refinement
reduces
contouring
effects
and
clearly
improves
accuracy.
The
previous code has been bundled into
vipstereo_blockmatch_ereo_blockmatch_combi
ned.m, which
exposes all of the options
previously presented as parameter-value
idDynamic =
vipstereo_blockm
atch_combined(leftI,rightI, ...
'NumPyramids',3, 'DisparityRange',4,
'DynamicProgramming',true, ...
'Waitbar', true, ...
'WaitbarTitle', 'Performing combined pyramid and
dynamic
programming');
figure(3), clf;
imshow(DpyramidDynamic,[]),
axis('image'), colorbar, colormap jet;
caxis([0 disparityRange]);
title('4-level pyramid with dynamic
programming');
DdynamicSubpixel =
vipstereo_blockmatch_combined(leftI,rightI, ...
'NumPyramids',3,
'DisparityRange',4,
'DynamicProgramming',true, ...
'Subpixel', true, ...
'Waitbar', true, ...