Digital Image Processing and Edge Detection

Digital Image Processing
Interest in digital image processing methods stems from two principal application areas: improvement of pictorial information for human interpretation; and processing of image data for storage, transmission, and representation for autonomous machine perception.
An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. Note that a digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a digital image.
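As a minimal sketch of this definition, a digital image can be held as a NumPy array: the indices act as the spatial coordinates (x, y) and the stored values are the gray levels (the array contents here are illustrative):

```python
import numpy as np

# A tiny 3x3 digital image: a finite grid of discrete intensity values.
f = np.array([[ 10,  12,  11],
              [ 13, 200, 198],
              [ 11, 199, 201]], dtype=np.uint8)

x, y = 1, 2
print(f[x, y])   # the gray level of the pixel at coordinates (x, y) -> 198
```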
Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves. They can operate on images generated by sources that humans are not accustomed to associating with images. These include ultrasound, electron microscopy, and computer-generated images. Thus, digital image processing encompasses a wide and varied field of applications.
There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images. We believe this to be a limiting and somewhat artificial boundary. For example, under this definition, even the trivial task of computing the average intensity of an image (which yields a single number, as sketched below) would not be considered an image processing operation.
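For concreteness, the trivial computation in question is a one-liner (assuming NumPy; the image is illustrative):

```python
import numpy as np

f = np.random.randint(0, 256, size=(4, 4))   # any digital image
mean_intensity = f.mean()                    # a single number, not an image
```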
On the other hand, there are fields such as computer vision whose ultimate goal is to use computers to emulate human vision, including learning and being able to make inferences and take actions based on visual inputs. This area itself is a branch of artificial intelligence (AI) whose objective is to emulate human intelligence. The field of AI is still in its infancy in terms of development, with progress having been much slower than originally anticipated.
The area of image analysis (also called image understanding) is in between image processing and computer vision. There are no clear-cut boundaries in the continuum from image processing at one end to computer vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and high-level processes. Low-level processes involve primitive operations such as image preprocessing to reduce noise, contrast enhancement, and image sharpening. A low-level process is characterized by the fact that both its inputs and outputs are images. Mid-level processing on images involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. A mid-level process is characterized by the fact that its inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours, and the identity of individual objects). Finally, higher-level processing involves “making sense” of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with vision.
Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an image. Thus, what we call in this book digital image processing encompasses processes whose inputs and outputs are images and, in addition, encompasses processes that extract attributes from images, up to and including the recognition of individual objects.
As a simple illustration to clarify these concepts, consider the area of automated analysis of text. The processes of acquiring an image of the area containing the text, preprocessing that image, extracting (segmenting) the individual characters, describing the characters in a form suitable for computer processing, and recognizing those individual characters are in the scope of what we call digital image processing in this book. Making sense of the content of the page may be viewed as being in the domain of image analysis and even computer vision, depending on the level of complexity implied by the statement “making sense.”
As will become evident shortly, digital image processing, as we have defined it, is used successfully in a broad range of areas of exceptional social and economic value.
The areas of application of digital image processing are so varied that some form of organization is desirable in attempting to capture the breadth of this field. One of the simplest ways to develop a basic understanding of the extent of image processing applications is to categorize images according to their source (e.g., visual, X-ray, and so on). The principal energy source for images in use today is the electromagnetic energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic (in the form of electron beams used in electron microscopy). Synthetic images, used for modeling and visualization, are generated by computer. In this section we discuss briefly how images are generated in these various categories and the areas in which they are applied.
Images based on radiation from the EM spectrum are the most familiar, especially images in the X-ray and visual bands of the spectrum. Electromagnetic waves can be conceptualized as propagating sinusoidal waves of varying wavelengths, or they can be thought of as a stream of massless particles, each traveling in a wavelike pattern and moving at the speed of light. Each massless particle contains a certain amount (or bundle) of energy. Each bundle of energy is called a photon. If spectral bands are grouped according to energy per photon, we obtain the spectrum shown in the figure below, ranging from gamma rays (highest energy) at one end to radio waves (lowest energy) at the other. The bands are shown shaded to convey the fact that bands of the EM spectrum are not distinct but rather transition smoothly from one to the other.
Image acquisition is the first process. Note that acquisition could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage involves preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is when we increase the contrast of an image because “it looks better.” It is important to keep in mind that enhancement is a very subjective area of image processing.
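As one deliberately simple example of enhancement, the sketch below linearly stretches an image’s intensity range to fill [0, 255]; it assumes NumPy, and the sample image is illustrative:

```python
import numpy as np

def stretch_contrast(f):
    """Linearly map an image's full intensity range onto [0, 255]."""
    f = f.astype(np.float64)
    lo, hi = f.min(), f.max()
    if hi == lo:                       # flat image: nothing to stretch
        return np.zeros_like(f, dtype=np.uint8)
    g = (f - lo) / (hi - lo) * 255.0
    return g.astype(np.uint8)

dim = np.array([[60, 70], [80, 90]], dtype=np.uint8)   # low-contrast image
print(stretch_contrast(dim))   # values now span 0..255
```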
Image restoration is an area that also deals with improving the appearance of an image. However, unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a “good” enhancement result.
Color image processing is an area that has been gaining in importance because of the significant increase in the use of digital images over the Internet. It covers a number of fundamental concepts in color models and basic color processing in a digital domain. Color is used also in later chapters as the basis for extracting features of interest in an image.
Wavelets are the foundation for representing images in various degrees of resolution. In particular, this material is used in this book for image data compression and for pyramidal representation, in which images are subdivided successively into smaller regions.
Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is true particularly in uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar (perhaps inadvertently) to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape. The material in this chapter begins a transition from processes that output images to processes that output image attributes.
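A minimal sketch of two basic morphological tools, erosion and dilation, assuming SciPy’s ndimage module; the binary object and structuring element are illustrative:

```python
import numpy as np
from scipy import ndimage

mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:6] = True                        # a small rectangular object

se = np.ones((3, 3), dtype=bool)             # 3x3 structuring element
eroded  = ndimage.binary_erosion(mask, structure=se)    # shrinks the object
dilated = ndimage.binary_dilation(mask, structure=se)   # grows the object
boundary = dilated & ~eroded                 # a crude morphological boundary
```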
Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually. On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.
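One of the simplest segmentation procedures is global thresholding. The sketch below, assuming NumPy and scikit-image (and a synthetic image), uses Otsu’s method to pick the threshold automatically:

```python
import numpy as np
from skimage import filters

# A synthetic image: a bright square on a dark, noisy background.
rng = np.random.default_rng(0)
f = rng.normal(50, 5, size=(64, 64))
f[20:40, 20:40] += 120

t = filters.threshold_otsu(f)   # data-driven global threshold
objects = f > t                 # boolean mask of the segmented object
```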
Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region.
Boundary representation is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. In some applications, these representations complement each other. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. A method must also be specified for describing the data so that features of interest are highlighted.
Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
Recognition is the process that assigns a label (e.g., “vehicle”) to an object based on its descriptors. As detailed before, we conclude our coverage of digital image processing with the development of methods for recognition of individual objects.
So far we have said nothing about the need for prior knowledge or about the interaction between the knowledge base and the processing modules in Fig. 2 above. Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base also can be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem or an image database containing high-resolution satellite images of a region in connection with change-detection applications.

In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. This distinction is made in Fig. 2 above by the use of double-headed arrows between the processing modules and the knowledge base, as opposed to single-headed arrows linking the processing modules.
Edge detection
Edge detection is a term in image processing and computer vision, particularly in the areas of feature detection and feature extraction, referring to algorithms which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. Although point and line detection certainly are important in any discussion on segmentation, edge detection is by far the most common approach for detecting meaningful discontinuities in gray level.
Although certain literature has considered the detection of ideal step edges, the edges obtained from natural images are usually not at all ideal step edges. Instead they are normally affected by one or several of the following effects: focal blur caused by a finite depth-of-field and finite point spread function; penumbral blur caused by shadows created by light sources of non-zero radius; shading at a smooth object edge; and local specularities or interreflections in the vicinity of object edges.
A typical edge might for instance be the border between a block of red color and a block of yellow. In contrast, a line (as can be extracted by a ridge detector) can be a small number of pixels of a different color on an otherwise unchanging background. For a line, there will therefore usually be one edge on each side of the line.
To illustrate why edge detection is not a trivial task, let us consider the problem of detecting edges in the following one-dimensional signal. Here, we may intuitively say that there should be an edge between the 4th and 5th pixels.
5   7   6   4   152   148   149
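A quick check with first differences (assuming NumPy) makes the intuition explicit:

```python
import numpy as np

signal = np.array([5, 7, 6, 4, 152, 148, 149])
print(np.diff(signal))   # [  2  -1  -2 148  -4   1]: the one large jump
                         # sits between the 4th and 5th pixels
```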
If the intensity difference were smaller between the 4th and the 5th pixels, and if the intensity differences between the adjacent neighbouring pixels were higher, it would not be as easy to say that there should be an edge in the corresponding region. Moreover, one could argue that this case is one in which there are several edges. Hence, to firmly state a specific threshold on how large the intensity change between two neighbouring pixels must be for us to say that there should be an edge between these pixels is not always a simple problem.
Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled.
There are many methods for edge detection, but most of them can be grouped into two categories: search-based and zero-crossing based. The search-based methods detect edges by first computing a measure of edge strength, usually a first-order derivative expression such as the gradient magnitude, and then searching for local directional maxima of the gradient magnitude using a computed estimate of the local orientation of the edge, usually the gradient direction.
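A minimal sketch of this first stage, assuming SciPy; the Sobel operator stands in for the generic first-order derivative filter, and the step-edge image is illustrative:

```python
import numpy as np
from scipy import ndimage

f = np.zeros((32, 32)); f[:, 16:] = 255.0   # a vertical step edge

gx = ndimage.sobel(f, axis=1)   # derivative estimate along x (columns)
gy = ndimage.sobel(f, axis=0)   # derivative estimate along y (rows)

magnitude = np.hypot(gx, gy)    # edge strength at each pixel
direction = np.arctan2(gy, gx)  # local gradient orientation
```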
The zero-crossing based methods search for zero crossings in a second-order derivative expression computed from the image in order to find edges, usually the zero-crossings of the Laplacian or the zero-crossings of a non-linear differential expression, as will be described in the section on differential edge detection below.
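A sketch of the zero-crossing idea, assuming SciPy; the Laplacian-of-Gaussian response is scanned for sign changes between neighbouring pixels (the test below is deliberately crude):

```python
import numpy as np
from scipy import ndimage

f = np.zeros((32, 32)); f[:, 16:] = 255.0      # a vertical step edge
log = ndimage.gaussian_laplace(f, sigma=2.0)   # second-order (LoG) response

# A pixel is an edge candidate where the response changes sign between
# horizontal or vertical neighbours.
zc = ((log[:-1, :] * log[1:, :] < 0)[:, :-1] |
      (log[:, :-1] * log[:, 1:] < 0)[:-1, :])
```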
As a pre-processing step to edge detection, a smoothing stage, typically Gaussian smoothing, is almost always applied (see also noise reduction).
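For example (assuming SciPy; the value of sigma is an illustrative choice):

```python
import numpy as np
from scipy import ndimage

noisy = np.random.default_rng(0).normal(128, 20, size=(64, 64))
smoothed = ndimage.gaussian_filter(noisy, sigma=1.5)  # suppress noise first
```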
The edge detection methods that have been published mainly differ in the types of smoothing filters that are applied and the way the measures of edge strength are computed. As many edge detection methods rely on the computation of image gradients, they also differ in the types of filters used for computing gradient estimates in the x- and y-directions.
Once we have computed a measure of edge strength (typically the gradient magnitude), the next stage is to apply a threshold, to decide whether edges are present or not at an image point. The lower the threshold, the more edges will be detected, and the result will be increasingly susceptible to noise and to picking out irrelevant features from the image. Conversely, a high threshold may miss subtle edges, or result in fragmented edges.
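Continuing the gradient sketch above (assuming SciPy; the threshold value is illustrative):

```python
import numpy as np
from scipy import ndimage

f = np.zeros((32, 32)); f[:, 16:] = 255.0
magnitude = np.hypot(ndimage.sobel(f, axis=1), ndimage.sobel(f, axis=0))

T = 100.0                 # illustrative threshold on edge strength
edges = magnitude > T     # lower T -> more (noisier) edges; higher T -> gaps
```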
If the edge thresholding is applied to just the gradient magnitude image, the resulting edges will in general be thick, and some type of edge thinning post-processing is necessary.
For edges detected with non-maximum suppression, however, the edge curves are thin by definition and the edge pixels can be linked into edge polygons by an edge linking (edge tracking) procedure.
On a discrete grid, the non-maximum suppression stage can be implemented by estimating the gradient direction using first-order derivatives, then rounding off the gradient direction to multiples of 45 degrees, and finally comparing the values of the gradient magnitude in the estimated gradient direction.
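A sketch of that procedure, assuming NumPy and the magnitude and direction arrays from the gradient sketch above (an illustrative, unoptimized implementation, not a canonical one):

```python
import numpy as np

def non_max_suppression(magnitude, direction):
    """Keep a pixel only if it is a local maximum along the gradient
    direction, quantized to multiples of 45 degrees."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    angle = np.rad2deg(direction) % 180.0          # fold into [0, 180)
    offsets = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Round the direction to the nearest of the four axes.
            q = min(offsets, key=lambda a: min(abs(angle[i, j] - a),
                                               180 - abs(angle[i, j] - a)))
            di, dj = offsets[q]
            # Compare against both neighbours along the gradient direction.
            if (magnitude[i, j] >= magnitude[i + di, j + dj] and
                    magnitude[i, j] >= magnitude[i - di, j - dj]):
                out[i, j] = magnitude[i, j]
    return out
```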
A commonly used approach to handle the problem of appropriate thresholds for thresholding is by using thresholding with hysteresis. This method uses multiple thresholds to find edges. We begin by using the upper threshold to find the start of an edge. Once we have a start point, we then trace the path of the edge through the image pixel by pixel, marking an edge whenever we are above the lower threshold. We stop marking our edge only when the value falls below our lower threshold. This approach makes the assumption that edges are likely to be in continuous curves, and allows us to follow a faint section of an edge we have previously seen, without meaning that every noisy pixel in the image is marked down as an edge. Still, however, we have the problem of choosing appropriate thresholding parameters.
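A compact sketch of hysteresis thresholding, assuming SciPy; instead of tracing pixel by pixel, it uses connected-component labelling, which yields the same keep/discard decision for each weak pixel:

```python
import numpy as np
from scipy import ndimage

def hysteresis(magnitude, low, high):
    """Keep weak edge pixels only if they connect to a strong one."""
    strong = magnitude >= high          # definite edges
    weak = magnitude >= low             # candidate edges (strong is a subset)
    labels, n = ndimage.label(weak)     # connected regions of candidates
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True   # regions touching a strong pixel
    keep[0] = False                          # background label
    return keep[labels]
```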