Rowley, Baluja, and Kanade: Neural Network-Based Face Detection (PAMI, January 1998)
Copyright 1998 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Neural Network-Based Face Detection
Henry A. Rowley, Shumeet Baluja, and Takeo Kanade
Abstract
We present a neural network-based upright frontal face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We present a straightforward procedure for aligning positive face examples for training. To collect negative examples, we use a bootstrap algorithm, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting nonface training examples, which must be chosen to span the entire space of nonface images. Simple heuristics, such as using the fact that faces rarely overlap in images, can further improve the accuracy. Comparisons with several other state-of-the-art face detection systems are presented, showing that our system has comparable performance in terms of detection and false-positive rates.
Keywords: Face detection, Pattern recognition, Computer vision, Artificial neural networks, Machine learning
1 Introduction
In this paper, we present a neural network-based algorithm to detect upright, frontal views of faces in gray-scale images. The algorithm works by applying one or more neural networks directly to portions of the input image, and arbitrating their results. Each network is trained to output the presence or absence of a face. The algorithms and training methods are designed to be general, with little customization for faces.
Many face detection researchers have used the idea that facial images can be characterized directly in terms of pixel intensities. These images can be characterized by probabilistic models of the set of face images [4, 13, 15], or implicitly by neural networks or other mechanisms [3, 12, 14, 19, 21, 23, 25, 26]. The parameters for these models are adjusted either automatically from example images (as in our work) or by hand. A few authors have taken the approach of extracting features and applying either manually or automatically generated rules for evaluating these features [7, 11].
Training a neural network for the face detection task is challenging because of the difficulty in characterizing prototypical “nonface” images. Unlike face recognition, in which the classes to be discriminated are different faces, the two classes to be discriminated in face detection are “images containing faces” and “images not containing faces”. It is easy to get a representative sample of images which contain faces, but much harder to get a representative sample of those which do not. We avoid the problem of using a huge training set for nonfaces by selectively adding images to the
training set as training progresses [21]. This “bootstrap” method reduces the size of the training set needed. The use of arbitration between multiple networks and heuristics to clean up the results significantly improves the accuracy of the detector.
Detailed descriptions of the example collection and training methods, network architecture, and arbitration methods are given in Section 2. In Section 3, the performance of the system is examined. We find that the system is able to detect 90.5% of the faces over a test set of 130 complex images, with an acceptable number of false positives. Section 4 briefly discusses some techniques that can be used to make the system run faster, and Section 5 compares this system with similar systems. Conclusions and directions for future research are presented in Section 6.
2 Description of the System
Our system operates in two stages: it first applies a set of neural network-based filters to an image, and then uses an arbitrator to combine the outputs. The filters examine each location in the image at several scales, looking for locations that might contain a face. The arbitrator then merges detections from individual filters and eliminates overlapping detections.
2.1 Stage One: A Neural Network-Based Filter
The first component of our system is a filter that receives as input a 20x20 pixel region of the image, and generates an output ranging from 1 to -1, signifying the presence or absence of a face, respectively. To detect faces anywhere in the input, the filter is applied at every location in the image. To detect faces larger than the window size, the input image is repeatedly reduced in size (by subsampling), and the filter is applied at each size. This filter must have some invariance to position and scale. The amount of invariance determines the number of scales and positions at which it must be applied. For the work presented here, we apply the filter at every pixel position in the image, and scale the image down by a factor of 1.2 for each step in the pyramid.
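This scanning scheme can be sketched in a few lines. This is not the authors' code; the 20x20 window size and the 1.2 scale step come from the text, while the function names are illustrative:

```python
# Sketch of the pyramid scanning scheme: the 20x20 filter is applied at
# every pixel position of every pyramid level, with levels shrinking by 1.2x.

def pyramid_scales(width, height, window=20, step=1.2):
    """Scale factors at which the filter is applied, largest image first."""
    scales = []
    scale = 1.0
    # Keep shrinking until the image is smaller than one filter window.
    while width / scale >= window and height / scale >= window:
        scales.append(scale)
        scale *= step
    return scales

def window_positions(width, height, window=20):
    """Top-left corners of all window-sized regions at one pyramid level."""
    return [(x, y)
            for y in range(height - window + 1)
            for x in range(width - window + 1)]
```

At each scale the image is subsampled by the corresponding factor, and every returned position is preprocessed and passed to the network.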
The filtering algorithm is shown in Fig. 1. First, a preprocessing step, adapted from [21], is
applied to a window of the image. The window is then passed through a neural network, which decides whether the window contains a face. The preprocessing first attempts to equalize the intensity values across the window. We fit a function which varies linearly across the window to the intensity values in an oval region inside the window. Pixels outside the oval (shown in Fig. 2a) may represent the background, so those intensity values are ignored in computing the lighting variation across the face. The linear function will approximate the overall brightness of each part of the window, and can be subtracted from the window to compensate for a variety of lighting conditions. Then histogram equalization is performed, which non-linearly maps the intensity values to expand the range of intensities in the window. The histogram is computed for pixels inside an oval region in the window. This compensates for differences in camera input gains, as well as improving contrast in some cases. The preprocessing steps are shown in Fig. 2.
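The two preprocessing steps can be sketched as follows, assuming the window is a NumPy array and `mask` is a boolean array marking the oval region; the exact oval shape and the histogram bin count are our assumptions, not details taken from the paper:

```python
import numpy as np

def lighting_correction(window, mask):
    """Fit a plane a*x + b*y + c to the masked (oval) pixels by least
    squares and subtract it from the whole window."""
    ys, xs = np.nonzero(mask)
    A = np.stack([xs, ys, np.ones_like(xs)], axis=1).astype(float)
    coeffs, *_ = np.linalg.lstsq(A, window[ys, xs].astype(float), rcond=None)
    yy, xx = np.mgrid[0:window.shape[0], 0:window.shape[1]]
    plane = coeffs[0] * xx + coeffs[1] * yy + coeffs[2]
    return window - plane

def histogram_equalization(window, mask, levels=256):
    """Map intensities through the CDF computed over the oval pixels."""
    hist, edges = np.histogram(window[mask], bins=levels)
    cdf = hist.cumsum() / hist.sum()
    idx = np.clip(np.digitize(window, edges[1:-1]), 0, levels - 1)
    return cdf[idx] * (levels - 1)

# Demo on a vertical brightness ramp: the fitted plane removes it exactly.
window = np.outer(np.arange(20.0), np.ones(20)) + 5.0
oval = np.ones((20, 20), dtype=bool)   # stand-in mask; real code uses an oval
flat = lighting_correction(window, oval)
equalized = histogram_equalization(window, oval)
```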
The preprocessed window is then passed through a neural network. The network has retinal connections to its input layer; the receptive fields of hidden units are shown in Fig. 1. There are three types of hidden units: 4 which look at 10x10 pixel subregions, 16 which look at 5x5 pixel subregions, and 6 which look at overlapping 20x5 pixel horizontal stripes. Each of these types was chosen to allow the hidden units to detect local features that might be important for face detection. In particular, the horizontal stripes allow the hidden units to detect such features as mouths or pairs of eyes, while the hidden units with square receptive fields might detect features such as individual eyes, the nose, or corners of the mouth. Although the figure shows a single hidden unit for each subregion of the input, these units can be replicated. For the experiments which are described later, we use networks with two and three sets of these hidden units. Similar input connection patterns are commonly used in speech and character recognition tasks [10, 24]. The network has a single, real-valued output, which indicates whether or not the window contains a face.
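The three receptive-field families can be enumerated directly. The counts (4, 16, and 6) come from the text, but the 3-pixel stripe step, chosen to yield exactly six overlapping 20x5 stripes, is our assumption:

```python
def receptive_fields(window=20):
    """Enumerate receptive fields as (x, y, width, height) tuples."""
    fields = []
    for y in range(0, window, 10):          # 4 units over 10x10 quadrants
        for x in range(0, window, 10):
            fields.append((x, y, 10, 10))
    for y in range(0, window, 5):           # 16 units over 5x5 subregions
        for x in range(0, window, 5):
            fields.append((x, y, 5, 5))
    for y in range(0, window - 5 + 1, 3):   # 6 overlapping 20x5 stripes
        fields.append((0, y, 20, 5))
    return fields
```

Each tuple corresponds to one hidden unit connected only to the pixels inside that region, which is what gives the network its "retinal" connection pattern.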
Examples of output from a single network are shown in Fig. 3. In the figure, each box represents the position and size of a window to which the neural network gave a positive response. The network has some invariance to position and scale, which results in multiple boxes around some faces. Note also that there are some false detections; they will be eliminated by methods presented in Section 2.2.
To train the neural network used in stage one to serve as an accurate filter, a large number of face and nonface images are needed. Nearly 1050 face examples were gathered from face databases at CMU, Harvard, and from the World Wide Web. The images contained faces of various sizes, orientations, positions, and intensities. The eyes, tip of nose, and corners and center of the mouth of each face were labelled manually. These points were used to normalize each face to the same scale, orientation, and position, as follows:
1. Initialize F̄, a vector which will be the average positions of each labelled feature over all the faces, with the feature locations in the first face, F_1.
2. The feature coordinates in F̄ are rotated, translated, and scaled, so that the average locations of the eyes will appear at predetermined locations in a 20x20 pixel window.
3. For each face i, compute the best rotation, translation, and scaling to align the face's features F_i with the average feature locations F̄. Such transformations can be written as a linear function of their parameters. Thus, we can write a system of linear equations mapping the features from F_i to F̄. The least squares solution to this over-constrained system yields the parameters for the best alignment transformation. Call the aligned feature locations F'_i.
4. Update F̄ by averaging the aligned feature locations F'_i for each face i.
5. Go to step 2.
The alignment algorithm converges within five iterations, yielding for each face a function which maps that face to a 20x20 pixel window.
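Step 3 of the procedure above reduces to an ordinary least-squares problem. The sketch below is ours rather than the authors': it parameterizes rotation and scale jointly as (a, b) and translation as (tx, ty), so each feature point contributes two linear equations.

```python
import numpy as np

def align_params(src, dst):
    """Least-squares similarity transform mapping src points onto dst:
    x' = a*x - b*y + tx,  y' = b*x + a*y + ty."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, -y, 1.0, 0.0]); rhs.append(u)
        rows.append([y,  x, 0.0, 1.0]); rhs.append(v)
    params, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return params  # a, b, tx, ty

def apply_transform(params, pts):
    a, b, tx, ty = params
    return [(a * x - b * y + tx, b * x + a * y + ty) for x, y in pts]
```

Iterating "align every face to the running average, then re-average" a handful of times reproduces the five-iteration convergence noted in the text.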
Fifteen face examples are generated for the training set from each original image, by randomly rotating the images (about their center points) up to 10°, scaling between 90% and 110%, translating up to half a pixel, and mirroring. Each 20x20 window in the set is then preprocessed (by applying lighting correction and histogram equalization). A few example images are shown in Fig. 4.
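A minimal sketch of sampling the fifteen-fold augmentation parameters, using the ranges quoted above (rotation within ±10°, scale 90-110%, translation up to half a pixel, optional mirroring); applying these to pixels would need an image-warping routine, which is omitted here:

```python
import random

def augmentation_params(n=15, seed=0):
    """Random per-example variations for one aligned face."""
    rng = random.Random(seed)
    return [dict(angle=rng.uniform(-10.0, 10.0),   # degrees
                 scale=rng.uniform(0.9, 1.1),
                 dx=rng.uniform(-0.5, 0.5),        # pixels
                 dy=rng.uniform(-0.5, 0.5),
                 mirror=rng.random() < 0.5)
            for _ in range(n)]
```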
The randomization gives the filter invariance to translations of less than a pixel and scalings of 20%. Larger changes in translation and scale are dealt with by applying the filter at every pixel position in an image pyramid, in which the images are scaled by factors of 1.2.
Practically any image can serve as a nonface example because the space of nonface images is much larger than the space of face images. However, collecting a “representative” set of nonfaces
is difficult. Instead of collecting the images before training is started, the images are collected during training, in the following manner, adapted from [21]:
1. Create an initial set of nonface images by generating 1000 random images. Apply the preprocessing steps to each of these images.
2. Train a neural network to produce an output of 1 for the face examples, and -1 for the nonface examples. The training algorithm is standard error backpropagation with momentum [8]. On the first iteration of this loop, the network's weights are initialized randomly. After the first iteration, we use the weights computed by training in the previous iteration as the starting point.
3. Run the system on an image of scenery which contains no faces. Collect subimages in which the network incorrectly identifies a face (an output activation > 0).
4. Select up to 250 of these subimages at random, apply the preprocessing steps, and add them into the training set as negative examples. Go to step 2.
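The four steps can be sketched as a loop. Everything here is schematic: `train` and `detect` stand in for the backpropagation trainer and the filter, and the tiny initial negative set stands in for the 1000 random images:

```python
import random

def bootstrap_negatives(train, detect, faces, scenery_windows,
                        iterations=3, cap=250, seed=0):
    """Bootstrap loop: train, scan face-free scenery, and fold up to
    `cap` false detections back into the negative training set."""
    rng = random.Random(seed)
    negatives = [rng.random() for _ in range(10)]  # stand-in for step 1
    model = None
    for _ in range(iterations):                    # steps 2-4
        model = train(faces, negatives)
        false_hits = [w for w in scenery_windows if detect(model, w)]
        rng.shuffle(false_hits)                    # pick up to `cap` at random
        negatives.extend(false_hits[:cap])
    return model, negatives
```

With a real network, `detect` would report windows whose output activation exceeds the face threshold; here any classifier with the same interface works.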
Some examples of nonfaces that are collected during training are shown in Fig. 5. Note that some of the examples resemble faces, although they are not very close to the positive examples shown in Fig. 4. The presence of these examples forces the neural network to learn the precise boundary between face and nonface images.
We used 120 images of scenery for collecting negative examples in the bootstrap manner described above. A typical training run selects approximately 8000 nonface images from the 146,212,178 subimages that are available at all locations and scales
in the training scenery images. A similar training algorithm was described in [5], where at each iteration an entirely new network was trained with the examples on which the previous networks had made mistakes.
2.2 Stage Two: Merging Overlapping Detections and Arbitration
The examples in Fig. 3 showed that the raw output from a single network will contain a number of false detections. In this section, we present two strategies to improve the reliability of the detector: merging overlapping detections from a single network and arbitrating among multiple networks.
2.2.1 Merging Overlapping Detections
Note that in Fig. 3, most faces are detected at multiple nearby positions or scales, while false detections often occur with less consistency. This observation leads to a heuristic which can eliminate many false detections. For each location and scale, the number of detections within a specified neighborhood of that location can be counted. If the number is above a threshold, then that location is classified as a face. The centroid of the nearby detections defines the location of the detection result, thereby collapsing multiple detections. In the experiments section, this heuristic will be referred to as “thresholding”.
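A sketch of the thresholding heuristic in a single scale (a full implementation also counts detections across neighboring pyramid levels); the `radius` and `threshold` values are illustrative, not taken from the paper:

```python
def merge_detections(detections, radius=2, threshold=2):
    """Count detections within `radius` of each detection; when the count
    reaches `threshold`, emit the centroid of that neighborhood as one face."""
    merged = []
    used = set()
    for i, (x, y) in enumerate(detections):
        if i in used:
            continue
        nearby = [j for j, (u, v) in enumerate(detections)
                  if abs(u - x) <= radius and abs(v - y) <= radius]
        if len(nearby) >= threshold:
            cx = sum(detections[j][0] for j in nearby) / len(nearby)
            cy = sum(detections[j][1] for j in nearby) / len(nearby)
            merged.append((cx, cy))    # collapse the cluster to its centroid
            used.update(nearby)
    return merged
```

Isolated responses never reach the threshold and are discarded, which is exactly why inconsistent false detections are suppressed while repeatedly detected faces survive.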