Machine Learning Course Project (English Report)
The Technology of Topical Web Mining Based on Machine Learning
This article studies and analyses Web mining and machine learning. Machine learning is an important branch of artificial intelligence. This paper presents the model, classification, and development process of machine learning, and describes its application in the field of Web mining. Calculating a Web page's authority ratio is an important issue in Web mining. Based on the HITS algorithm, we propose a new algorithm for calculating page importance: the WHITS algorithm.
Keywords: Web Mining; Machine Learning; HITS
Introduction:
With the rapid growth of online information resources, people are increasingly concerned with how to extract potentially valuable information from the mass of network information quickly and efficiently, so that it can play an effective role in management and decision-making.
But when users face this massive, heterogeneous, semi-structured repository of information, they often find that searching for the information they need takes a great deal of time and effort, and that it may still be hard to find. The result is information overload combined with a lack of knowledge.
Topical Web mining has become a research topic of widespread concern. It exploits the distribution characteristics of topic pages on the Web: given a target topic defined by the user or by the system, it crawls Web pages intelligently with the goal of collecting topic-related pages, analyses and processes the collected pages, and finally presents the processed page collection to users in a form that is flexible and convenient to retrieve.
Results from many research projects show that topical Web mining methods can maintain a high degree of accuracy on topic-related queries, which improves the efficiency of user queries and provides a new research direction.
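The crawl-and-filter process described above can be illustrated with a minimal sketch. It is not the method proposed in this report. It assumes a hypothetical in-memory link graph (`PAGES`) in place of real HTTP fetching, and a simple keyword-overlap measure (`relevance`) as a stand-in for a learned topic classifier. All names and the threshold value are illustrative assumptions.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP instead.
PAGES = {
    "a": ("machine learning for web mining", ["b", "c"]),
    "b": ("web mining with link analysis", ["d"]),
    "c": ("cooking recipes and travel notes", ["e"]),
    "d": ("machine learning classification models", []),
    "e": ("web mining of cooking sites", []),
}


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text (assumed measure)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(seed, topic_terms, threshold=0.5):
    """Breadth-first crawl that only follows links out of relevant pages."""
    topic_terms = {t.lower() for t in topic_terms}
    queue, seen, collected = deque([seed]), {seed}, []
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        if relevance(text, topic_terms) < threshold:
            continue  # off-topic page: do not collect it or expand its links
        collected.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected
```

With the topic terms `["web", "mining", "machine", "learning"]` and seed `"a"`, the sketch collects pages `a`, `b`, and `d`. Page `e` is on topic but is never visited, because it is reachable only through the off-topic page `c`. Pruning at off-topic pages keeps the crawl small, at the cost of missing relevant pages that sit behind them.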
1 Introduction
1.1 Background
With the rapid development of the Internet, the network's influence on our lives has been growing, and it has become an essential and important means of obtaining information. The World Wide Web (WWW), the fastest-growing technology on the Internet, is intuitive, easy to use, and richly expressive, and it has developed into a global information space. With the advent and development of the information age, information on the Web has grown rapidly.
On 23 May 2007, the China Internet Network Information Center (CNNIC) issued the Nineteenth Statistical Report on Internet Development in China in Beijing. According to the report, the number of Internet users in China reached 137 million, an increase of 23.4% over the previous year, of which the number of broadband Internet users exceeded one million. At present, China's number of Internet users ranks second in the world.
But faced with this massive, scattered, and disorderly store of information, Web users often find it difficult to locate what they need or what interests them, resulting in the present state of information overload combined with a lack of knowledge. On the one hand, online information is rich and varied; on the other hand, users cannot find the information they need. The main reason for this contradiction is that it is difficult for people to find the required information in such a large repository by browsing on their own. This phenomenon prompted the emergence of Web-based search engines and of information retrieval technology for extracting useful network resources.
Google, Infoseek, Baidu, AltaVista, Skynet, and other well-known search engines are the outcome of efforts to solve the problem of online information retrieval, following extensive research in the field of information retrieval.
1.2 Web content mining and related research
Web mining [1] is the discovery and extraction of potentially interesting patterns and hidden information from Web pages and from Web users' access activities. Its goal is to mine useful knowledge from the Web. It builds on data mining, text mining, and related mining techniques, and makes integrated use of computer networks, databases and data warehousing, artificial intelligence, information retrieval, visualization, and natural language understanding. It is an emerging discipline that combines traditional data mining technology with the Web.
This section introduces research in fields related to Web mining, as well as their connections with Web mining.