
Published January 29, 2021 (author: acutely)

Protein Families and Domains

Databases

1.1 Protein motif and domain databases

Motifs and domains

PROSITE database

PRINTS database

BLOCKS database

ProDom database

Pfam database

SMART database

InterPro database

Conserved Domain Database

CDART

Motifs and domains:

Biologists can gain insight into protein function based on the identification of short consensus sequences related to known functions. These consensus sequence patterns are termed motifs and domains.

A motif is a short conserved sequence pattern associated with distinct functions of a protein or DNA.

It is often associated with a distinct structural site performing a particular function.

A typical motif, such as a Zn-finger motif, is ten to twenty amino acids long.

A domain is also a conserved sequence pattern, defined as an independent functional and structural unit.

Domains are normally longer than motifs.

A domain consists of more than 40 residues and up to 700 residues, with an average length of 100 residues.

A domain may or may not include motifs within its boundaries.

Examples: transmembrane domains, ligand-binding domains.

Identification of motifs and domains relies heavily on multiple sequence alignment as well as on profile and hidden Markov model (HMM) construction.

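PROSITE (database of protein families and domains):

The first established sequence pattern database. /prosite/

It is a database of protein families and domains that contains biologically significant sites, patterns, and statistical profiles that help identify the protein family a sequence belongs to.

The sequence patterns in PROSITE cover enzyme catalytic sites, ligand-binding sites, metal-ion-binding residues, cysteines involved in disulfide bonds, and regions that bind small molecules or other proteins.

PROSITE also includes statistical sequence profiles built from multiple sequence alignments, which detect more sensitively whether an (unknown) sequence carries the corresponding signature.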

The functional information of these patterns is primarily based on published literature.
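A PROSITE-style pattern can be read as a regular expression over amino acid letters. The sketch below is a minimal translator for the common syntax elements (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats, `<` and `>` for the sequence ends). The function names and the test sequence are illustrative assumptions, and the pattern is a C2H2 zinc-finger-like example written in that syntax; this is not PROSITE's own scanning software.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")  # patterns are conventionally ended by a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # forbidden residues
            core = "[^" + core[1:-1] + "]"
        elif not re.fullmatch(r"[A-Z]|\[[A-Z]+\]", core):
            raise ValueError(f"unsupported element: {element!r}")
        repeat = ""
        if lo is not None:
            repeat = "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_pattern(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (start, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Illustrative pattern and made-up sequence (assumptions, not database entries).
ZN_FINGER_LIKE = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
TEST_SEQUENCE = "MKCAACAAALAAAAAAAAHAAAHGG"

if __name__ == "__main__":
    print(prosite_to_regex(ZN_FINGER_LIKE))
    print(find_pattern(TEST_SEQUENCE, ZN_FINGER_LIKE))
```

A regular expression only answers yes or no at each position, which is why the profiles mentioned above are the more sensitive option for divergent sequences.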

PRINTS (protein motif fingerprint database):

A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite. Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D space. /dbbrowser/PRINTS/

The site provides protein homology analysis, protein motif fingerprint analysis, phylogenetic and sequence evolution analysis, and microarray analysis, and offers downloads of bioinformatics and PRINTS database data.
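The idea of a fingerprint as several non-overlapping motifs found in order along a sequence can be sketched as follows. This is a simplified illustration only: the motifs are written as regular expressions and matched greedily from left to right, and both the motifs and the sequences are made up. It does not reproduce how PRINTS itself encodes or scores motifs.

```python
import re
from typing import List, Optional, Tuple


def match_fingerprint(sequence: str, motifs: List[str]) -> List[Optional[Tuple[int, int]]]:
    """Locate each motif in order, left to right, without overlaps.

    Returns one (start, end) span per motif, or None where a motif is missing.
    """
    hits: List[Optional[Tuple[int, int]]] = []
    cursor = 0  # the next motif must start at or after the end of the previous hit
    for motif in motifs:
        m = re.compile(motif).search(sequence, cursor)
        if m is None:
            hits.append(None)
        else:
            hits.append((m.start(), m.end()))
            cursor = m.end()
    return hits


def fraction_matched(hits: List[Optional[Tuple[int, int]]]) -> float:
    """Share of the fingerprint's motifs that were found."""
    return sum(h is not None for h in hits) / len(hits)


# Hypothetical three-motif fingerprint and two made-up sequences.
FINGERPRINT = ["G[DE]SGG", "HW", "CK.G"]
FULL_MATCH = "MAGDSGGPLLHWAATTCKNGQ"
PARTIAL_MATCH = "MAGDSGGPLLAATTCKNGQ"

if __name__ == "__main__":
    for seq in (FULL_MATCH, PARTIAL_MATCH):
        hits = match_fingerprint(seq, FINGERPRINT)
        print(hits, fraction_matched(hits))
```

Reporting partial matches separately from full ones is useful because a sequence that carries only some of the motifs may still be a divergent member of the family.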

BLOCKS: A database of blocks

Blocks: ungapped multiple alignments derived from the most conserved, ungapped regions of homologous protein sequences.

The blocks, which are usually longer than motifs, are subsequently converted to PSSMs.

Because blocks often encompass motifs, the functional annotation of blocks is thus consistent with that for the motifs. /blocks

Tools for detecting and identifying protein motifs include BLOCK search, Get Blocks, and Block Maker.

A query sequence can be aligned with the precomputed profiles in the database to select the highest scored matches.
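The conversion of a block into a position-specific scoring matrix (PSSM), and the scoring of a query against it, can be sketched as below. This is a minimal illustration under stated assumptions: a uniform background frequency, a simple additive pseudocount, and a made-up four-row block. The BLOCKS server uses its own sequence weighting and scoring, which this sketch does not attempt to reproduce.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(block: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build a log-odds PSSM (log2) from an ungapped block, one dict per column."""
    width = len(block[0])
    if any(len(row) != width for row in block):
        raise ValueError("a block is ungapped, so all rows must have equal length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(width):
        counts = Counter(row[col] for row in block)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_window(query: str, pssm: List[Dict[str, float]]) -> Tuple[int, float]:
    """Slide the PSSM along the query (no gaps) and return (start, score) of the best window."""
    width = len(pssm)
    if len(query) < width:
        raise ValueError("query is shorter than the block")
    scored = [
        (sum(pssm[i][query[start + i]] for i in range(width)), start)
        for start in range(len(query) - width + 1)
    ]
    score, start = max(scored)
    return start, score


# Made-up block and query for illustration only.
BLOCK = ["GDSGG", "GDSGG", "GESGG", "GDAGG"]
QUERY = "MKTAGDSGGPLV"

if __name__ == "__main__":
    start, score = best_window(QUERY, build_pssm(BLOCK))
    print(start, round(score, 2), QUERY[start:start + len(BLOCK[0])])
```

Searching a database of blocks amounts to repeating `best_window` for every stored PSSM and ranking the results by score.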

ProDom: Domain database

ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases.

The domains are built using recursive iterations of PSI-BLAST. /prodom/current/html/

It provides similarity searches and multiple sequence alignments of related domains from SWISSPROT.

Pfam (Protein families database of alignments and HMMs)

A database with protein domain alignments derived from sequences in SWISSPROT and TrEMBL. Each motif or domain is represented by an HMM profile generated from the seed alignment of a number of conserved homologous proteins. /

The Pfam database is composed of two parts:

Pfam-A involves manual alignments.

Pfam-B uses automatic alignment in a way similar to ProDom (PSI-BLAST).

The functional annotation of motifs in Pfam-A is often related to that in PROSITE. Pfam-B only contains sequence families not covered in Pfam-A.

Because it is generated automatically, Pfam-B has much larger coverage but is also more error prone, since some HMMs are generated from unrelated sequences.

SMART (Simple Modular Architecture Research Tool):

Contains HMM profiles constructed from manually refined protein domain alignments. /

Alignments in the database are built based on tertiary structures whenever available or based on PSI-BLAST profiles.

Alignments are further checked and refined by human annotators before HMM profile construction.

Protein functions are also manually curated.

The database may be of better quality than Pfam with more extensive functional annotations.

Compared to Pfam, the SMART database contains an independent collection of HMMs, with emphasis on signaling, extracellular, and chromatin-associated motifs and domains.

Sequence searching in this database produces a graphical output of domains with well-annotated information with respect to cellular localization, functional sites, superfamily, and tertiary structure.

InterPro: An integrated pattern database /interpro/

The database integrates information from the PROSITE, Pfam, PRINTS, ProDom, and SMART databases.

The sequence patterns from the five databases are further processed. Only overlapping motifs and domains in a protein sequence derived by all five databases are included.

A popular feature of this database is a graphical output that summarizes motif matches and has links to more detailed information.

CDD (Conserved Domain Database)

A collection of multiple sequence alignments for ancient domains and full-length proteins. /Structure/cdd/

The CD-Search service may be used to identify the conserved domains present in a protein query sequence: /Structure/cdd/

RPS-BLAST (Reverse PSI-BLAST) is the search tool used in the CD-Search service.

It uses a query sequence to search against a pre-computed profile database generated by PSI-BLAST.

The role of the PSSM has changed from "query" to "subject", hence the term "reverse" in RPS-BLAST.

It performs only one iteration of regular BLAST searching against a database of PSI-BLAST profiles to find the high-scoring gapped matches.

CDART (Conserved Domain Architecture Retrieval Tool):

A domain search program /BLAST/

Combines the results from RPS-BLAST, SMART, and Pfam.

The resulting domain architecture of a query sequence can be graphically presented along with related sequences.

CDART is not a substitute for individual database searches because it often misses certain features that can be found in SMART and Pfam.

1.2 Protein family databases

COG (Clusters of Orthologous Groups):

A protein family database based on phylogenetic classification. /COG/

It is constructed by comparing protein sequences encoded in completely sequenced genomes.

Unicellular clusters: searched with the COGnitor program

Eukaryotic clusters: searched with KOGnitor

A query sequence can be assigned a function if it has significant similarity matches with any member of the cluster.

ProtoNet:

A database of clusters of homologous proteins similar to COG. /

Orthologous protein sequences in the SWISSPROT database are clustered based on pairwise sequence comparisons between all possible protein pairs using BLAST.

Protein relatedness is defined by the E-values from the BLAST alignments.

A query protein sequence can be submitted to the server for cluster identification and functional annotation.
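Grouping proteins by pairwise E-values can be illustrated with a simple single-linkage scheme: any two proteins whose E-value falls at or below a cutoff end up in the same cluster, directly or through intermediates. This is a sketch under assumptions, not ProtoNet's actual clustering procedure; the protein identifiers, the E-values, and the `1e-5` cutoff are all made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def cluster_by_evalue(
    proteins: List[str],
    hits: List[Tuple[str, str, float]],
    cutoff: float = 1e-5,
) -> List[Set[str]]:
    """Single-linkage clustering of proteins using a union-find structure."""
    parent: Dict[str, str] = {p: p for p in proteins}

    def find(p: str) -> str:
        # Follow parent links to the cluster representative, shortening paths on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, evalue in hits:
        if evalue <= cutoff:  # a smaller E-value means a more significant match
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for p in proteins:
        clusters.setdefault(find(p), set()).add(p)
    # Sort for a stable, readable output order.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical identifiers and E-values.
PROTEINS = ["P1", "P2", "P3", "P4", "P5"]
HITS = [
    ("P1", "P2", 1e-30),
    ("P2", "P3", 1e-8),
    ("P3", "P4", 0.5),  # not significant, so it does not join the clusters
    ("P4", "P5", 1e-12),
]

if __name__ == "__main__":
    print(cluster_by_evalue(PROTEINS, HITS))
```

Single linkage is the simplest choice, but it can chain loosely related proteins together through intermediates, so the cutoff has a strong effect on cluster size.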

1.3 Protein structure databases

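PDB (Protein Data Bank)

PDB contains three-dimensional structures of biological macromolecules determined experimentally (X-ray crystallography, nuclear magnetic resonance, NMR):

Proteins

Nucleic acids

Carbohydrates

Other complexes

/pdb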

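SCOP (Structural Classification of Proteins), a protein structure classification database

Provides a detailed description of the structural and evolutionary relationships among proteins of known structure, including all entries in the PDB protein structure database. /scop/

Besides structural and evolutionary relationships, SCOP also includes the following for each protein: links to PDB, the sequence, literature references, structure images, and so on.

Proteins are classified by structural and evolutionary relationship. The result is a hierarchical tree whose main levels are family, superfamily, and fold:

Family: a clear evolutionary relationship

Superfamily: a distant evolutionary relationship, with a common evolutionary origin

Fold: major structural similarity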

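DSSP (protein secondary structure database)

For any protein in the PDB macromolecular database, DSSP derives the corresponding secondary structure from its three-dimensional structure. /dssp/

It is very useful for studying how protein sequence relates to secondary and three-dimensional structure.

Besides secondary structure, DSSP also includes the geometric features of the protein and its solvent accessibility.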

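HSSP (database of homologous protein sequence alignments)

A secondary database /hssp/

Its data come from PDB and from SWISS-PROT.

For each protein in PDB, HSSP aligns all protein sequences homologous to it, thereby grouping proteins of similar sequence into structurally homologous families.

HSSP helps in analyzing conserved regions of proteins, studying evolutionary relationships among proteins, and designing protein molecules.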

1.4 Other biological macromolecule databases

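MMDB (Molecular Modeling Database)

MMDB is part of (NCBI) Entrez. It contains experimentally determined structures of biological macromolecules. /entrez/?db=Structure

Compared with PDB, MMDB carries a good deal of additional information for each macromolecular structure, such as the molecule's biological function, the mechanism behind that function, and the molecule's evolutionary history.

It also provides tools for displaying three-dimensional structure models, for structure analysis, and for structure comparison.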

dbSNP (single nucleotide polymorphism database)

/entrez/?db=snp

OMIM (Online Mendelian Inheritance in Man)

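A catalog database of human genes and genetic disorders.

It collects known human genes and the genetic diseases caused by mutation or deletion of those genes.

/entrez/?db=OMIM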

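EPD (Eukaryotic Promoter Database)

/

Provides promoter sequences of eukaryotic genes taken from EMBL. Its goal is to help experimental researchers and bioinformatics researchers analyze the transcription signals of eukaryotic genes.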

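TRRD (Transcription Regulatory Regions Database)

An integrated database of gene regulation information.

It collects information on the structure and function of the transcription regulatory regions of eukaryotic genes.

Each TRRD entry corresponds to one gene and contains the various structural and functional features of that gene.

/mgs/gnw/trrd/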

2 Protein function prediction

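Protein structure and function have been studied for a long time. Because proteins are complex, predicting their structure and function is difficult in terms of both methodology and underlying theory.

General procedure for protein function prediction:

Database homology search: predict function from homology information

Is the unknown protein sequence (structure) similar to the sequence (structure) of a protein of known function?

Predict function from sequence features

Many properties of a protein can be obtained directly by analyzing its sequence. Hydrophobicity, for example, can be used to predict whether a sequence is a transmembrane helix or a leader sequence.

Motif or domain search: determine function by comparison against motif or domain databases

If the unknown protein contains a conserved motif or domain, it is inferred to have the function associated with that motif or domain.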
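The sequence-feature step can be illustrated with a sliding-window hydropathy profile, a common way to flag candidate transmembrane helices from sequence alone. The sketch below uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a mean-hydropathy threshold of 1.6; these settings are conventional choices assumed here, not parameters given in the text, and the test sequence is made up. Dedicated transmembrane predictors are considerably more elaborate.

```python
from typing import List

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_windows(sequence: str, window: int = 19, threshold: float = 1.6) -> List[int]:
    """Start positions (0-based) of windows hydrophobic enough to be helix candidates."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


# Made-up sequence: a 19-residue hydrophobic stretch flanked by charged residues.
TEST_SEQUENCE = "DDKKEE" + "LLIVALLIVALLIVALLIV" + "DDKKEE"

if __name__ == "__main__":
    print(candidate_tm_windows(TEST_SEQUENCE))
```

A window of about 19 residues is used because that is roughly the length of an alpha helix needed to span a lipid bilayer; shorter windows are more sensitive to local hydrophobic patches but less specific.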


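This article was last updated on 2021-01-29 09:54. It was provided by the author and does not represent the views of this website. When reposting, please cite the source: https://www.bjmy2z.cn/gaokao/584858.html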
