Bioinformatics Teaching Materials: Commonly Used Bioinformatics Databases.ppt
Bioinformatics Method and Practice

Primary databases: the data come directly from raw experimental results and have only been sorted, organized, and annotated in a simple way.
Secondary databases: the result of curating and classifying raw biomolecular data. They are built on primary databases, experimental data, and theoretical analysis, each for a specific application.

Commonly Used Bioinformatics Databases

1. Nucleotide Sequence Databases

(1) GenBank, National Center for Biotechnology Information (NCBI), USA: http://www.ncbi.nlm.nih.gov/Web/Genbank/index.html
(2) EMBL, European Molecular Biology Laboratory: http://www.embl-heidelberg.de
(3) DDBJ, National Institute of Genetics, Japan: http://www.ddbj.nig.ac.jp/

The data in GenBank, DDBJ, and EMBL are essentially the same and differ only in format. For a given query, the three databases return the same results.

GenBank: construction began in 1979; in operation since 1982.

Submissions to GenBank
Many journals require submission of sequence information to a database prior to publication so that an accession number may appear in the paper. NCBI has a WWW form, called BankIt, for convenient and quick submission of sequence data. Sequin, NCBI's stand-alone submission software for Mac, PC, and UNIX platforms, is also available by FTP. When using Sequin, the output files for direct submission should be sent to GenBank by electronic mail. There are specialized, streamlined procedures for batch submissions of sequences, such as EST, STS, and HTG sequences.

Updating or Revising a Sequence
Revisions or updates to GenBank entries can be made at any time and can be accepted as BankIt or Sequin files or as the text of an e-mail message.

Access to GenBank
GenBank is available for searching at NCBI via several methods. The GenBank database is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive DNA sequence information. Therefore, NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. NCBI is not in a position to assess the validity of such claims, and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information contained in GenBank.

New Developments
NCBI is continuously developing new tools and enhancing existing ones to improve both submission and access to GenBank. The easiest way to keep abreast of these and other developments is to check the "What's New" section of the NCBI Web page and to read the NCBI News, which is also available by free subscription.

EMBL: in operation since 1982.
http://www.ebi.ac.uk/embl/index.html

DDBJ: established in 1984; in service since 1987.

2. Genome Databases

Mouse: http://www.informatics.jax.org/mgd.html
Rat: http://ratmap.gen.gu.se
Dog: http://mendel.berkeley.edu/dog.html
Cow: http://locus.jouy.inra.fr/cgi-bin/bovmap/intro2.pl
Pig: http://www.ri.bbsrc.ac.uk/pigmap/pigbase/pigbase.html
Sheep: http://dirk.invermay.cri.nz
Chicken: http://www.ri.bbsrc.ac.uk/chickmap/chickbase/manager.html
Zebrafish: http://zfish.uoregon.edu
Nematode (C. elegans): http://www.ddbj.nig.ac.jp/htmls/celegans/html/CE_INDEX.html
Fruit fly (Drosophila): http://morgan.harvard.edu
Mosquito: http://klab.agsci.colostate.edu
Arabidopsis: http://genome-www.stanford.edu/Arabidopsis
Cotton: http://algodon.tamu.edu
Maize: http://www.agron.missouri.edu
Rice: http://www.staff.or.jp
Soybean: http://mendel.agron.iastate.edu:8000/main.html
Trees (poplar): http://s27w007.pswfs.gov

[Figure: Model organisms. Labels: human, mouse, rat, Arabidopsis, Drosophila melanogaster, Caenorhabditis elegans, Plasmodium falciparum, Escherichia coli, Bacillus subtilis, Thermotoga maritima, Buchnera sp. APS, Rickettsia prowazekii, Ureaplasma urealyticum, Thermoplasma acidophilum, Helicobacter pylori, Borrelia burgdorferi, Aquifex aeolicus, Neisseria meningitidis Z2491, Mycobacterium tuberculosis]

Model organism databases

Escherichia coli
  E. coli Genome Center (University of Wisconsin, USA)
  The E. coli index (University of Birmingham, UK)
S. cerevisiae (baker's yeast)
  SGD (Yeast genome database at Stanford, USA)
  CYGD (MIPS Comprehensive Yeast Genome Database, Neuherberg, Germany)
Arabidopsis thaliana
  MATDB (MIPS A. thaliana database, Munich, Germany)
  TAIR (The Arabidopsis Information Resource, previously AtDB, at Stanford, USA)
  KAOS (Kazusa Arabidopsis data Opening Site at Kazusa DNA Research Institute, Japan)
  Arabidopsis Genome Analysis (at Cold Spring Harbor Laboratory, USA)
  TIGR Arabidopsis thaliana Database (TIGR, Rockville MD, USA)
Oryza sativa (rice)
  RGP (Rice Genome Research Program, Japan)
  Gramene (comparative mapping resource for grains)
  INE (Integrated Rice Genome Explorer: common database of the International Rice Genome Sequencing Project, IRGSP, Japan)

Model organism databases (continued)

Caenorhabditis elegans
  WormBase (C. elegans database at Cold Spring Harbor Laboratory, USA)
Drosophila melanogaster (fruit fly)
  FlyBase (Drosophila genome database)
  BDGP (Berkeley Drosophila Genome Project)
Danio rerio (zebrafish)
  ZFIN (Zebrafish Information Network at University of Oregon, USA)
  WashU-Zebrafish Genome Resources (zebrafish EST database at Washington University, USA)
Mus musculus (mouse)
  MGI (Mouse Genome Informatics)
Homo sapiens
  GDB (The Human Genome Database, Toronto, Canada)
  HIB (HumanInfoBase of annotated UniGene clusters, that is, putative human gene transcripts, at MIPS, Germany)
  Human genome resources (at NCBI, USA)
  Human genome browser (at the University of California Santa Cruz, USA)
  HGP (Human Genome Project at the Sanger Institute, Cambridge, UK)
  GeneLinks (portal to hyperlinks for each human gene at the Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden)

Prokaryotes include:
  Escherichia coli (E. coli): this common Gram-negative gut bacterium is the most widely used organism in molecular genetics.
  Bacillus subtilis: an endospore-forming Gram-positive bacterium.

[Table: model genetic organisms]

The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps. The database is organized in six major organism groups (Archaea, Bacteria, Eukaryotae, Viruses, Viroids, and Plasmids) and includes complete chromosomes, organelles, and plasmids as well as draft genome assemblies.

[Figure: Genome sizes in nucleotide pairs (base pairs), on a scale from 10^4 to 10^11, for viruses, plasmids, bacteria, fungi, plants, algae, insects, mollusks, bony fish, amphibians, reptiles, birds, and mammals]

The size of the human genome is 3 × 10^9 bp; almost all of its complexity is in single-copy DNA. The human genome is thought to contain 20,000 to 30,000 genes.

Model Organisms

Escherichia coli: E. coli is the most thoroughly studied model organism. This fast-reproducing, single-celled prokaryote is only 1.6 µm long and has become an important tool in the laboratory and in genetic engineering. Strains shown: Escherichia coli O157:H7 and Escherichia coli K12.

Brewer's yeast (Saccharomyces cerevisiae): 16 chromosomes; whole genome sequenced in 1996.

Nematode (Caenorhabditis elegans): the adult hermaphrodite has only 959 cells, including 302 neurons; 6 chromosomes; whole genome sequenced in 1998, about 97 Mb long.

Fruit fly (Drosophila): reproduces very quickly; genome size 180 Mb.

Arabidopsis: a small plant of the mustard family (Brassicaceae) with a life cycle of only 6 weeks; an ideal model plant.

African clawed frog (Xenopus laevis): within 24 hours a fertilized egg divides and develops to the point where the organs begin to take shape.

Zebrafish (Danio rerio): a small fish with a transparent body and a life cycle of about 3 months; a good subject for studying vertebrate development.

Mouse (Mus musculus): genome size close to that of humans; 19 pairs of autosomes.

BLAST

Basic Local Alignment Search Tool
NCBI BLAST service: http://blast.ncbi.nlm.nih.gov/
NCBI BLAST program downloads: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/
NCBI BLAST database downloads: ftp://ftp.ncbi.nlm.nih.gov/blast/db/

[Diagram: query sequence (amino acid sequence or DNA sequence) and the BLAST programs tBLASTx, BLASTx, BLASTn, tBLASTn, BLASTp; label: Nucleotide]
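
The BLAST diagram pairs the type of the query sequence with the type of database searched. As a rough illustration, the standard pairing can be written as a small lookup. This is only a sketch: the function name choose_blast_program, the labels "nucleotide" and "protein", and the translated flag are assumptions made for this example and are not part of any NCBI tool.

```python
# Standard pairing of query type and database type with a BLAST program.
# Keys are (query type, database type).
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",  # DNA query vs. DNA database
    ("protein", "protein"): "blastp",        # protein query vs. protein database
    ("nucleotide", "protein"): "blastx",     # translated DNA query vs. protein database
    ("protein", "nucleotide"): "tblastn",    # protein query vs. translated DNA database
}


def choose_blast_program(query_type: str, db_type: str, translated: bool = False) -> str:
    """Return the BLAST program name for a query/database pairing.

    translated=True selects tblastx, which compares a translated DNA query
    with a translated DNA database (hypothetical flag for this sketch).
    """
    key = (query_type.lower(), db_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unknown sequence types: {query_type!r}, {db_type!r}")
    if translated:
        if key != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and a nucleotide database")
        return "tblastx"
    return BLAST_PROGRAMS[key]


if __name__ == "__main__":
    print(choose_blast_program("nucleotide", "protein"))
    print(choose_blast_program("nucleotide", "nucleotide", translated=True))
```

A lookup table keeps the five programs visible at a glance, and tblastx is handled separately because it shares its query and database types with blastn.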