Bioinformatics Links

on the World Wide Web

 

 

International Nucleotide Sequence Database

  • NCBI - National Center for Biotechnology Information (GenBank database and other resources)
  • EBI - European Bioinformatics Institute (EMBL database and other resources)
  • DDBJ - DNA Data Bank of Japan
  • Feature Table Definition - the format of entries in these databases

 

 

Protein Sequence Databases

  • SWISS-PROT & TrEMBL - annotated protein sequence database and computer annotated supplement
  • PIR - Protein Information Resource
  • MIPS - Munich Information centre for Protein Sequences

 

 

Database Searching by Sequence Similarity

 

 

Sequence Alignment


  • USC Sequence Alignment Server - align 2 sequences with all possible varieties of dynamic programming
  • T-COFFEE - multiple sequence alignment
  • ClustalW @ EBI - multiple sequence alignment
  • MSA 2.1 - optimal multiple sequence alignment using the Carrillo-Lipman method
  • BOXSHADE - pretty printing and shading of multiple alignments
  • SIM4 - a program to align cDNA and genomic DNA
  • Wise2 - align a protein or profile HMM against genomic sequence to predict a gene structure, and related tools
  • PipMaker - computes alignments of similar regions in two (long) DNA sequences
  • VISTA - align and detect conserved regions in long genomic sequences

 

 

Human Genome Databases

 

 

Databases for Other Organisms

 

 

Protein Domain Families: Databases and Search Tools

  • InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL
  • PROSITE - database of protein families and domains
  • Pfam - alignments and hidden Markov models covering many common protein domains
  • SMART - analysis of domains in proteins
  • ProDom - protein domain database
  • PRINTS Database - groups of conserved motifs used to characterise protein families
  • Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins
  • TIGRFAMs - yet more protein families based on hidden Markov models

 

 

Motif and Pattern Finding in Sequences

  • Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences
  • AlignACE Homepage - gene regulatory motif finding
  • MEME - motif discovery and search in protein and DNA sequences
  • SAM - tools for creating and using hidden Markov models
  • Pratt - discover patterns in unaligned protein sequences

 

 

Protein 3D Structure Resources

 

 

Gene Prediction

 

 

Gene Regulation Resources

  • TRANSFAC - database of eukaryotic cis-acting regulatory DNA elements and trans-acting factors
  • EPD - eukaryotic promoter database
  • SCPD - Saccharomyces cerevisiae promoter database
  • RegulonDB - a database on transcriptional regulation in E. coli
  • DPInteract - protein binding sites on E. coli DNA
  • PromoterInspector - prediction of promoter regions in mammalian genomic sequences

 

 

Metabolic, Gene Regulatory & Signal Transduction Network Databases
  • KEGG - Kyoto Encyclopedia of Genes and Genomes
  • BioCarta
  • STKE - Signal Transduction Knowledge Environment
  • BIND - Biomolecular Interaction Network Database
  • EcoCyc
  • WIT
  • SPAD - Signaling Pathway Database
  • CSNDB - Cell Signalling Networks Database
  • PathDB
  • Transpath
  • DIP - Database of Interacting Proteins
  • PFBP - Protein Function and Biochemical Networks

 

 

5 Big Genome Sequencing Centers

 

 

 

Phylogeny and Taxonomy