Accession number (bioinformatics)

An accession number in bioinformatics is a unique identifier given to a DNA or protein sequence record so that different versions of that record, and of the associated sequence, can be tracked over time within a single data repository. Because accession numbers are relatively stable, they can be used as foreign keys that refer to a sequence object, though not necessarily to a unique sequence. All sequence information repositories implement the concept of an "accession number", but they may do so with subtle variations.

Accession numbers in specific data resources

UniProt (SwissProt) Knowledgebase

In UniProt documentation, the stated role of the accession number is "to provide a stable way of identifying entries from release to release." One entry (or record) might be associated with multiple accession numbers. Thus, in UniProt, there is no specific relationship between accession number and sequence; the primary relationship is between accession number and knowledgebase record, and a single knowledgebase record can refer to multiple sequences. In the flat-file version of the data, AC is the line code that marks the accession numbers: the first value is the "primary accession number" and all subsequent values are "secondary accession numbers". The proper key field for tracking a UniProt record is the primary accession number. The group of accession numbers associated with a knowledgebase record depends on the history of the record with respect to mergers and splits. New accession numbers arise in two main ways: new sequences (common) and knowledgebase record splits (rare).[1]
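
The distinction between primary and secondary accession numbers can be illustrated with a minimal Python sketch. It assumes the usual flat-file layout, in which each AC line starts with the two-letter line code and lists accession numbers separated and terminated by semicolons, and in which an entry may carry more than one AC line. The function name `parse_accessions` and the sample entry are hypothetical; the accession values are placeholders and are not meant to correspond to real records.

```python
def parse_accessions(entry_text):
    """Return (primary, secondaries) from the AC lines of one flat-file entry.

    Assumes AC lines look like 'AC   X00001; X00002;' and may repeat.
    """
    accessions = []
    for line in entry_text.splitlines():
        if line.startswith("AC "):
            # Drop the two-letter line code, then split on semicolons.
            for token in line[2:].split(";"):
                token = token.strip()
                if token:
                    accessions.append(token)
    if not accessions:
        raise ValueError("entry has no AC line")
    # The first accession listed is the primary one; the rest are secondary.
    return accessions[0], accessions[1:]


# Hypothetical entry fragment; all values are illustrative placeholders.
example_entry = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
AC   P00001; Q00002;
AC   Q00003;
DE   RecName: Full=Example protein;
//
"""

primary, secondary = parse_accessions(example_entry)
print(primary)
print(secondary)
```

Keying a local table on `primary` rather than on any secondary value follows the recommendation above, while the secondary list can still be kept to resolve older references to merged or split records.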

GenBank

EMBL

DDBJ

Commonly encountered accession numbers

* [http://www.pir.uniprot.org/database/knowledgebase.shtml UniProt ID]
* Unified UniProt Accession
* [http://ca.expasy.org/ UniProt-SwissProt Accession]
* [http://ca.expasy.org/ UniProt-SwissProt ID]
* Unified UniProt ID
* [http://www.ncbi.nlm.nih.gov/RefSeq/ RefSeq DNA ID]
* [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene Entrez Gene ID]
* [http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi CCDS ID]
* Vega Translation ID
* [http://vega.sanger.ac.uk/index.html Vega Transcript ID]
* [http://vega.sanger.ac.uk/index.html Vega Peptide ID]
* [http://vega.sanger.ac.uk/index.html Vega Gene ID]
* [http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl HUGO ID]
* [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM MIM ID]

Notes and references

# Amos Bairoch, Rolf Apweiler, Cathy H. Wu. "User Manual". UniProt Knowledgebase. http://www.expasy.org/sprot/userman.html#AC_line. Retrieved October 20, 2005.


Wikimedia Foundation. 2010.
