Annotating a genome means determining which genes code for which proteins.
Worldwide databases list the gene sequences of all organisms studied so far, and equivalent databases exist for proteins. The databases available include GenPept at the NCBI (USA), the EMBL in the U.K., etc. All these banks contain about the same information, because new data are transferred from one to the other; together they form what is called a non-redundant bank.
When a scientist obtains a new sequence, they submit it to the database to see whether it is homologous to one or several known sequences already listed there. This annotation is done using bioinformatics.
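To give a feel for what such a database query does, here is a minimal sketch in Python. It is not how the real banks work: actual searches rely on dedicated alignment tools such as BLAST, whereas this toy version simply counts shared three-letter words between sequences. The database contents, the names protein_A to protein_C, the function names and the 0.3 threshold are all assumptions made up for illustration.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping k-letter words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(query, subject, k=3):
    """Jaccard similarity of the two k-mer sets, from 0.0 to 1.0."""
    a, b = kmers(query, k), kmers(subject, k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_hits(query, database, threshold=0.3):
    """Return (score, name) pairs at or above the threshold, best first."""
    hits = [(similarity(query, seq), name) for name, seq in database.items()]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Hypothetical toy database: invented names and sequences, not real entries.
TOY_DATABASE = {
    "protein_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "protein_B": "MKTAYIAKQRQISFVKAHFSRQ",  # one residue differs from protein_A
    "protein_C": "GSSHHHHHHSSGLVPRGSH",     # unrelated sequence
}

query = "MKTAYIAKQRQISFVKSHFSRQ"
for score, name in best_hits(query, TOY_DATABASE):
    print(f"{name}: {score:.2f}")
```

A high score only suggests that two sequences may be related. Deciding whether the similarity reflects true homology, and therefore a shared function, is exactly the judgment that the annotation steps described here have to make.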
Swiss-Prot in Switzerland is another bank, with the advantage that all the data it contains are manually verified or annotated. It contains far less data, but it is much more reliable. In other banks, annotation is left to the scientists who deposit the data, which means practically any scientist in the world, not necessarily an expert in this domain.
Automatic annotation of a genome does not always give optimal results, however, because it is a rapid statistical analysis of data that are not always verified. The genome must therefore also be annotated manually by an expert.
Annotating a genome manually means redoing all the steps done automatically (listed in the read more section) and checking each successive result to see whether everything is coherent and whether the final conclusion is too hasty. Swiss-Prot can be queried a second time, for example, as it is a quite reliable bank.
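The cross-checking step can be pictured with a small Python sketch. It assumes, purely for illustration, that the automatic result and the entries of a manually curated bank are both available as simple gene-to-protein tables; the gene names, the labels and the function review_annotations are hypothetical and do not come from any real tool.

```python
def review_annotations(automatic, curated):
    """Compare automatic labels with curated ones and report a status per gene."""
    report = {}
    for gene, auto_label in automatic.items():
        curated_label = curated.get(gene)
        if curated_label is None:
            # No curated entry: the automatic label stays an unverified suggestion.
            report[gene] = "unverified"
        elif curated_label == auto_label:
            report[gene] = "consistent"
        else:
            # The two sources disagree: an expert should look at this gene.
            report[gene] = "conflict"
    return report


# Hypothetical example data, invented for this sketch.
automatic = {
    "gene1": "DNA polymerase",
    "gene2": "kinase",
    "gene3": "hypothetical protein",
}
curated = {
    "gene1": "DNA polymerase",
    "gene2": "phosphatase",
}

for gene, status in review_annotations(automatic, curated).items():
    print(gene, status)
```

In practice an expert weighs much more than an exact match of labels, but the idea is the same: every automatic result is compared with a more reliable source, and any disagreement is examined by hand.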
Ultimately, both automatic and manual annotations are only suggestions, but manual annotation is a better suggestion of which proteins the genes of the genome code for.
Biochemists, molecular biologists and microbiologists can then say whether the annotation is correct. They can demonstrate the link between gene and protein physically, for instance by overexpressing the genes in Escherichia coli (see genetics).
Contributed by Stephanie Ries