Our genome contains the genes that code for the proteins that carry out most biological functions, as well as a lot of other sequence. Some of that other sequence serves various regulatory roles, and a lot of it has unclear or unknown functions. Protein-coding genes also carry extra sequence; when a gene is used to make a protein, it's first transcribed into a molecule called RNA, and the portions that aren't needed to make the protein, called introns, are removed. This process is outlined in the video. Scientists have now found that many genes that play roles in the function of the brain have particularly lengthy introns. The findings have been reported in PLOS One.
Scientists have proposed various explanations for the existence of introns. Introns allow the cellular machinery to splice the transcript of a single gene in different ways, so that different proteins can be made from the same genomic sequence. Introns also allow cells to shuffle exons, the portions of genomic sequence that code for protein, in a process called exon shuffling. Exon shuffling can bring in foreign genetic material as well, potentially creating a totally new gene.
Because genetic and computational technologies have advanced so rapidly, researchers now have access to vast amounts of data on the sequences of whole genomes from various organisms. This has enabled scientists to analyze introns closely. Introns are classified by where they interrupt the protein-coding sequence, in which every three letters, a triplet called a codon, code for one amino acid of a protein. Phase-0 introns fall before the first nucleotide base of a codon (that is, between two codons), phase-1 introns fall after the first nucleotide, and phase-2 introns fall after the second nucleotide of a codon.
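The phase definition above comes down to simple arithmetic: count the coding nucleotides that lie upstream of an intron and take the remainder after dividing by three. The short Python sketch below illustrates that idea. It is not the method used in the study; the function name `intron_phases` and the toy exon lengths are hypothetical, and the lengths are assumed to count only protein-coding nucleotides (no untranslated regions).

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1, or 2) of each intron between consecutive coding exons.

    coding_exon_lengths: coding nucleotides contributed by each exon, in
    transcript order (an assumption of this sketch: UTR bases are excluded).
    """
    # Running total of coding nucleotides upstream of each intron.
    # The last exon is dropped because no intron follows it.
    upstream = accumulate(coding_exon_lengths[:-1])
    # Remainder 0: intron sits between codons; 1: after the first
    # nucleotide of a codon; 2: after the second nucleotide.
    return [n % 3 for n in upstream]


# Hypothetical toy gene with four coding exons (not real data).
example_lengths = [99, 100, 52, 61]
example_phases = intron_phases(example_lengths)
print(example_phases)  # → [0, 1, 2]
```

In this toy gene, the first intron follows 99 coding nucleotides (33 complete codons), so it is phase 0; the second follows 199 and the third follows 251, giving phases 1 and 2.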
Bioinformaticists at the Moscow Institute of Physics and Technology (MIPT) and the Institute of Mathematical Problems of Biology, RAS investigated how intron phase was linked to intron length.
"No one had thought of investigating a potential link between intron length and phase before us. Common sense says there shouldn't be any connection at all, similarly to how a person's height has nothing to do with their eye color," noted Eugene Baulin, a researcher at the Applied Mathematics Lab at IMPB RAS, and the Algorithms and Programming Technologies Department at MIPT.
The investigators found, however, that one group of genes contained a large number of phase-1 introns more than 50,000 nucleotides in length. These genes are involved in the transmission of nerve impulses in the brain.
Further study revealed that phase-1 introns were common in genes whose proteins begin with a particular sequence of amino acids. This 'signal peptide' appears to send the protein where it needs to go to do its job. In nerve cells, these proteins get sent to the membrane. Because the signal peptide occurs at the start of the protein, the DNA sequence encoding it lies at the start of the gene, where introns tend to be long. These long introns contain regulatory information that's needed to synthesize the protein.
The study may also help show how phase-1 introns are involved in exon shuffling. "That mechanism speeds up the evolution of intercellular and membrane proteins in animals, particularly the younger ones [in evolutionary terms], and these are the proteins that enable nerve impulse transmission in brain cells," Baulin added.
Sources: AAAS/Eurekalert! via MIPT, PLOS One