The genetic code in biology: a way of recording hereditary information

The genetic code is usually understood as a system of symbols describing the order of nucleotides in DNA and RNA, which corresponds to a second system of symbols describing the order of amino acids in a protein molecule.

This is important!

When scientists studied the properties of the genetic code, universality was recognized as one of the most important. Strange as it may sound, all living things share one common genetic code. It took shape over a long period of time, and the process is thought to have ended about 3.5 billion years ago. Traces of that evolution, from the code's inception to the present day, can therefore be found in its structure.

When we talk about the sequence of elements in the genetic code, we mean that it is far from chaotic and follows a strictly defined order. This order largely determines the properties of the genetic code. It is comparable to the arrangement of letters and syllables in words: break the usual order, and most of what we read in books or newspapers turns into gobbledygook.

Basic properties of the genetic code

A code usually carries information encrypted in a particular way. To decipher it, you need to know its distinctive features.

So, the main properties of the genetic code are:

  • triplet nature;
  • degeneracy or redundancy;
  • unambiguity;
  • continuity;
  • the universality already mentioned above.

Let's take a closer look at each property.

1. Triplet nature

Three consecutive nucleotides in a nucleic acid molecule (DNA or RNA) form a triplet. Each triplet encodes one amino acid and thereby fixes its position in the peptide chain.

Codons (also called code words) differ in which nitrogenous bases (nucleotides) they contain and in the order of those bases.

Genetics distinguishes 64 codons. They are all the combinations of four types of nucleotide taken three at a time, which is 4 raised to the third power: 4^3 = 64.
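The count can be checked by enumerating the combinations directly. The following is a minimal Python sketch; the names BASES and all_codons are illustrative only, and the RNA alphabet (U rather than T) is an assumption, since the paragraph above applies equally to DNA.

```python
from itertools import product

# The four RNA bases; for DNA, replace "U" with "T".
BASES = "UCAG"

def all_codons(bases=BASES, length=3):
    """Return every ordered combination of `length` bases as a string."""
    return ["".join(combo) for combo in product(bases, repeat=length)]

codons = all_codons()
print(len(codons))   # number of distinct triplets
print(codons[:4])    # the first few triplets in enumeration order
```

Calling all_codons(length=2) gives only 16 combinations, which is why two-letter code words would not be enough for 20 amino acids.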

2. Redundancy of the genetic code

This property means that several codons, usually 2 to 6, can encode the same amino acid. Only tryptophan and methionine are each encoded by a single triplet.

3. Unambiguity

Each codon specifies exactly one amino acid, so a change in a codon can change the protein. This is why the property matters for healthy inheritance. For example, the GAA triplet in the sixth position of the hemoglobin beta chain encodes glutamic acid in normal hemoglobin. In sickle cell anemia one nucleotide of this triplet is replaced by another letter of the code, U, so that a different amino acid is inserted; this substitution is the signature of the disease.

4. Continuity

Codons, like links in a chain, follow one another directly in the nucleic acid, with no gaps or separators between them. Within a gene the sequence of codons is read without interruption.

5. Universality

All life on Earth is united by a common genetic code. In primates and humans, in insects and birds, in a hundred-year-old baobab tree and in a blade of grass that has barely emerged from the ground, the same triplets encode the same amino acids.

Genes contain the basic information about the properties of a particular organism: a kind of program that the organism inherits from its ancestors and that is written in the genetic code.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a particular order of nucleotides in DNA or RNA that forms codons corresponding to amino acids in a protein.

Properties of the genetic code.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also propose other properties of the code related to the chemical characteristics of the nucleotides included in the code or the frequency of occurrence of individual amino acids in the body’s proteins, etc. However, these properties follow from those listed above, so we will consider them there.

a. Triplet nature. The genetic code, like many complex systems, has a smallest structural unit and a smallest functional unit. The triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. The codon is the smallest functional unit; triplets of mRNA are usually called codons. A codon can perform one of several functions. Its main function is to encode a single amino acid. A codon may also encode no amino acid, in which case it performs another function (see below). In short, "triplet" names the elementary structural unit of the code (three nucleotides), whereas "codon" names the elementary semantic unit: three nucleotides that determine the attachment of one amino acid to the polypeptide chain.

The size of the elementary structural unit was first deduced theoretically and then confirmed experimentally. Twenty amino acids cannot be encoded by one or two nucleotides, because with only 4 kinds of nucleotide these give just 4 and 4^2 = 16 variants. Three nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).

The 64 nucleotide combinations presented in the table have two features. First, of the 64 triplets only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids and instead act as stop signals marking the end of translation. These three triplets, UAA, UAG and UGA, are also called "meaningless" (nonsense) codons. A mutation that replaces one nucleotide of a triplet with another can turn a sense codon into a nonsense codon; this type of mutation is called a nonsense mutation. If such a stop signal arises inside a gene (in its coding part), protein synthesis is interrupted at that point every time, and only the first part of the protein (up to the stop signal) is synthesized. A person with this pathology lacks the protein and develops symptoms associated with the deficiency. For example, a mutation of this kind has been identified in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and quickly destroyed, so the hemoglobin molecule that forms lacks the beta chain. Such a molecule cannot fully perform its function, and a serious disease of the hemolytic anemia type develops (beta-zero thalassemia, from the Greek word "thalassa", sea, referring to the Mediterranean, where the disease was first described).

The mechanism of action of stop codons differs from the mechanism of action of sense codons. This follows from the fact that for all codons encoding amino acids, corresponding tRNAs have been found. No tRNAs were found for nonsense codons. Consequently, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and, in bacteria, sometimes GUG) not only encodes an amino acid (methionine and valine, respectively) but also serves as the translation initiator.

b. Degeneracy or redundancy.

61 of the 64 triplets encode 20 amino acids. This threefold excess of triplets over amino acids means that two coding options were possible in principle. In the first, only 20 of the 64 codons would be used to encode the 20 amino acids; in the second, an amino acid could be encoded by several codons. Research has shown that nature used the latter option.

The advantage of this choice is obvious. If only 20 of the 64 triplets encoded amino acids, the remaining 44 would be non-coding, i.e. meaningless (nonsense codons). We pointed out above how dangerous it is for the cell when a mutation turns a coding triplet into a nonsense codon: it disrupts normal protein synthesis and ultimately leads to disease. At present three codons in our genome are nonsense; now imagine what would happen if the number of nonsense codons were about 15 times greater. Clearly, mutations turning normal codons into nonsense codons would then be far more frequent.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA and CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are each encoded by a single codon. The property of recording the same information with different symbols is called degeneracy.
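These counts can be read off the standard codon table. The sketch below is one possible way to do it in Python; the compact AMINO string (one-letter amino acid codes listed in U, C, A, G order, with "*" for a stop codon) and the helper names are conveniences introduced here, not notation from the text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def codons_for(amino_acid):
    """List the codons that encode the given one-letter amino acid code."""
    return sorted(c for c, a in CODON_TABLE.items() if a == amino_acid)

# Number of codons per amino acid, ignoring the three stop codons.
degeneracy = Counter(a for a in CODON_TABLE.values() if a != "*")

print(codons_for("L"))            # the six leucine codons
for aa in "LVFWM":                # Leu, Val, Phe, Trp, Met
    print(aa, degeneracy[aa])
```

The values in degeneracy sum to 61, the number of sense codons, spread over 20 amino acids.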

The number of codons designated for one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the likelihood of its damage by mutagenic factors. Therefore, it is clear that a mutated codon has a greater chance of encoding the same amino acid if it is highly degenerate. From this perspective, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Most of the information in a codon is carried by its first two nucleotides; the base in the third position is of little importance. This phenomenon is called "degeneracy of the third base", and it minimizes the effect of mutations.

Consider an example. The main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in the globin genes give rise to different hemoglobin variants. Most often a mutation replaces one nucleotide with another, producing a new codon that may encode a different amino acid in the hemoglobin polypeptide chain. Any nucleotide of a triplet can be replaced by mutation: the first, the second or the third.

Several hundred mutations are known that affect the integrity of the globin genes. About 400 of them involve the replacement of a single nucleotide in a gene and a corresponding amino acid replacement in the polypeptide. Of these, only 100 lead to hemoglobin instability and to diseases ranging from mild to very severe. The other 300 (approximately 64%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One reason is the "degeneracy of the third base" mentioned above: replacing the third nucleotide in a triplet encoding serine, leucine, proline, arginine and some other amino acids produces a synonymous codon encoding the same amino acid. Such a mutation does not manifest itself phenotypically. In contrast, replacing the first or second nucleotide of a triplet almost always produces a new hemoglobin variant. Even then there may be no severe phenotypic disorder, because the amino acid in hemoglobin may be replaced by another with similar physicochemical properties, for example one hydrophilic amino acid by another hydrophilic one.

Hemoglobin consists of the iron porphyrin group of heme (oxygen and carbon dioxide molecules attach to it) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two identical β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located in the short arm of chromosome 16, and the β-gene in the short arm of chromosome 11.

A substitution of the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to a new amino acid in the protein, disruption of hemoglobin function and severe consequences for the patient. For example, replacing "C" in one of the CAU triplets (histidine) with "U" produces a new triplet, UAU, which encodes another amino acid, tyrosine. Phenotypically this manifests itself as a serious illness: a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and methemoglobinemia develops. The replacement, as a result of mutation, of glutamic acid by valine at position 6 of the β-chain is the cause of a most severe disease, sickle cell anemia. Let us not continue the sad list.

Let us only note that when the first or second nucleotide is replaced, the new amino acid may have physicochemical properties similar to those of the previous one. Thus, replacing the 2nd nucleotide with "U" in one of the triplets encoding glutamic acid (GAA) in the β-chain produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with "A" forms the triplet AAA, encoding lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, replacing hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin and ultimately leads to sickle cell anemia, whereas replacing hydrophilic glutamic acid with hydrophilic lysine changes hemoglobin function to a lesser extent, and patients develop a mild form of anemia.

When the third base is replaced, the new triplet can encode the same amino acid as the previous one. For example, if uracil in the CAU triplet is replaced by cytosine, giving CAC, practically no phenotypic changes will be detected in humans. This is understandable, because both triplets code for the same amino acid, histidine.
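The substitutions discussed above can be replayed against the standard codon table. The following Python sketch is only an illustration; the helper names substitute and describe are hypothetical, and amino acids are shown by their one-letter codes (E glutamic acid, V valine, K lysine, H histidine, Y tyrosine).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def substitute(codon, position, base):
    """Return `codon` with the nucleotide at 1-based `position` replaced by `base`."""
    i = position - 1
    return codon[:i] + base + codon[i + 1:]

def describe(codon, position, base):
    """Return (mutant codon, original amino acid, new amino acid)."""
    mutant = substitute(codon, position, base)
    return mutant, CODON_TABLE[codon], CODON_TABLE[mutant]

examples = [
    ("GAA", 2, "U"),  # second base of a glutamic acid codon
    ("GAA", 1, "A"),  # first base of the same codon
    ("CAU", 1, "U"),  # first base of a histidine codon
    ("CAU", 3, "C"),  # third base of the same codon
]
for codon, pos, base in examples:
    mutant, before, after = describe(codon, pos, base)
    if after == "*":
        kind = "nonsense"
    elif after == before:
        kind = "synonymous"
    else:
        kind = "missense"
    print(f"{codon} -> {mutant}: {before} -> {after} ({kind})")
```

Only the last example, a change in the third position, leaves the amino acid unchanged.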

In conclusion, it is appropriate to emphasize that, from a general biological point of view, the degeneracy of the genetic code and the degeneracy of the third base are protective mechanisms that evolution has built into the structure of DNA and RNA.

c. Unambiguity.

Each triplet (except nonsense) encodes only one amino acid. Thus, in the direction codon - amino acid the genetic code is unambiguous, in the direction amino acid - codon it is ambiguous (degenerate).

Diagram: codon -> amino acid (unambiguous); amino acid -> codon (degenerate).

The need for unambiguity in the genetic code is obvious. Otherwise, different amino acids would be inserted into the protein chain when the same codon was translated, producing proteins with different primary structures and different functions. Cell metabolism would switch to a "one gene, several polypeptides" mode of operation, and the regulatory function of genes would clearly be lost.

d. Polarity.

Information is read from DNA and mRNA in one direction only. Polarity is important for determining higher-order structures (secondary, tertiary, etc.). Earlier we discussed how lower-order structures determine higher-order ones. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on the ribosome (if a polypeptide is being translated).

Therefore, the unidirectional reading of information (during the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized molecule, but also for the strict determination of its secondary, tertiary and higher structures.

e. Non-overlapping.

The code may be overlapping or non-overlapping. Most organisms have a non-overlapping code. Overlapping code is found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot simultaneously be a nucleotide of another codon. If the code were overlapping, a sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine), as with a non-overlapping code (Fig. 33, A), but three (if adjacent codons share one nucleotide) (Fig. 33, B) or five (if they share two nucleotides) (see Fig. 33, C). In the last two cases, a mutation of any single nucleotide would disrupt two, three or more amino acids in the sequence.

However, it has been established that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument that the code is non-overlapping.

Let us explain this using Figure 34. Bold lines show the triplets encoding amino acids in the non-overlapping and overlapping cases. Experiments have clearly shown that the genetic code is non-overlapping. Without going into the details of the experiment, note what happens if the third nucleotide of the sequence, U (marked with an asterisk; see Fig. 34), is replaced by some other nucleotide:

1. With a non-overlapping code, the protein controlled by this sequence would have a substitution of one (first) amino acid (marked with asterisks).

2. With an overlapping code in option A, a substitution would occur in two (first and second) amino acids (marked with asterisks). Under option B, the replacement would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is disrupted, the disruption in the protein always affects only one amino acid, which is typical for a non-overlapping code.

Variant:         A                  B                    C
Sequence:        GCUGCUG            GCUGCUG              GCUGCUG
Triplets read:   GCU GCU            GCU UGC CUG          GCU CUG UGC GCU CUG
Affected:        ***                *** ***              *** *** ***
Amino acids:     Ala - Ala          Ala - Cys - Leu      Ala - Leu - Cys - Ala - Leu
Code type:       Non-overlapping    Overlapping          Overlapping

Fig. 34. A diagram explaining the presence of a non-overlapping code in the genome (explanation in the text).
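The three reading schemes in the diagram can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the step parameter (3 for a non-overlapping code, 2 when neighbouring triplets share one nucleotide, 1 when they share two) and the function names are introduced here and do not come from the text.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def read_triplets(seq, step):
    """Split `seq` into triplets whose start positions are `step` apart."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]

def triplets_containing(seq, step, position):
    """Indices of the triplets that include the nucleotide at 1-based `position`."""
    starts = range(0, len(seq) - 2, step)
    return [k for k, start in enumerate(starts) if start < position <= start + 3]

SEQ = "GCUGCUG"
for step in (3, 2, 1):
    triplets = read_triplets(SEQ, step)
    peptide = "".join(CODON_TABLE[t] for t in triplets)
    hit = triplets_containing(SEQ, step, 3)   # mutate the third nucleotide
    print(step, triplets, peptide, "amino acids affected:", len(hit))
```

With step 3 a change in the third nucleotide touches one amino acid, with step 2 it touches two, and with step 1 it touches three, which is the contrast the experiments relied on.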

The non-overlapping nature of the genetic code is associated with another property: reading begins from a definite point, the initiation signal. In mRNA this initiation signal is the AUG codon, which encodes methionine.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

f. Compactness.

There is no punctuation between codons. In other words, triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of “punctuation marks” in the genetic code has been proven in experiments.
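One consequence of a code without punctuation is that the reader can only count off triplets from a starting point, so inserting a nucleotide shifts every triplet downstream of it, while three insertions bring the reading back into register. The Python sketch below illustrates this; the message, the insertion points and the helper names are hypothetical and chosen only for the demonstration.

```python
def codons(seq):
    """Read `seq` as consecutive triplets starting from the first nucleotide."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def insert(seq, index, base):
    """Insert `base` before the nucleotide at 0-based `index`."""
    return seq[:index] + base + seq[index:]

def shared_tail(a, b):
    """Count how many triplets at the end of two readings are identical."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

# Hypothetical message: AUG GCU CAU GAA UGG UAA
original = "AUGGCUCAUGAAUGGUAA"
one = insert(original, 3, "C")                    # one inserted nucleotide
three = insert(insert(one, 7, "C"), 11, "C")      # three inserted nucleotides

print(codons(original))
print(codons(one), shared_tail(codons(original), codons(one)))
print(codons(three), shared_tail(codons(original), codons(three)))
```

After a single insertion no downstream triplet survives; after the third insertion the last triplets of the message are read as before.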

g. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that all bacterial and eukaryotic genomes use the same set of codon assignments. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which is read there in the same way as the codon UGG, i.e. as the amino acid tryptophan. Other, rarer deviations from universality were also found.
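The effect of such a reassignment can be illustrated by translating one message with two tables. In the Python sketch below, the vertebrate mitochondrial table is built from the standard one by applying the reassignments commonly tabulated for it (UGA read as tryptophan, AUA as methionine, AGA and AGG as stop); the message and the function name translate are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed reassignments for vertebrate mitochondria, applied on top of the standard table.
VERT_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}

def translate(mrna, table):
    """Translate from the first nucleotide until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical message: AUG UGA AUA GCU UAA
message = "AUGUGAAUAGCUUAA"
print(translate(message, STANDARD))    # stops at UGA
print(translate(message, VERT_MITO))   # reads through UGA as tryptophan
```

The same nucleotide sequence therefore yields a one-residue product under the standard table and a four-residue product under the mitochondrial one.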

DNA code system.

The genetic code of DNA consists of 64 triplets of nucleotides. These triplets are called codons. All but three of them code for one of the 20 amino acids used in protein synthesis. This gives some redundancy in the code: most amino acids are coded for by more than one codon.
One codon, AUG, performs two interrelated functions: it signals the beginning of translation and encodes the inclusion of the amino acid methionine (Met) in the growing polypeptide chain. The DNA coding system is designed so that the genetic code can be expressed either as RNA codons or as DNA codons. RNA codons occur in messenger RNA (mRNA) and are the codons actually read during the synthesis of polypeptides (a process called translation). Each mRNA molecule, however, acquires its nucleotide sequence by transcription from the corresponding gene.

All but two amino acids (Met and Trp) can be encoded by 2 to 6 different codons. However, the genomes of most organisms show that certain codons are favored over others. In humans, for example, alanine is encoded by GCC four times more often than by GCG. This probably reflects more efficient handling of some codons by the translation apparatus (for example, the ribosome).

The genetic code is almost universal. The same codons are assigned to the same amino acids, and the same start and stop signals are used, in the overwhelming majority of animals, plants and microorganisms. However, some exceptions have been found. Most involve assigning one or two of the three stop codons to an amino acid.

Nucleotide bases of DNA and RNA
  1. Purines: adenine, guanine
  2. Pyrimidines: cytosine, thymine (uracil in RNA)

Codon: a triplet of nucleotides encoding a specific amino acid.

Table 1. Amino acids that are commonly found in proteins
No.  Name            Abbreviation
1.   Alanine         Ala
2.   Arginine        Arg
3.   Asparagine      Asn
4.   Aspartic acid   Asp
5.   Cysteine        Cys
6.   Glutamic acid   Glu
7.   Glutamine       Gln
8.   Glycine         Gly
9.   Histidine       His
10.  Isoleucine      Ile
11.  Leucine         Leu
12.  Lysine          Lys
13.  Methionine      Met
14.  Phenylalanine   Phe
15.  Proline         Pro
16.  Serine          Ser
17.  Threonine       Thr
18.  Tryptophan      Trp
19.  Tyrosine        Tyr
20.  Valine          Val

The genetic code, also called the amino acid code, is a system for recording information about the sequence of amino acids in a protein using the sequence of nucleotide residues in DNA, each containing one of 4 nitrogenous bases: adenine (A), guanine (G), cytosine (C) or thymine (T). However, since the double-stranded DNA helix is not directly involved in synthesizing the protein encoded by one of its strands (this is done through RNA), the code is written in the language of RNA, which contains uracil (U) instead of thymine. For the same reason, the code is customarily described as a sequence of nucleotides rather than of nucleotide pairs.

The genetic code is represented by certain code words, called codons.

The first code word was deciphered by Nirenberg and Matthaei in 1961. From E. coli they obtained an extract containing ribosomes and the other factors necessary for protein synthesis. The result was a cell-free system for protein synthesis, which could assemble proteins from amino acids if the necessary mRNA was added to the medium. By adding synthetic RNA consisting only of uracil residues, they discovered that a protein was formed consisting only of phenylalanine (polyphenylalanine). Thus it was established that the nucleotide triplet UUU (a codon) corresponds to phenylalanine. Over the next 5-6 years, all codons of the genetic code were determined.

The genetic code is a kind of dictionary that translates text written with four nucleotides into protein text written with 20 amino acids. The remaining amino acids found in protein are modifications of one of the 20 amino acids.

Properties of the genetic code

The genetic code has the following properties.

  1. Triplet nature - each amino acid corresponds to a triplet of nucleotides. It is easy to calculate that there are 4^3 = 64 codons. Of these, 61 are sense codons and 3 are nonsense (termination, or stop) codons.
  2. Continuity (no separating marks between nucleotides) - absence of intragenic punctuation marks;

    Within a gene, each nucleotide is part of a meaningful codon. In 1961 Francis Crick, Sydney Brenner and colleagues experimentally demonstrated the triplet nature of the code and its continuity (compactness).

    The essence of the experiment: a "+" mutation is the insertion of one nucleotide; a "-" mutation is the loss of one nucleotide.

    A single mutation ("+" or "-") at the beginning of a gene, or a double mutation of the same sign ("++" or "--"), spoils the entire gene.

    A triple mutation of the same sign ("+++" or "---") at the beginning of a gene spoils only part of the gene.

    A quadruple mutation of the same sign again spoils the entire gene.

    The experiment was carried out on two adjacent phage genes and showed that

    1. the code is triplet and there is no punctuation inside the gene
    2. there are punctuation marks between genes
  3. Presence of intergenic punctuation marks - among the triplets there are initiating codons (which begin protein biosynthesis) and terminator codons (which mark the end of protein biosynthesis);

    By convention, the AUG codon, the first after the leader sequence, is also counted among the punctuation marks. It functions like a capital letter. In this position it encodes formylmethionine (in prokaryotes).

    At the end of each gene encoding a polypeptide there is at least one of the 3 stop codons, or stop signals: UAA, UAG, UGA. They terminate translation.

  4. Colinearity - the linear sequence of codons in mRNA corresponds to the linear sequence of amino acids in the protein.
  5. Specificity - each amino acid corresponds only to certain codons, which cannot be used for another amino acid.
  6. Unidirectionality - codons are read in one direction, from the first nucleotide to the subsequent ones.
  7. Degeneracy or redundancy - one amino acid can be encoded by several triplets (there are 20 amino acids and 64 possible triplets, 61 of them sense codons, so on average each amino acid corresponds to about 3 codons); the exceptions are methionine (Met) and tryptophan (Trp).

    The reason for the degeneracy of the code is that the main semantic load is carried by the first two nucleotides of the triplet, and the third is less important. Hence the code degeneracy rule: if two codons have the same first two nucleotides and their third nucleotides belong to the same class (purine or pyrimidine), then they code for the same amino acid.

    However, there are two exceptions to this ideal rule. These are the AUA codon, which by the rule should correspond to methionine but encodes isoleucine, and the UGA codon, which by the rule should correspond to tryptophan but is a stop codon. The degeneracy of the code obviously has adaptive significance.

  8. Universality - all of the above properties of the genetic code are characteristic of all living organisms.
    Codon   Universal code   Mitochondrial codes
                             Vertebrates   Invertebrates   Yeast   Plants
    UGA     STOP             Trp           Trp             Trp     STOP
    AUA     Ile              Met           Met             Met     Ile
    CUA     Leu              Leu           Leu             Thr     Leu
    AGA     Arg              STOP          Ser             Arg     Arg
    AGG     Arg              STOP          Ser             Arg     Arg

    The principle of code universality was shaken by Barrell's discovery in 1979 of the ideal code of human mitochondria, in which the rule of code degeneracy is satisfied. In the mitochondrial code, the UGA codon corresponds to tryptophan and AUA to methionine, as required by the code degeneracy rule.

    Perhaps at the beginning of evolution, all simple organisms had the same code as mitochondria, and then it underwent slight deviations.

  9. Non-overlapping - each triplet of the genetic text is independent of the others, and each nucleotide belongs to only one triplet. The figure shows the difference between overlapping and non-overlapping code.

    In 1976 the DNA of phage φX174 was sequenced. It has single-stranded circular DNA consisting of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

    It turned out that there is an overlap. Gene E is located entirely within gene D. Its start codon appears as a result of a frame shift of one nucleotide. Gene J begins where gene D ends: the start codon of gene J overlaps the stop codon of gene D as a result of a two-nucleotide shift. This arrangement is called a "reading frame shift" by a number of nucleotides that is not a multiple of three. To date, overlap has only been shown for a few phages.

  10. Noise immunity - the ratio of the number of conservative substitutions to the number of radical substitutions.

    Nucleotide substitution mutations that do not lead to a change in the class of the encoded amino acid are called conservative. Nucleotide substitution mutations that lead to a change in the class of the encoded amino acid are called radical.

    Since the same amino acid can be encoded by different triplets, some substitutions in triplets do not lead to a change in the encoded amino acid (for example, UUU -> UUC leaves phenylalanine). Some substitutions change an amino acid to another from the same class (non-polar, polar, basic, acidic), other substitutions also change the class of the amino acid.

    In each triplet, 9 single substitutions can be made: there are three ways to choose which position to change (1st, 2nd or 3rd), and the selected letter (nucleotide) can be changed to 4 - 1 = 3 other letters. The total number of possible nucleotide substitutions in the 61 sense codons is 61 x 9 = 549.

    By direct calculation using the genetic code table, you can verify that of these: 23 nucleotide substitutions lead to the appearance of codons - translation terminators. 134 substitutions do not change the encoded amino acid. 230 substitutions do not change the class of the encoded amino acid. 162 substitutions lead to a change in amino acid class, i.e. are radical. Of the 183 substitutions of the 3rd nucleotide, 7 lead to the appearance of translation terminators, and 176 are conservative. Of the 183 substitutions of the 1st nucleotide, 9 lead to the appearance of terminators, 114 are conservative and 60 are radical. Of the 183 substitutions of the 2nd nucleotide, 7 lead to the appearance of terminators, 74 are conservative, 102 are radical.
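Several of the figures in the last item can be recomputed from the standard codon table. The Python sketch below counts all single-nucleotide substitutions in the 61 sense codons and sorts them into those that create a stop codon, those that leave the amino acid unchanged, and the rest. It does not attempt the conservative versus radical split, because that depends on an assignment of amino acids to classes that the text does not spell out; the function name and the result layout are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def count_substitutions():
    """Classify every single-nucleotide substitution in every sense codon."""
    totals = {"total": 0, "stop": 0, "synonymous": 0, "missense": 0}
    stops_by_position = [0, 0, 0]   # 1st, 2nd and 3rd nucleotide of the codon
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue                # only the 61 sense codons are mutated
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_amino_acid = CODON_TABLE[mutant]
                totals["total"] += 1
                if new_amino_acid == "*":
                    totals["stop"] += 1
                    stops_by_position[pos] += 1
                elif new_amino_acid == amino_acid:
                    totals["synonymous"] += 1
                else:
                    totals["missense"] += 1
    return totals, stops_by_position

totals, stops_by_position = count_substitutions()
print(totals)
print(stops_by_position)
```

The missense count equals the sum of the 230 class-preserving and 162 radical substitutions quoted above, so the sketch agrees with the text on every figure it can check.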


The leading scientific journal Nature reported the discovery of a second genetic code, a kind of "code within a code", that was recently cracked by molecular biologists and computer programmers. Moreover, to identify it they used not evolutionary theory but information technology.

The new code is called the splicing code. It is located inside DNA. This code controls the underlying genetic code in a very complex yet predictable manner. The splicing code controls how and when genes and regulatory elements are assembled. Unraveling this code within a code helps shed light on some long-standing mysteries of genetics that have surfaced since the Human Genome Project. One of these mysteries was why such a complex organism as the human has only 20,000 genes. (Scientists expected to find many more.) Why are genes broken up into segments (exons), which are separated by noncoding elements (introns) and then joined together (that is, spliced) after transcription? And why are genes turned on in some cells and tissues but not in others? For two decades molecular biologists tried to work out the mechanisms of genetic regulation. This article marks a very important point in understanding what is really happening. It does not answer all the questions, but it does demonstrate that the internal code exists. This code is a system of information transmission that can be deciphered so clearly that scientists could predict, with remarkable precision, how the genome might behave in certain situations.

Imagine that you hear an orchestra in the next room. You open the door, look inside and see only three or four musicians playing their instruments. This, says Brendan Frey, who helped break the code, is what the human genome looks like. He says: “We could only detect 20,000 genes, but we knew they made up a huge number of protein products and regulatory elements. How? One method is called alternative splicing.” Different exons (parts of genes) can be assembled in different ways. “For example, three genes for the protein neurexin can create more than 3,000 genetic messages that help control the brain’s wiring,” says Frey. The article goes on to say that about 95% of our genes undergo alternative splicing, and that in most cases the transcripts (RNA molecules produced by transcription) are expressed differently in different types of cells and tissues. There must be something that controls how these thousands of combinations are assembled and expressed. This is the task of the splicing code.

Readers who want a quick overview of the discovery can read the Science Daily article "Researchers Who Cracked the 'Splicing Code' Uncover the Mystery Behind Biological Complexity". It says: “Scientists at the University of Toronto have gained fundamental new insights into how living cells use a limited number of genes to form incredibly complex organs like the brain.” Nature's own coverage begins with an article by Heidi Ledford, “Code Within Code.” This is followed by a paper by Tejedor and Valcárcel entitled “Gene Regulation: Cracking the Second Genetic Code.” Finally, the centerpiece is a paper by a team of researchers from the University of Toronto led by Benjamin J. Blencowe and Brendan J. Frey, “Cracking the Splicing Code.”

This article is a victory for information science, reminiscent of the codebreakers of World War II. The researchers' methods included algebra, geometry, probability theory, vector calculus, information theory, program code optimization, and other advanced techniques. What they did not need was evolutionary theory, which was never mentioned in the scientific articles. Reading the paper's opening summary, you can sense how much effort lies behind it:

“We describe a 'splicing code' scheme that uses combinations of hundreds of RNA properties to predict tissue-specific changes in the alternative splicing of thousands of exons. The code establishes new classes of splicing patterns, recognizes different regulatory programs in different tissues, and establishes mutation-controlled regulatory sequences. We have uncovered widespread regulatory strategies, including: the use of unexpectedly large pools of properties; the identification of low levels of exon inclusion that are attenuated by the properties of specific tissues; the appearance of properties deeper in introns than previously thought; and the modulation of splice variant levels by structural characteristics of the transcript. The code helped identify a class of exons whose inclusion silences expression in adult tissues by activating mRNA degradation, and whose exclusion promotes expression during embryogenesis. The code facilitates the discovery and detailed description of regulated genome-wide alternative splicing events.”

The team that cracked the code included specialists from the Department of Electronic and Computer Engineering as well as from the Department of Molecular Genetics. (Frey himself works for Microsoft Research, a division of Microsoft Corporation.) Like the codebreakers of yesteryear, Frey and Barash developed "a new method of computer-assisted biological analysis that detects 'code words' hidden within the genome". Using massive amounts of data generated by molecular geneticists, the team reverse-engineered the splicing code until they could predict how it would act. Once the researchers had that figured out, they tested the code against mutations and saw how exons were included or excluded. They found that the code could even cause tissue-specific changes or act differently depending on whether the mouse was an adult or an embryo. One gene, Xpo4, is associated with cancer. The researchers noted: “These data support the conclusion that Xpo4 gene expression must be strictly controlled to avoid possible deleterious consequences, including tumorigenesis (cancer), since it is active during embryogenesis but is reduced in abundance in adult tissues.” It turns out that they were genuinely surprised by the level of control they saw. Intentionally or not, Frey used the language of intelligent design, rather than random variation and selection, as a clue. He noted: “Understanding a complex biological system is like understanding a complex electronic circuit.”

Heidi Ledford said that the apparent simplicity of the Watson-Crick genetic code, with its four bases, triplet codons, 20 amino acids and 64 DNA "characters", hides a whole world of complexity underneath. Enclosed within this simpler code, the splicing code is much more complex.

But between DNA and proteins lies RNA, a world of complexity all its own. RNA is a transformer that sometimes carries genetic messages and sometimes controls them, involving many structures that can influence its function. In a paper published in the same issue, a team of researchers led by Benjamin J. Blencowe and Brendan J. Frey from the University of Toronto in Ontario, Canada, report efforts to unravel a second genetic code that can predict how segments of messenger RNA transcribed from a specific gene can mix and match to form a variety of products in different tissues. This process is known as alternative splicing. This time there is no simple table; instead there are algorithms that combine more than 200 different properties of DNA with determinations of RNA structure.

The work of these researchers points to the rapid progress that computational methods have made in modeling RNA. In addition to understanding alternative splicing, computer science helps scientists predict RNA structures and identify small regulatory pieces of RNA that do not code for proteins. “It's a wonderful time,” says Christopher Burge, a computational biologist at the Massachusetts Institute of Technology in Cambridge. “We will have great success in the future.”

Computer science, computational biology, algorithms and codes: these concepts were not part of Darwin's vocabulary when he developed his theory. Mendel had a very simplified model of how traits are distributed during inheritance. Moreover, the idea that traits are encoded was introduced only in 1953. We now see that the original genetic code is regulated by an even more complex code embedded within it. These are revolutionary ideas. What is more, there is every sign that this level of control is not the last. Ledford reminds us that RNA and proteins, for example, have a three-dimensional structure. The functions of molecules can change when their shape changes, so there must be something that controls folding so that the three-dimensional structure does what the function requires. In addition, access to genes appears to be controlled by yet another code, the histone code. This code is carried by molecular markers, or “tails”, on the histone proteins that serve as spools around which DNA is wound and supercoiled. Describing our times, Ledford speaks of a "continuous renaissance in RNA informatics".

Tejedor and Valcárcel agree that complexity lies behind the simplicity. “The concept is very simple: DNA makes RNA, which then makes protein,” they begin their article. “But in reality everything is much more complicated.” In the 1950s, we learned that all living organisms, from bacteria to humans, share a basic genetic code. But we soon realized that complex organisms (eukaryotes) have a strange and hard-to-understand property: their genomes contain peculiar sections, introns, that must be removed so that the exons can join together. Why? Today the fog is clearing: “The main advantage of this mechanism is that it allows different cells to choose alternative ways of splicing the precursor messenger RNA (pre-mRNA), so that one gene forms different messages,” they explain, “and the different mRNAs can then code for different proteins with various functions.” You get more information out of less code, provided there is this other code inside the code that knows how to do it.

What makes breaking the splicing code so difficult is that exon assembly depends on many factors: sequences located near exon boundaries, intron sequences, and regulatory factors that either help or inhibit the splicing machinery. Besides, “the effects of a particular sequence or factor may vary depending on its location relative to intron-exon boundaries or other regulatory motifs,” Tejedor and Valcárcel explain. “Therefore, the greatest challenge in predicting tissue-specific splicing is calculating the algebra of the myriad motifs and the relationships among the regulatory factors that recognize them.”

To solve this problem, the team fed a huge amount of data about RNA sequences, and the conditions under which they were formed, into a computer. “The computer was then tasked with identifying the combination of properties that would best explain the experimentally established tissue-specific selection of exons.” In other words, the researchers reverse-engineered the code. Like the codebreakers of World War II, once scientists know the algorithm, they can make predictions: “It correctly and accurately identified alternative exons and predicted their differential regulation between pairs of tissue types.” And just like any good scientific theory, the discovery provided new insight: “This allowed us to provide new insight into previously identified regulatory motifs and pointed to previously unknown properties of known regulators, as well as unexpected functional connections between them,” the researchers noted. “For example, the code implies that the inclusion of exons leading to processed proteins is a common mechanism controlling the process of gene expression during the transition from embryonic tissue to adult tissue.”

Tejedor and Valcárcel consider the publication of this work an important first step: “The work... is better viewed as the discovery of the first fragment of a much larger Rosetta Stone needed to decipher the alternative messages of our genome.” According to these scientists, future research will undoubtedly improve our knowledge of this new code. At the conclusion of their article, they briefly mention evolution, and they do so in a very unusual way. They say this does not mean that evolution created these codes; it means that progress will require understanding how the codes interact. Another surprise is that the degree of conservation observed to date raises the question of the possible existence of “species-specific codes.”

The code probably operates in every single cell and therefore must account for more than 200 types of mammalian cells. It must also cope with a huge variety of alternative splicing patterns, not to mention simple decisions about whether to include or skip a single exon. The limited evolutionary conservation of alternative splicing regulation (estimated to be about 20% between humans and mice) raises the question of the existence of species-specific codes. Moreover, the coupling between RNA processing and gene transcription influences alternative splicing, and recent evidence implicates DNA packaging by histone proteins and covalent modifications of histones (the so-called epigenetic code) in regulating splicing. Therefore, future methods will have to establish the precise interaction between the histone code and the splicing code. The same applies to the still little understood influence of complex RNA structures on alternative splicing.

Codes, codes and more codes. The fact that scientists say virtually nothing about Darwinism in these articles indicates that evolutionary theorists who adhere to old ideas and traditions will have a lot to think about after reading them. But those who are enthusiastic about the biology of codes will find themselves at the forefront. They have a great opportunity to take advantage of the exciting web application that the codebreakers have created to encourage further research. It can be found on the University of Toronto website under the name Alternative Splicing Prediction Website. Visitors will look in vain for any mention of evolution there, despite the old axiom that nothing in biology makes sense without it. The new 2010 version of this expression might sound like this: “Nothing in biology makes sense unless viewed in the light of computer science.”

Links and notes

We're glad we were able to tell you about this story the day it was published. This may be one of the most significant scientific articles of the year. (Of course, every big discovery made by other groups of scientists, like Watson and Crick's, is significant.) The only thing we can say to this is: “Wow!” This discovery is a remarkable confirmation of Creation by design and a huge challenge to the Darwinian empire. I wonder how evolutionists will try to correct their simplistic story of random mutation and natural selection, which dates back to the 19th century, in light of these new data.

Do you understand what Tejedor and Valcarcel are talking about? Species can have their own code, unique only to these species. “It will therefore be up to future methods to establish the precise interaction between the histone [epigenetic] code and the splicing code,” they note. Translated, this means: “Darwinists have nothing to do with this. They just can't handle it." If the simple Watson-Crick genetic code was a problem for Darwinians, what would they now say about a splicing code that creates thousands of transcripts from the same genes? How do they cope with the epigenetic code that controls gene expression? And who knows, maybe in this incredible “interaction”, which we are just beginning to learn about, other codes are involved, reminiscent of the Rosetta Stone, just beginning to emerge from the sand?

Now, when we think about codes and computer science, we begin to think about different paradigms for new research. What if the genome acts in part as a storage network? What if it involves cryptography or compression algorithms? We should remember about modern information systems and information storage technologies. We may even discover elements of steganography. There are undoubtedly additional mechanisms of resistance, such as duplications and corrections, that may help explain the existence of pseudogenes. Copies of the entire genome may be a response to stress. Some of these phenomena may be useful indicators of historical events that have nothing to do with a universal common ancestor, but help explore comparative genomics within computer science and resistance design, and help understand the cause of disease.

Evolutionists find themselves in great difficulty. Researchers tried to modify the code, but all they got was cancer and mutations. How are they going to navigate the fitness landscape if it is mined with disasters waiting in the wings as soon as someone starts interfering with these inextricably linked codes? We know that there is some built-in stability and portability, but the whole picture is an incredibly complex, designed, optimized information system, not a haphazard collection of parts that can be played with endlessly. The whole idea of a code is a concept of intelligent design.

A. E. Wilder-Smith attached particular importance to this. A code presupposes an agreement between two parties. Such an agreement is made in advance; it involves planning and purpose. We use the SOS symbol, as Wilder-Smith would say, by convention as a distress signal. SOS doesn't look like a disaster. It doesn't smell like a disaster. It doesn't feel like a disaster. People would not understand that these letters represent disaster if they did not understand the essence of the agreement itself. Likewise, the codon for alanine, GCC, does not look, smell or feel like alanine. The codon would have nothing to do with alanine unless there was a pre-established agreement between the two coding systems (the protein code and the DNA code) that "GCC must mean alanine." To convey this agreement, a family of transducers, the aminoacyl-tRNA synthetases, is used, which translate one code into the other.

This should have strengthened design theory back in the 1950s, and many creationists preached it effectively. But evolutionists are like smooth-talking salesmen. They spun their tales about the fairy Tinker Bell, who breaks the code and creates new species through mutation and selection, and convinced many people that miracles can still happen today. Well, today we are in the 21st century, and we know of the epigenetic code and the splicing code, two codes that are much more complex and dynamic than the simple DNA code. We know about codes within codes, about codes above codes and below codes; we know a whole hierarchy of codes. This time the evolutionists cannot simply bluff their way out with beautiful speeches when guns are trained on them from both sides, an entire arsenal aimed at the main elements of their construction. It's all a game. A whole era of computer science has grown up around them; they have long gone out of fashion and look like ancient Greeks trying to attack modern tanks and helicopters with spears.

It's sad to say, but evolutionists don't understand this, or even if they do, they're not going to give up. By the way, this week, just as the article on the Splicing Code was published, the most angry and hateful rhetoric against creationism and intelligent design in recent memory poured out from the pages of pro-Darwin magazines and newspapers. We are yet to hear of many more similar examples. And as long as they hold the microphones and control the institutions, many people will fall for their bait, thinking that science continues to give them good reason. We tell you all this so that you will read this material, study it, understand it, and equip yourself with the information you need to defeat this bigoted, misleading nonsense with the truth. Now, go ahead!

The genetic code is a way of encoding the sequence of amino acids in a protein molecule using the sequence of nucleotides in a nucleic acid molecule. The properties of the genetic code arise from the characteristics of this coding.

Each amino acid of a protein is matched to three consecutive nucleotides of the nucleic acid, called a triplet or codon. Each nucleotide can contain one of four nitrogenous bases. In RNA these are adenine (A), uracil (U), guanine (G) and cytosine (C). By combining the nitrogenous bases (in this case, the nucleotides containing them) in different ways, you can get many different triplets: AAA, GAU, UCC, GCA, AUC, etc. The total number of possible combinations is 64, i.e. 4^3.

The proteins of living organisms contain about 20 amino acids. If nature had “planned” to encode each amino acid not with three but with two nucleotides, the variety of such pairs would not be enough, since there would be only 16 of them, i.e. 4^2.

Thus, the main property of the genetic code is its triplet nature (triplicity). Each amino acid is encoded by a triplet of nucleotides.

Since there are significantly more possible triplets than amino acids used in biological molecules, living nature has realized the following property: the redundancy of the genetic code. Many amino acids are encoded not by one codon but by several. For example, the amino acid glycine is encoded by four different codons: GGU, GGC, GGA, GGG. Redundancy is also called degeneracy.
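Redundancy is easy to see by grouping the 64 codons by the amino acid they encode. The following is a minimal sketch that assumes the standard genetic code in one-letter amino acid notation; `CODON_TABLE` and `codons_by_amino_acid` are names made up for this illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (assumed); "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_by_amino_acid():
    """Return a dict mapping each amino acid letter to the list of its codons."""
    groups = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        groups[aa].append(codon)
    return dict(groups)


groups = codons_by_amino_acid()
print(groups["G"])  # the glycine codons: GGU, GGC, GGA, GGG
for aa, codons in sorted(groups.items(), key=lambda item: len(item[1])):
    print(aa, len(codons), codons)
```

In the standard table only methionine (M) and tryptophan (W) have a single codon, while leucine, serine and arginine have six each.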

The correspondence between amino acids and codons is shown in tables. For example, these:

In relation to nucleotides, the genetic code has the following property: unambiguity (or specificity). Each codon corresponds to only one amino acid. For example, the GGU codon can only code for glycine and no other amino acid.

Again. Redundancy means that several triplets can code for the same amino acid. Specificity - each specific codon can code for only one amino acid.

There are no special punctuation marks in the genetic code (except for stop codons, which indicate the end of polypeptide synthesis). The function of punctuation marks is performed by the triplets themselves: the end of one means that the next one begins. This implies two further properties of the genetic code: continuity and non-overlapping. Continuity means that triplets are read immediately one after another. Non-overlapping means that each nucleotide can be part of only one triplet, so the first nucleotide of the next triplet always comes after the third nucleotide of the previous triplet. A codon cannot begin with the second or third nucleotide of the preceding codon. In other words, the code does not overlap.

The genetic code has the property of universality. It is the same for all organisms on Earth, which indicates the unity of the origin of life. There are very rare exceptions to this. For example, some triplets in mitochondria and chloroplasts encode amino acids other than their usual ones. This may suggest that at the dawn of life there were slightly different variations of the genetic code.

Finally, the genetic code has noise immunity, which is a consequence of its redundancy. Point mutations, which sometimes occur in DNA, usually result in the replacement of one nitrogenous base with another. This changes the triplet. For example, it was AAA, but after the mutation it became AAG. However, such changes do not always lead to a change in the amino acid in the synthesized polypeptide, since both triplets, due to the redundancy of the genetic code, can correspond to the same amino acid. Considering that mutations are often harmful, noise immunity is a useful property.

The genetic, or biological, code is one of the universal properties of living nature, proving the unity of its origin. Genetic code is a method of encoding the sequence of amino acids of a polypeptide using a sequence of nucleic acid nucleotides (messenger RNA or a complementary DNA section on which mRNA is synthesized).

There are other definitions.

The genetic code is the correspondence of each amino acid (a component of the proteins of living organisms) to a specific sequence of three nucleotides. The genetic code is the relationship between nucleic acid bases and protein amino acids.

In the scientific literature, the genetic code does not mean the sequence of nucleotides in the DNA of an organism that determines its individuality.

It is incorrect to assume that one organism or species has one code, and another has another. The genetic code is how amino acids are encoded by nucleotides (i.e. principle, mechanism); it is universal for all living things, the same for all organisms.

Therefore, it is incorrect to say, for example, “The genetic code of a person” or “The genetic code of an organism,” which is often used in pseudo-scientific literature and films.

In these cases, we usually mean the genome of a person, an organism, etc.

The diversity of living organisms and the characteristics of their life activity is primarily due to the diversity of proteins.

The specific structure of a protein is determined by the order and number of the various amino acids that make it up. The amino acid sequence of a peptide is encoded in DNA using the biological code. In terms of the diversity of its monomers, DNA is a more primitive molecule than a peptide: DNA consists of different arrangements of only four alternating nucleotides. This long prevented researchers from considering DNA as the material of heredity.

How are amino acids coded by nucleotides?

1) Nucleic acids (DNA and RNA) are polymers made up of nucleotides.

Each nucleotide can contain one of four nitrogenous bases: adenine (A), guanine (G), cytosine (C), thymine (T). In the case of RNA, thymine is replaced by uracil (U).

When considering the genetic code, only nitrogenous bases are taken into account.

Then the DNA chain can be represented as their linear sequence. For example:

The mRNA section complementary to this sequence will look like this:
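The complementarity rule itself can be written as a small lookup table. The sketch below uses a hypothetical DNA template strand, not a sequence taken from this text, and the function name `transcribe_template` is invented for the illustration. It pairs bases one by one (A with U, T with A, G with C, C with G) and deliberately ignores strand direction.

```python
# Base pairing between a DNA template strand and the mRNA built on it.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(dna_template):
    """Return the mRNA complementary to a DNA template, base by base."""
    return "".join(DNA_TO_MRNA[base] for base in dna_template.upper())


template = "TACCTAAGT"  # hypothetical example sequence
print(template)
print(transcribe_template(template))  # AUGGAUUCA
```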

2) Proteins (polypeptides) are polymers consisting of amino acids.

In living organisms, 20 amino acids are used to build polypeptides (a few more are very rare). To designate them, you can also use one letter (although more often they use three - an abbreviation for the name of the amino acid).

The amino acids in a polypeptide are also connected linearly by a peptide bond. For example, suppose there is a section of a protein with the following sequence of amino acids (each amino acid is designated by one letter):

3) If the task is to encode each amino acid using nucleotides, then it comes down to how to encode 20 letters using 4 letters.

This can be done by matching letters of a 20-letter alphabet with words made up of several letters of a 4-letter alphabet.

If one amino acid is encoded by one nucleotide, then only four amino acids can be encoded.

If each amino acid is associated with two consecutive nucleotides in the RNA chain, then sixteen amino acids can be encoded.

Indeed, if there are four letters (A, U, G, C), then the number of their different pair combinations will be 16: (AU, UA), (AG, GA), (AC, CA), (UG, GU), (UC, CU), (GC, CG), (AA, UU, GG, CC).

[Brackets are used for ease of perception.] This means that only 16 different amino acids can be encoded with such a code (a two-letter word): each will have its own word (two consecutive nucleotides).

From mathematics, the formula for the number of combinations looks like this: a^b = n.

Here n is the number of different combinations, a is the number of letters in the alphabet (or the base of the number system), and b is the number of letters in the word (or digits in the number). If we substitute the 4-letter alphabet and two-letter words into this formula, we get 4^2 = 16.

If three consecutive nucleotides are used as the code word for each amino acid, then 4^3 = 64 different combinations are available, since 64 different words can be made from four letters taken in groups of three (for example, AUG, GAA, CAU, GGU, etc.). This is already more than enough to encode 20 amino acids.
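The counting argument above can be checked by listing the words directly. This is a small sketch using Python's standard `itertools.product`; the helper names `count_words` and `enumerate_words` are made up for the illustration.

```python
from itertools import product

ALPHABET = "AUGC"  # the four RNA bases


def count_words(alphabet, length):
    """Number of words of the given length: a ** b in the formula above."""
    return len(alphabet) ** length


def enumerate_words(alphabet, length):
    """List every word of the given length over the alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


for length in (1, 2, 3):
    words = enumerate_words(ALPHABET, length)
    # The explicit listing always agrees with the formula.
    print(length, count_words(ALPHABET, length), len(words))
```

Words of one or two letters give 4 and 16 combinations, fewer than the 20 amino acids, while three letters give 64.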

It is exactly this three-letter code that is used in the genetic code. Three consecutive nucleotides coding for one amino acid are called a triplet (or codon).

Each amino acid is associated with a specific triplet of nucleotides.

In addition, since the number of possible triplets exceeds the number of amino acids, many amino acids are encoded by several triplets.

Three triplets do not code for any of the amino acids (UAA, UAG, UGA).

They indicate the end of translation and are called stop codons (or nonsense codons).

The AUG triplet encodes not only the amino acid methionine, but also initiates translation (plays the role of a start codon).

Below are tables of the correspondence between amino acids and nucleotide triplets.

Using the first table, it is convenient to determine the amino acid corresponding to a given triplet; using the second, the triplets corresponding to a given amino acid.

Let's consider an example of the implementation of a genetic code. Let there be an mRNA with the following content:

Let's split the nucleotide sequence into triplets:

Let us associate each triplet with the amino acid of the polypeptide it encodes:

Methionine - Aspartic acid - Serine - Threonine - Tryptophan - Leucine - Leucine - Lysine - Asparagine - Glutamine

The last triplet is a stop codon.
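The same steps (split into triplets, look each one up, stop at a stop codon) can be sketched in code. The mRNA below is a hypothetical sequence chosen only because it encodes the peptide listed above; it is not necessarily the sequence of the original example. The sketch assumes the standard genetic code and that reading starts at the first nucleotide; `CODON_TABLE` and `translate` are illustrative names.

```python
from itertools import product

# Standard genetic code (assumed); "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Translate an mRNA from its first base until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends translation
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = "AUGGAUAGUACUUGGCUUCUGAAAAAUCAAUAA"  # hypothetical example
print(translate(mrna))  # MDSTWLLKNQ
```

In one-letter notation, MDSTWLLKNQ stands for methionine, aspartic acid, serine, threonine, tryptophan, leucine, leucine, lysine, asparagine and glutamine.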

Properties of the genetic code

The properties of the genetic code are largely a consequence of the way amino acids are encoded.

The first and obvious property is triplicity.

It refers to the fact that the unit of code is a sequence of three nucleotides.

An important property of the genetic code is that it is non-overlapping. A nucleotide included in one triplet cannot be included in another.

That is, the sequence AGUGAA can only be read as AGU-GAA, and not, for example, as AGU-GUG-GAA. In other words, if the pair GU is part of one triplet, it cannot also be part of another.
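The difference between the two ways of reading can be shown with the AGUGAA example. In this sketch (function names are invented for the illustration), the first function reads the sequence the way the genetic code does, and the second shows what an overlapping reading would look like.

```python
def read_non_overlapping(rna):
    """Read consecutive triplets; every nucleotide belongs to one triplet."""
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]


def read_overlapping(rna):
    """A reading NOT used by the genetic code: each window shifts by one base."""
    return [rna[i:i + 3] for i in range(len(rna) - 2)]


sequence = "AGUGAA"
print(read_non_overlapping(sequence))  # ['AGU', 'GAA']
print(read_overlapping(sequence))      # ['AGU', 'GUG', 'UGA', 'GAA']
```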

The unambiguity of the genetic code means that each triplet corresponds to only one amino acid. For example, the AGU triplet codes for the amino acid serine and nothing else.

On the other hand, several triplets can correspond to one amino acid. For example, the same serine, in addition to AGU, also corresponds, among others, to the AGC codon. This property is called the degeneracy of the genetic code.

Degeneracy allows many mutations to remain harmless, since replacing one nucleotide in DNA often does not change the meaning of the triplet. If you look closely at the table of amino acid correspondence to triplets, you can see that when an amino acid is encoded by several triplets, they often differ only in the last nucleotide, i.e. it can be any of the four.
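The remark about the last nucleotide can be checked against the table. The sketch below assumes the standard genetic code and looks for pairs of first and second nucleotides for which all four choices of the third nucleotide give the same amino acid; `CODON_TABLE` and `fourfold_families` are names invented for the example.

```python
from itertools import product

# Standard genetic code (assumed); "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def fourfold_families():
    """Map each two-base prefix whose third base is irrelevant to its amino acid."""
    families = {}
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1:  # the third nucleotide can be anything
            families[first + second] = encoded.pop()
    return families


for prefix, amino_acid in fourfold_families().items():
    print(prefix + "N", amino_acid)  # N stands for any of U, C, A, G
```

Eight of the sixteen two-base prefixes turn out to be of this kind, for example GG (glycine) and UC (serine).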

Some other properties of the genetic code are also noted (continuity, noise immunity, universality, etc.).

Resilience as the adaptation of plants to living conditions. Basic reactions of plants to the action of unfavorable factors.

Plant resistance is the ability to withstand the effects of extreme environmental factors (soil and air drought).


This property was developed during the process of evolution and was genetically fixed. In areas with unfavorable conditions, stable decorative forms and local varieties of drought-resistant cultivated plants have formed. A particular level of resistance inherent in plants is revealed only under the influence of extreme environmental factors.

When such a factor begins to act, the irritation phase begins: a sharp deviation of a number of physiological parameters from the norm and their rapid return to normal. Then the metabolic rate changes and intracellular structures are damaged. At the same time, all synthetic processes are suppressed, all hydrolytic processes are activated, and the overall energy supply of the organism decreases. If the effect of the factor does not exceed the threshold value, the adaptation phase begins.

An adapted plant reacts less to repeated or increasing exposure to an extreme factor. At the organismal level, interaction between organs is added to the adaptation mechanisms. The weakened flow of water, minerals and organic compounds intensifies competition between organs, and their growth stops.

The biological resistance of plants is defined by the maximum value of the extreme factor at which plants still form viable seeds. Agronomic resistance is determined by the degree of yield reduction. Plants are characterized by their resistance to a specific type of extreme factor: winter-hardy, gas-resistant, salt-resistant, drought-resistant.

Roundworms, unlike flatworms, have a primary body cavity, the schizocoel, formed by the breakdown of the parenchyma that fills the gaps between the body wall and the internal organs. Its function is transport.

It maintains homeostasis. The body is round in cross-section. The integument is covered with a cuticle. The muscles are represented by a layer of longitudinal muscles. The intestine is a through gut and consists of 3 sections: anterior, middle and posterior. The mouth opening is located on the ventral surface of the anterior end of the body. The pharynx has a characteristic triangular lumen. The excretory system is represented by protonephridia or by special skin glands, the hypodermal glands. Most species are dioecious and reproduce only sexually.

Development is direct, less often with metamorphosis. They have a constant cellular composition of the body and lack the ability to regenerate. The anterior section of the intestine consists of the oral cavity, pharynx and esophagus.

They do not have a middle or posterior section. The excretory system consists of 1-2 giant cells of the hypodermis. Longitudinal excretory canals lie in the lateral ridges of the hypodermis.

Properties of the genetic code. Evidence for triplet code. Decoding codons. Stop codons. The concept of genetic suppression.

The idea that a gene encodes information about the primary structure of a protein was made concrete by F. Crick in his sequence hypothesis, according to which the sequence of gene elements determines the sequence of amino acid residues in the polypeptide chain. The validity of the sequence hypothesis is proven by the colinearity of the structures of the gene and the polypeptide it encodes. The most significant development in 1953 was the idea that the code is most likely triplet.

The DNA base pairs A-T, T-A, G-C and C-G can encode only 4 amino acids if each pair corresponds to one amino acid. As you know, proteins contain 20 basic amino acids. If we assume that each amino acid is encoded by 2 base pairs, then 16 amino acids (4*4) can be encoded, which is again not enough.

If the code is triplet, then 4 base pairs can make up 64 codons (4*4*4), which is more than enough to encode 20 amino acids. Crick and his colleagues assumed that the code was triplet; that there were no “commas”, i.e. separating marks, between the codons; and that the code within a gene is read from a fixed point in one direction. In the summer of 1961, Nirenberg and Matthaei reported the decoding of the first codon and proposed a method for establishing the composition of codons in a cell-free protein synthesis system.
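The counting argument above can be checked with a short Python sketch. It is only an illustration: the function name `codeword_count` is made up for this example, and the four-letter alphabet is written with RNA bases.

```python
from itertools import product

BASES = "UCAG"  # the four bases, written here as in mRNA


def codeword_count(length):
    """Number of distinct code words of the given length over four bases."""
    return len(list(product(BASES, repeat=length)))


for length in (1, 2, 3):
    n = codeword_count(length)
    verdict = "enough" if n >= 20 else "not enough"
    print(f"word length {length}: {n} code words, {verdict} for 20 amino acids")
```

Only a word length of three gives at least 20 code words, which is the reasoning behind the triplet assumption.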

Thus, the codon for phenylalanine was deciphered as UUU in mRNA. Subsequently, by applying methods developed by Khorana, Nirenberg and Leder, the code dictionary was compiled in its modern form in 1965. The occurrence of mutations in T4 phages caused by the loss or addition of bases was evidence of the triplet nature of the code (property 1). These deletions and additions, which shift the frame when the code is “read”, were compensated only when the correct reading frame was restored, which prevented the mutant phenotype from appearing. These experiments also showed that triplets do not overlap, that is, each base can belong to only one triplet (property 2).
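The frame-shift logic of these experiments can be illustrated with a toy sequence. The sketch below is an assumption-laden illustration, not the phage data: the repeating sequence `CATCAT...` and the helper `read_triplets` are hypothetical and chosen only because the shift is easy to see.

```python
def read_triplets(seq):
    """Read consecutive, non-overlapping triplets from a fixed start (position 0)."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete triplet at the end
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "CATCATCATCATCAT"  # hypothetical toy message

# Adding one base shifts the frame: every triplet after the insertion changes.
inserted = original[:4] + "G" + original[4:]

# Removing one base further downstream restores the frame after that point.
restored = inserted[:9] + inserted[10:]

print(read_triplets(original))  # ['CAT', 'CAT', 'CAT', 'CAT', 'CAT']
print(read_triplets(inserted))  # ['CAT', 'CGA', 'TCA', 'TCA', 'TCA']
print(read_triplets(restored))  # ['CAT', 'CGA', 'TCA', 'CAT', 'CAT']
```

Between the insertion and the deletion the message stays garbled, but downstream of the second change the original triplets reappear, which is what one expects from a comma-free code read from a fixed point.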

Most amino acids have several codons. A code in which the number of amino acids is smaller than the number of codons is called degenerate (property 3), i.e. a given amino acid can be encoded by more than one triplet. In addition, three codons do not code for any amino acid at all (“nonsense codons”) and act as a “stop signal.” A stop codon is the end point of a functional unit of DNA, the cistron. Stop codons are the same in all species and are represented as UAA, UAG, UGA. A notable feature of the code is that it is universal (property 4).

In all living organisms, the same triplets code for the same amino acids.

The existence of three types of mutant terminator codons and their suppression has been demonstrated in E. coli and yeast. The discovery of suppressor genes that “make sense of” nonsense alleles of different genes indicates that the translation of the genetic code can change.

Mutations affecting the anticodon of tRNAs change their codon specificity and create the possibility of suppressing mutations at the translation level. Suppression at the translational level can also occur due to mutations in the genes encoding certain ribosomal proteins. As a result of these mutations, the ribosome “makes mistakes,” for example in reading nonsense codons, and “interprets” them using some non-mutant tRNAs. Along with genotypic suppression acting at the translation level, phenotypic suppression of nonsense alleles is also possible, for example when the temperature is lowered or when cells are exposed to aminoglycoside antibiotics that bind to ribosomes, such as streptomycin.

22. Reproduction of higher plants: vegetative and asexual. Sporulation, spore structure, homospory and heterospory. Reproduction as a property of living matter, i.e. the ability of an individual to give rise to its own kind, already existed in the early stages of evolution.

Forms of reproduction can be divided into 2 types: asexual and sexual. Asexual reproduction proper is carried out without the participation of germ cells, by means of specialized cells, the spores. They are formed in the organs of asexual reproduction, the sporangia, as a result of mitotic division.

During its germination, the spore reproduces a new individual, similar to the mother, with the exception of spores of seed plants, in which the spore has lost the function of reproduction and dispersal. Spores can also be formed by reduction division, with single-celled spores spilling out.

Reproduction of plants by vegetative organs (part of a shoot, leaf or root), or by the division of unicellular algae in half, is called vegetative reproduction (for example, by bulbs or cuttings).

Sexual reproduction is carried out by special sex cells - gametes.

Gametes are formed as a result of meiosis; they are either female or male. As a result of their fusion, a zygote appears, from which a new organism subsequently develops.

Plants differ in the types of gametes. In some unicellular organisms, the whole organism functions as a gamete at a certain time. Organisms of different sexes, acting as gametes, merge; this sexual process is called hologamy. If male and female gametes are morphologically similar and mobile, these are isogametes, and the sexual process is isogamy. If female gametes are somewhat larger and less mobile than male ones, then these are heterogametes, and the process is heterogamy. In oogamy, female gametes are very large and immobile, while male gametes are small and mobile.

Genetic code - correspondence between DNA triplets and protein amino acids

The need to encode the structure of proteins in the linear sequence of nucleotides of mRNA and DNA is dictated by the fact that during translation:

  • there is no correspondence between the number of monomers in the mRNA template and in the product, the synthesized protein;
  • there is no structural similarity between RNA and protein monomers.

This rules out complementary interaction between the template and the product, the principle by which new DNA and RNA molecules are built during replication and transcription.

From this it becomes clear that there must be a “dictionary” that allows one to find out which sequence of mRNA nucleotides ensures the inclusion of amino acids in a protein in a given sequence. This “dictionary” is called the genetic, biological, nucleotide, or amino acid code. It allows you to encrypt the amino acids that make up proteins using a specific sequence of nucleotides in DNA and mRNA. It is characterized by certain properties.

Triplet nature. One of the main questions in determining the properties of the code was the number of nucleotides that determines the inclusion of one amino acid in the protein.

It was found that the coding elements in the encryption of an amino acid sequence are indeed triples of nucleotides, or triplets, which were named "codons".

The meaning of codons.

It was established that, of the 64 codons, 61 triplets encode the inclusion of amino acids in the synthesized polypeptide chain, while the remaining 3 (UAA, UAG, UGA) do not encode the inclusion of amino acids in the protein and were originally called meaningless, or nonsense, codons. However, it was later shown that these triplets signal the completion of translation, and therefore they came to be called termination or stop codons.

The codons of mRNA and triplets of nucleotides in the coding strand of DNA with the direction from the 5′ to the 3′ end have the same sequence of nitrogenous bases, except that in DNA instead of uracil (U), characteristic of mRNA, there is thymine (T).
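As a minimal sketch of this relationship, the mRNA sequence can be obtained from the coding strand (written 5′ to 3′) simply by replacing thymine with uracil. The function name and the example sequence below are hypothetical.

```python
def coding_strand_to_mrna(dna):
    """Return the mRNA whose bases match the DNA coding strand, with U instead of T."""
    return dna.upper().replace("T", "U")


coding_strand = "ATGTTTGCCTAA"  # hypothetical coding strand, 5' -> 3'
print(coding_strand_to_mrna(coding_strand))  # AUGUUUGCCUAA
```

Note that this works only for the coding strand; the template strand would first have to be reverse-complemented.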

Specificity.

Each codon corresponds to only one specific amino acid. In this sense, the genetic code is strictly unambiguous.

Table 4-3.

Main components of the protein synthesizing system

No.  Required components                                     Functions
1    Amino acids                                             Substrates for protein synthesis
2    tRNA                                                    tRNAs act as adapters. Their acceptor end interacts with amino acids, and their anticodon interacts with the codon of the mRNA
3    Aminoacyl-tRNA synthetase                               Each aa-tRNA synthetase catalyzes the specific binding of one of 20 amino acids to the corresponding tRNA
4    mRNA                                                    The template contains a linear sequence of codons that determine the primary structure of proteins
5    Ribosomes                                               Ribonucleoprotein subcellular structures that are the site of protein synthesis
6    Energy sources
7    Protein factors of initiation, elongation, termination  Specific extraribosomal proteins required for the translation process (12 initiation factors: eIF; 2 elongation factors: eEF1, eEF2; and termination factors: eRF)
8    Magnesium ions                                          Cofactor that stabilizes ribosome structure

Notes: eIF (eukaryotic initiation factors) - initiation factors; eEF (eukaryotic elongation factors) - elongation factors; eRF (eukaryotic releasing factors) - termination factors.

Degeneracy. There are 61 triplets in mRNA and DNA, each of which encodes the inclusion of one of 20 amino acids in the protein.

It follows from this that in information molecules the inclusion of the same amino acid in a protein is determined by several codons. This property of the biological code is called degeneracy.

In humans, only 2 amino acids are encoded by a single codon, Met and Trp, while Leu, Ser and Arg are each encoded by six codons, and Ala, Val, Gly, Pro and Thr by four codons (Table 4-4).

Redundancy of coding sequences is the most valuable property of the code, since it increases the resistance of the information flow to adverse effects of the external and internal environment. When determining the nature of the amino acid to be included in a protein, the third nucleotide in a codon is not as important as the first two. As can be seen from Table 4-4, for many amino acids, replacing a nucleotide in the third position of a codon does not affect its meaning.
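The degeneracy figures quoted above can be recomputed from the standard genetic code. The sketch below builds the codon table from a compact 64-letter string of one-letter amino acid symbols (bases in the order U, C, A, G; `*` marks a stop codon); the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for the 64 codons, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons encode each amino acid (stop codons excluded).
counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
print(sorted(aa for aa, n in counts.items() if n == 1))  # ['M', 'W']  (Met, Trp)
print(sorted(aa for aa, n in counts.items() if n == 6))  # ['L', 'R', 'S']  (Leu, Arg, Ser)

# First-two-base combinations where the third base does not change the meaning.
fourfold = [
    first + second
    for first in BASES
    for second in BASES
    if len({CODON_TABLE[first + second + third] for third in BASES}) == 1
]
print(fourfold)  # ['UC', 'CU', 'CC', 'CG', 'AC', 'GU', 'GC', 'GG']
```

In eight of the sixteen first-two-base combinations the third position is fully interchangeable, which illustrates why many third-position substitutions leave the encoded amino acid unchanged.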

Linearity of information recording.

During translation, mRNA codons are “read” from a fixed starting point sequentially and do not overlap. The information record does not contain signals indicating the end of one codon and the beginning of the next. The AUG codon is the initiation codon and is read both at the beginning and in other parts of the mRNA as Met. The triplets following it are read sequentially without any gaps until the stop codon, at which the synthesis of the polypeptide chain is completed.
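A simplified model of this reading process is sketched below. It assumes that translation starts at the first AUG found in the string, which ignores the real initiation machinery, and the function name `translate` and the example mRNA are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Read non-overlapping codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")  # simplifying assumption: the first AUG is the start
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: synthesis is completed
            break
        protein.append(amino_acid)
    return "".join(protein)


# The second AUG lies inside the reading frame and is read as an ordinary Met (M).
print(translate("GGAUGUUUAUGGCCUAAGG"))  # MFMA
```

The output uses one-letter symbols: M for Met, F for Phe, A for Ala.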

Universality.

Until recently, it was believed that the code was absolutely universal, i.e. the meaning of code words is the same for all studied organisms: viruses, bacteria, plants, amphibians, mammals, including humans.

However, one exception later became known: it turned out that mitochondrial mRNA contains 4 triplets that have a different meaning than in mRNA of nuclear origin. Thus, in mitochondrial mRNA, the triplet UGA encodes Trp, AUA encodes Met, and AGA and AGG are read as additional stop codons.
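One way to picture such an exception is as a small set of overrides on top of the standard table. The sketch below assumes the vertebrate mitochondrial code, in which UGA is read as Trp, AUA as Met, and AGA and AGG as stop codons; the dictionary names are made up for this example.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed vertebrate mitochondrial reassignments ("*" marks a stop codon).
MITO_OVERRIDES = {"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}
MITO_TABLE = {**STANDARD_TABLE, **MITO_OVERRIDES}

changed = sorted(c for c in STANDARD_TABLE if STANDARD_TABLE[c] != MITO_TABLE[c])
for codon in changed:
    print(codon, STANDARD_TABLE[codon], "->", MITO_TABLE[codon])
```

The remaining 60 codons keep their standard meaning, so the code is better described as nearly universal.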

Colinearity of gene and product.

In prokaryotes, a linear correspondence between the codon sequence of a gene and the amino acid sequence in the protein product has been found, or, as they say, there is colinearity between the gene and the product.

Table 4-4.

Genetic code

First base   Second base
             U          C          A          G
U            UUU Phe    UCU Ser    UAU Tyr    UGU Cys
             UUC Phe    UCC Ser    UAC Tyr    UGC Cys
             UUA Leu    UCA Ser    UAA*       UGA*
             UUG Leu    UCG Ser    UAG*       UGG Trp
C            CUU Leu    CCU Pro    CAU His    CGU Arg
             CUC Leu    CCC Pro    CAC His    CGC Arg
             CUA Leu    CCA Pro    CAA Gln    CGA Arg
             CUG Leu    CCG Pro    CAG Gln    CGG Arg
A            AUU Ile    ACU Thr    AAU Asn    AGU Ser
             AUC Ile    ACC Thr    AAC Asn    AGC Ser
             AUA Ile    ACA Thr    AAA Lys    AGA Arg
             AUG Met    ACG Thr    AAG Lys    AGG Arg
G            GUU Val    GCU Ala    GAU Asp    GGU Gly
             GUC Val    GCC Ala    GAC Asp    GGC Gly
             GUA Val    GCA Ala    GAA Glu    GGA Gly
             GUG Val    GCG Ala    GAG Glu    GGG Gly

Notes: U - uracil; C - cytosine; A - adenine; G - guanine; *—termination codon.

In eukaryotes, the base sequences in a gene that are colinear with the amino acid sequence in the protein are interrupted by introns.

Therefore, in eukaryotic cells, the amino acid sequence of a protein is colinear with the sequence of exons in a gene or mature mRNA after post-transcriptional removal of introns.
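The idea that colinearity holds for the joined exons rather than for the whole gene can be sketched as follows. The primary transcript, the exon coordinates and the helper `splice` are all hypothetical and serve only as an illustration.

```python
def splice(pre_mrna, exons):
    """Join exon segments, given as half-open (start, end) index pairs, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical primary transcript: exon 1 + intron + exon 2.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "UUUUAA"
pre_mrna = exon_1 + intron + exon_2
exons = [(0, 6), (18, 24)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUUAA
```

Read in triplets, the mature mRNA gives AUG GCC UUU UAA (Met, Ala, Phe, stop), whereas the unspliced transcript would not be colinear with that product.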