Nucleic Acid


Nucleic acid, naturally occurring chemical macromolecule that is capable of being broken down to yield phosphoric acid, sugars, and a mixture of organic bases (purines and pyrimidines). Nucleic acids are the main information-carrying molecules of the cell, and, by directing the process of protein synthesis, they determine the inherited characteristics of every living thing. 


A cell’s hereditary material is composed of nucleic acids, which enable living organisms to pass on genetic information from one generation to the next. There are two types of nucleic acids: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA and RNA differ only slightly in their chemical composition, yet play entirely different biological roles.

Chemically, nucleic acids are polynucleotides—chains of nucleotides. A nucleotide is composed of three components: a pentose sugar, a nitrogen base, and a phosphate group. The sugar and the base together form a nucleoside. Hence, a nucleotide is sometimes referred to as a nucleoside monophosphate. Each of the three components of a nucleotide plays a key role in the overall assembly of nucleic acids.

As the name suggests, a pentose sugar has five carbon atoms, which are labeled 1′, 2′, 3′, 4′, and 5′. The pentose sugar in RNA is ribose, meaning the 2′ carbon carries a hydroxyl group. The sugar in DNA is deoxyribose, meaning the 2′ carbon is attached to a hydrogen atom. The sugar is attached to the nitrogen base at the 1′ carbon and the phosphate molecule at the 5′ carbon.

The phosphate molecule attached to the 5′ carbon of one nucleotide can form a covalent bond with the 3′ hydroxyl group of another nucleotide, linking the two nucleotides together. This covalent bond is called a phosphodiester bond. The phosphodiester bond between nucleotides creates an alternating sugar and phosphate backbone in a polynucleotide chain. Linking the 5′ end of one nucleotide to the 3′ end of another imparts directionality to the polynucleotide chain, which plays a key role in DNA replication and RNA synthesis. At one end of the polynucleotide chain, called the 3′ end, the sugar has a free 3′ hydroxyl group. At the other end, the 5′ end, the sugar has a free 5′ phosphate group.

The nitrogen bases are molecules containing one or two rings made up of carbon and nitrogen atoms. These molecules are called “bases” because they are chemically basic and can bind to hydrogen ions. There are two classes of nitrogen bases: pyrimidines and purines. The pyrimidines have a six-membered ring structure, whereas the purines consist of a six-membered ring fused to a five-membered ring. The pyrimidines include cytosine (C), thymine (T) and uracil (U). The purines include adenine (A) and guanine (G).

Cytosine, adenine, and guanine are present in both DNA and RNA. However, thymine is specific to DNA, and uracil is found only in RNA. The purines and pyrimidines can form hydrogen bonds with each other in a particular pattern, based on the presence of complementary chemical groups that are analogous to pieces of a jigsaw puzzle. Under normal cellular conditions, adenine forms hydrogen bonds with thymine (in DNA) or uracil (in RNA), whereas guanine forms hydrogen bonds with cytosine. This complementary base pairing is critical to DNA structure and function.

DNA adopts a double helical structure within the cell. A double helix is composed of two polynucleotide chains, called strands, that wind around each other in a helical (i.e., spiral) manner. The two strands are in opposite orientations, or are “antiparallel” to each other, meaning the 5′ end of one strand is close to the 3′ end of the other. The two strands are held together through complementary base pairing (e.g., cytosine with guanine).
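The combination of complementary pairing and antiparallel orientation can be illustrated with a short Python sketch. This is only an illustration under simple assumptions: the function name `reverse_complement` and the dictionary `DNA_PAIRS` are hypothetical, and only the four standard DNA bases are handled.

```python
# Watson-Crick partners in DNA, as described in the text: A with T, G with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read 5'->3', for a strand given 5'->3'."""
    strand = strand.upper()
    unknown = set(strand) - set(DNA_PAIRS)
    if unknown:
        raise ValueError(f"not a standard DNA base: {sorted(unknown)}")
    # Pair every base, then reverse the order because the strands are antiparallel.
    return "".join(DNA_PAIRS[base] for base in reversed(strand))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # prints GCAT
```

Applying the function twice returns the original strand, which reflects the fact that each strand of a double helix fully determines the other.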

In a DNA double helix, the sugar-phosphate backbone is present on the outside, whereas the hydrogen-bonded bases are on the inside. RNA mostly occurs as a single-stranded molecule. The single RNA strand can form localized secondary structures through intra-strand complementary base pairing. Different types of RNA secondary structures have distinct functions within the cell.

Molecular composition

Nucleic acids are generally very large molecules. Indeed, DNA molecules are probably the largest individual molecules known. Well-studied biological nucleic acid molecules range in size from 21 nucleotides (small interfering RNA) to large chromosomes (human chromosome 1 is a single molecule that contains 247 million base pairs).

In most cases, naturally occurring DNA molecules are double-stranded and RNA molecules are single-stranded. There are numerous exceptions, however: some viruses have genomes made of double-stranded RNA, other viruses have single-stranded DNA genomes, and, in some circumstances, nucleic acid structures with three or four strands can form.

Nucleic acids are linear polymers (chains) of nucleotides. Each nucleotide consists of three components: a purine or pyrimidine nucleobase (sometimes termed nitrogenous base or simply base), a pentose sugar, and a phosphate group, which makes the molecule acidic. The substructure consisting of a nucleobase plus sugar is termed a nucleoside. Nucleic acid types differ in the structure of the sugar in their nucleotides: DNA contains 2′-deoxyribose while RNA contains ribose (where the only difference is the presence of a hydroxyl group). Also, the nucleobases found in the two nucleic acid types are different: adenine, cytosine, and guanine are found in both RNA and DNA, while thymine occurs in DNA and uracil occurs in RNA.

The sugars and phosphates in nucleic acids are connected to each other in an alternating chain (sugar-phosphate backbone) through phosphodiester linkages. In conventional nomenclature, the carbons to which the phosphate groups attach are the 3′-end and the 5′-end carbons of the sugar. This gives nucleic acids directionality, and the ends of nucleic acid molecules are referred to as 5′-end and 3′-end. The nucleobases are joined to the sugars via an N-glycosidic linkage involving a nucleobase ring nitrogen (N-1 for pyrimidines and N-9 for purines) and the 1′ carbon of the pentose sugar ring.

Non-standard nucleosides are also found in both RNA and DNA and usually arise from modification of the standard nucleosides within the DNA molecule or the primary (initial) RNA transcript. Transfer RNA (tRNA) molecules contain a particularly large number of modified nucleosides.

Double-stranded nucleic acids are made up of complementary sequences, in which extensive Watson-Crick base pairing results in a highly repeated and quite uniform double-helical three-dimensional structure. In contrast, single-stranded RNA and DNA molecules are not constrained to a regular double helix, and can adopt highly complex three-dimensional structures that are based on short stretches of intramolecular base-paired sequences including both Watson-Crick and noncanonical base pairs, and a wide range of complex tertiary interactions.

Nucleic acid molecules are usually unbranched and may occur as linear and circular molecules. For example, bacterial chromosomes, plasmids, mitochondrial DNA, and chloroplast DNA are usually circular double-stranded DNA molecules, while chromosomes of the eukaryotic nucleus are usually linear double-stranded DNA molecules. Most RNA molecules are linear, single-stranded molecules, but both circular and branched molecules can result from RNA splicing reactions. The total amount of pyrimidines in a double-stranded DNA molecule is equal to the total amount of purines. The diameter of the helix is about 20 Å.
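The equality of purines and pyrimidines in double-stranded DNA follows directly from base pairing, since every purine (A or G) on one strand pairs with a pyrimidine (T or C) on the other. A minimal Python sketch of that bookkeeping is given below; the function name `duplex_base_counts` is hypothetical and the example assumes a perfectly paired duplex of standard bases.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_base_counts(strand: str) -> tuple[int, int]:
    """Count (purines, pyrimidines) over both strands of a paired DNA duplex."""
    strand = strand.upper()
    unknown = set(strand) - set(PARTNER)
    if unknown:
        raise ValueError(f"not a standard DNA base: {sorted(unknown)}")
    # The order of the partner strand does not matter for counting bases.
    partner = "".join(PARTNER[base] for base in strand)
    counts = Counter(strand + partner)
    purines = sum(counts[base] for base in PURINES)
    pyrimidines = sum(counts[base] for base in PYRIMIDINES)
    return purines, pyrimidines


if __name__ == "__main__":
    print(duplex_base_counts("AAGGC"))  # prints (5, 5)
```

The single strand in the example is purine-rich (four purines, one pyrimidine), yet the duplex as a whole is balanced because the partner strand carries the mirror-image composition.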

One DNA or RNA molecule differs from another primarily in the sequence of nucleotides. Nucleotide sequences are of great importance in biology since they carry the ultimate instructions that encode all biological molecules, molecular assemblies, subcellular and cellular structures, organs, and organisms, and directly enable cognition, memory, and behavior. Enormous efforts have gone into the development of experimental methods to determine the nucleotide sequence of biological DNA and RNA molecules, and today hundreds of millions of nucleotides are sequenced daily at genome centers and smaller laboratories worldwide. The GenBank nucleic acid sequence database is maintained by the National Center for Biotechnology Information.

Types of nucleic acid


Ribonucleic acid, or RNA, is one of the three major biological macromolecules that are essential for all known forms of life (along with DNA and proteins). A central tenet of molecular biology states that the flow of genetic information in a cell is from DNA through RNA to proteins: “DNA makes RNA makes protein”. Proteins are the workhorses of the cell; they play leading roles in the cell as enzymes, as structural components, and in cell signaling, to name just a few. DNA (deoxyribonucleic acid) is considered the “blueprint” of the cell; it carries all of the genetic information required for the cell to grow, to take in nutrients, and to propagate. RNA, in this role, is the “DNA photocopy” of the cell. When the cell needs to produce a certain protein, it activates the protein’s gene (the portion of DNA that codes for that protein) and produces multiple copies of that piece of DNA in the form of messenger RNA, or mRNA. The multiple copies of mRNA are then used to translate the genetic code into protein through the action of the cell’s protein manufacturing machinery, the ribosomes. Thus, RNA expands the quantity of a given protein that can be made at one time from one given gene, and it provides an important control point for regulating when and how much protein gets made.

For many years RNA was believed to have only three major roles in the cell–as a DNA photocopy (mRNA), as a coupler between the genetic code and the protein building blocks (tRNA), and as a structural component of ribosomes (rRNA). In recent years, however, we have begun to realize that the roles adopted by RNA are much broader and much more interesting. We now know that RNA can also act as enzymes (called ribozymes) to speed chemical reactions. In a number of clinically important viruses RNA, rather than DNA, carries the viral genetic information. RNA also plays an important role in regulating cellular processes–from cell division, differentiation and growth to cell aging and death. Defects in certain RNAs or the regulation of RNAs have been implicated in a number of important human diseases, including heart disease, some cancers, stroke and many others.

The ribose sugar of RNA is a cyclical structure consisting of five carbons and one oxygen. The presence of a chemically reactive hydroxyl (−OH) group attached to the second carbon of the ribose sugar molecule makes RNA prone to hydrolysis. This chemical lability of RNA, compared with DNA, which does not have a reactive −OH group in the same position on the sugar moiety (deoxyribose), is thought to be one reason why DNA evolved to be the preferred carrier of genetic information in most organisms. The structure of an RNA molecule (a yeast alanine tRNA) was first described by R.W. Holley in 1965.

RNA typically is a single-stranded biopolymer. However, the presence of self-complementary sequences in the RNA strand leads to intrachain base-pairing and folding of the ribonucleotide chain into complex structural forms consisting of bulges and helices. The three-dimensional structure of RNA is critical to its stability and function, allowing the ribose sugar and the nitrogenous bases to be modified in numerous different ways by cellular enzymes that attach chemical groups (e.g., methyl groups) to the chain. Such modifications enable the formation of chemical bonds between distant regions in the RNA strand, leading to complex contortions in the RNA chain, which further stabilize the RNA structure. Molecules with weak structural modifications and stabilization may be readily destroyed. As an example, an initiator transfer RNA (tRNAiMet) molecule that lacks a methyl group modification at position 58 of the tRNA chain is unstable and hence nonfunctional; the nonfunctional chain is destroyed by cellular tRNA quality control mechanisms.

RNAs can also form complexes with proteins; these complexes are known as ribonucleoproteins (RNPs). The RNA portion of at least one cellular RNP has been shown to act as a biological catalyst, a function previously ascribed only to proteins.

Of the many types of RNA, the three most well-known and most commonly studied are messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA), which are present in all organisms. These and other types of RNAs primarily carry out biochemical reactions, similar to enzymes. Some, however, also have complex regulatory functions in cells. Owing to their involvement in many regulatory processes, to their abundance, and to their diverse functions, RNAs play important roles in both normal cellular processes and diseases.

In protein synthesis, mRNA carries genetic codes from the DNA in the nucleus to ribosomes, the sites of protein translation in the cytoplasm. Ribosomes are composed of rRNA and protein. The ribosomal subunits are assembled from rRNA and ribosomal proteins in the nucleolus. Once fully assembled, they move to the cytoplasm, where, as key regulators of translation, they “read” the code carried by mRNA. A sequence of three nitrogenous bases in mRNA specifies incorporation of a specific amino acid in the sequence that makes up the protein. Molecules of tRNA (sometimes also called soluble, or activator, RNA), which contain fewer than 100 nucleotides, bring the specified amino acids to the ribosomes, where they are linked to form proteins.
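The triplet reading of mRNA can be sketched in a few lines of Python. The example below is only a sketch: the function name `translate` is hypothetical, the table holds a small subset of the 64 codons of the standard genetic code, and the reading frame is assumed to begin at the first base of the input.

```python
# A small subset of the standard genetic code; the full table has 64 codons.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "AAA": "Lys", "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna: str) -> list[str]:
    """Read an mRNA three bases at a time and return the encoded amino acids."""
    mrna = mrna.upper()
    peptide = []
    # Step through complete triplets only; trailing bases are ignored.
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this partial table")
        residue = CODON_TABLE[codon]
        if residue == "Stop":
            break  # a stop codon ends the protein
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    print(translate("AUGUUUGGCUAA"))  # prints ['Met', 'Phe', 'Gly']
```

In the cell this lookup is carried out physically by tRNA molecules at the ribosome rather than by a table, but the mapping from triplet to amino acid is the same.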

In addition to mRNA, tRNA, and rRNA, RNAs can be broadly divided into coding (cRNA) and noncoding RNA (ncRNA). There are two types of ncRNAs, housekeeping ncRNAs (tRNA and rRNA) and regulatory ncRNAs, which are further classified according to their size. Long ncRNAs (lncRNA) have at least 200 nucleotides, while small ncRNAs have fewer than 200 nucleotides. Small ncRNAs are subdivided into micro RNA (miRNA), small nucleolar RNA (snoRNA), small nuclear RNA (snRNA), small-interfering RNA (siRNA), and PIWI-interacting RNA (piRNA).
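The size criterion for regulatory ncRNAs amounts to a single threshold test, sketched below in Python. The function name `classify_regulatory_ncrna` is hypothetical, and the sketch uses only the 200-nucleotide cutoff stated above; it does not attempt to distinguish the small ncRNA subtypes, which are defined by function and biogenesis rather than by length alone.

```python
LNCRNA_MIN_LENGTH = 200  # nucleotides, the cutoff given in the text


def classify_regulatory_ncrna(length_nt: int) -> str:
    """Classify a regulatory noncoding RNA by its length in nucleotides."""
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    if length_nt >= LNCRNA_MIN_LENGTH:
        return "long ncRNA (lncRNA)"
    return "small ncRNA"


if __name__ == "__main__":
    for length in (22, 199, 200, 5000):
        print(length, classify_regulatory_ncrna(length))
```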

The miRNAs are of particular importance. They are about 22 nucleotides long and function in gene regulation in most eukaryotes. They can inhibit (silence) gene expression by binding to target mRNA and inhibiting translation, thereby preventing functional proteins from being produced. Many miRNAs play significant roles in cancer and other diseases. For example, tumour suppressor and oncogenic (cancer-initiating) miRNAs can regulate unique target genes, leading to tumorigenesis and tumour progression.

Also of functional significance are the piRNAs, which are about 26 to 31 nucleotides long and exist in most animals. They regulate the expression of transposons (jumping genes) by keeping the genes from being transcribed in the germ cells (sperm and eggs). Most piRNAs are complementary to different transposons and can specifically target those transposons.

Circular RNA (circRNA) differs from other RNA types because its 5′ and 3′ ends are bonded together, creating a loop. The circRNAs are generated from many protein-encoding genes, and some can serve as templates for protein synthesis, similar to mRNA. They can also bind miRNA, acting as “sponges” that prevent miRNA molecules from binding to their targets. In addition, circRNAs play an important role in regulating the transcription and alternative splicing of the genes from which circRNAs were derived.

RNA in disease

Important connections have been discovered between RNA and human disease. For example, as described previously, some miRNAs are capable of regulating cancer-associated genes in ways that facilitate tumour development. In addition, the dysregulation of miRNA metabolism has been linked to various neurodegenerative diseases, including Alzheimer disease. In the case of other RNA types, tRNAs can bind to specialized proteins known as caspases, which are involved in apoptosis (programmed cell death). By binding to caspase proteins, tRNAs inhibit apoptosis; the ability of cells to escape programmed death signaling is a hallmark of cancer. Noncoding RNAs known as tRNA-derived fragments (tRFs) are also suspected to play a role in cancer. The emergence of techniques such as RNA sequencing has led to the identification of novel classes of tumour-specific RNA transcripts, such as MALAT1 (metastasis associated lung adenocarcinoma transcript 1), increased levels of which have been found in various cancerous tissues and are associated with the proliferation and metastasis (spread) of tumour cells.

A class of RNAs containing repeat sequences is known to sequester RNA-binding proteins (RBPs), resulting in the formation of foci or aggregates in neural tissues. These aggregates play a role in the development of neurological diseases such as amyotrophic lateral sclerosis (ALS) and myotonic dystrophy. The loss of function, dysregulation, and mutation of various RBPs have been implicated in a host of human diseases.

The discovery of additional links between RNA and disease is expected. Increased understanding of RNA and its functions, combined with the continued development of sequencing technologies and efforts to screen RNA and RBPs as therapeutic targets, is likely to facilitate such discoveries.

Ribosomal RNA

The most abundant class of RNA in cells is ribosomal RNA (rRNA). This class comprises those molecular species that form part of the structure of ribosomes, which are components of the protein-synthesizing machinery in the cell cytoplasm. The predominant RNA molecules are of size 16S and 23S in bacteria (the S value denotes the sedimentation velocity of the RNA upon ultracentrifugation in water), and 18S and 28S in most mammalian cells. All of the ribosomal RNA species are relatively rich in G and C residues.
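How rich a sequence is in G and C residues is usually expressed as its GC fraction, which can be computed as in the Python sketch below. The function name `gc_fraction` is hypothetical and the input sequence in the usage line is made up for illustration; it is not a real rRNA sequence.

```python
def gc_fraction(sequence: str) -> float:
    """Return the fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence)


if __name__ == "__main__":
    # Hypothetical six-base sequence with four G or C residues.
    print(round(gc_fraction("GGCCAU"), 3))
```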

Messenger RNA

Another prominent class of RNAs consists of the messenger RNA (mRNA) molecules and their synthetic precursors. Messenger RNAs are those species that code for proteins. They are transcribed from specific genes in the cell nucleus, and they carry the genetic information to the cytoplasm, where their sequences are translated to determine amino acid sequences during the process of protein synthesis. The messenger RNAs thus consist primarily of triplet codons. Most messenger RNAs are derived from longer precursor molecules that are the primary products of transcription and that are found in the nucleus. These precursors undergo several steps known as RNA processing, eventually resulting in production of cytoplasmic messenger molecules ready for translation.

Transfer RNA

The third major RNA class is transfer RNA (tRNA). Transfer RNAs are small RNA molecules that possess a relatively high proportion of modified and unusual bases, such as methylinosine or pseudouridine. Each transfer RNA molecule possesses an anticodon and an amino acid–binding site. The anticodon is a triplet complementary to the messenger RNA codon for a particular amino acid. A transfer RNA molecule bound to its particular amino acid is termed charged. The charged transfer RNAs participate in protein synthesis; through base pairing, they bind to each appropriate codon in a messenger RNA molecule and thus order the sequence of attached amino acids for polymerization.
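The relationship between a codon and its anticodon can be sketched in Python as follows. This is a simplified illustration: the function name `anticodon_for` is hypothetical, and strict Watson-Crick pairing is assumed at all three positions, which ignores the relaxed pairing that real tRNAs can show.

```python
# Watson-Crick partners in RNA: A with U, G with C.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with a 5'->3' codon."""
    codon = codon.upper()
    if len(codon) != 3 or set(codon) - set(RNA_PAIRS):
        raise ValueError("a codon must be exactly three RNA bases (A, C, G, U)")
    # Codon and anticodon pair in antiparallel orientation, so reverse as well.
    return "".join(RNA_PAIRS[base] for base in reversed(codon))


if __name__ == "__main__":
    print(anticodon_for("AUG"))  # prints CAU
```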

Conception and Forms of mRNA Vaccines

During the last two decades, there has been broad interest in RNA-based technologies for the development of prophylactic and therapeutic vaccines. Preclinical and clinical trials have shown that mRNA vaccines provide a safe and long-lasting immune response in animal models and humans. In this review, we summarize current research progress on mRNA vaccines, which can potentially be manufactured quickly and become powerful tools against infectious disease, and we highlight the bright future of their design and applications.

Vaccination is the most successful medical approach to disease prevention and control. The successful development and use of vaccines has saved thousands of lives and large amounts of money. In the future, vaccines have the potential to be used not only against infectious diseases but also against cancer, as a prophylactic and treatment tool, and for the elimination of allergens. Prior to the 1980s, vaccines were developed for protection against disease-causing microorganisms. Empirically, inactivated vaccines were produced by heat or chemical treatment, and live attenuated vaccines were generally developed in animals, cell lines or unfavorable growth conditions. During vaccine development, the mechanisms involved in conferring immunity were unknown. Nevertheless, the use of live attenuated or killed whole organism-based vaccines had enormous success in the control and eradication of a number of severe human infectious diseases, including smallpox, polio, measles, mumps, and rubella, and animal infectious diseases, such as classic swine fever, cattle plague, and equine infectious anemia. More recently, live attenuated (LAV), subunit and peptide-based vaccines have been developed thanks to advancements in molecular biology theory and technologies. The results obtained with LAV vaccination dramatically expanded our knowledge of the mechanisms related to the immune response elicited by these vaccines. For inactivated vaccines, antigen-specific antibodies largely contribute to the prevention and control of microbe-initiated infectious disease. In addition to specific humoral immune responses, LAVs elicit strong cellular immune responses, which are critical to eradicate many intracellular pathogens. Nevertheless, the failures that are sometimes caused by inactivated vaccines are ascribed to mutation of the surface antigens of pathogens. Additional concerns about LAV applications include the potential to cause disease in immunocompromised individuals and the possibility of reversion to a virulent form due to back-mutation, the acquisition of compensatory mutations, or recombination with circulating transmissible wild-type strains. Subunit and peptide vaccines, however, are less effective at eliciting a robust CD8+ immune response, which is important for intracellular pathogens, including viruses and some bacteria.

Vaccination with non-viral delivered nucleic acid-based vaccines mimics infection or immunization with live microorganisms and stimulates potent T follicular helper and germinal center B cell immune responses. Furthermore, manufacturing of non-viral delivered nucleic acid-based vaccines is safe and time-saving: it does not require growing highly pathogenic organisms at a large scale, and it carries less risk of contamination with live infectious reagents and of the release of dangerous pathogens. Notably, for most emerging and re-emerging devastating infectious diseases, the main obstacle is obtaining a stockpile in a short timeframe. Non-viral delivered nucleic acid-based vaccines can fill the gap between a disease epidemic and a desperately needed vaccine. Non-viral delivered nucleic acids are categorized as DNA or RNA according to their type of 5-carbon sugar. From administration to antigen expression, DNA vaccines and RNA vaccines are processed through different pathways. In the steps between immunization with a DNA template and expression of the target antigen, the DNA has to cross the cytoplasmic membrane and the nuclear membrane and be transcribed into mRNA, which then has to move back into the cytoplasm and initiate translation. Although promising, with demonstrated safety, tolerability and immunogenicity, DNA vaccines were characterized by suboptimal potency in early clinical trials. Enhanced delivery technologies, such as electroporation, have increased the efficacy of DNA vaccines in humans, but have not reduced the potential risk of integration of exogenous DNA into the host genome, which may cause severe mutagenesis and induce new diseases. Since naked in vitro transcribed mRNA was found to be expressed in vivo after direct injection into mouse muscle, mRNA has been investigated extensively as a preventive and therapeutic platform. Due to the dramatic development of RNA-based vaccine studies and applications, a plethora of mRNA vaccines have entered clinical trials. Comparatively, mRNA vaccines confer several advantages over viral vectored vaccines and DNA vaccines. The utilization of RNA as a therapeutic tool is not the focus of this manuscript and has been extensively reviewed elsewhere. In this review, we provide highlights on mRNA vaccines as promising tools in the prevention and control of infectious disease.

mRNA vaccines were reported to be effective for direct gene transfer for the first time by Wolff et al. Currently, two forms of mRNA vaccines have been developed: conventional mRNA vaccines and self-amplifying mRNA vaccines, which are derived from positive strand RNA viruses. Although mRNA vaccines were first tested in the early 1990s, these vaccines were not initially extensively utilized due to concerns about their instability, caused by omnipresent ribonucleases, and about small-scale production. An initial demonstration that mRNA stability can be improved by optimization and formulation was published by Ross and colleagues in 1995. Since that time, studies on mRNA vaccines have exploded, and mRNA can now be produced synthetically through a cell-free enzymatic transcription reaction. The in vitro transcription reaction includes, as essential components, a linearized plasmid DNA encoding the mRNA vaccine as a template, a recombinant RNA polymerase, and nucleoside triphosphates. A cap structure is added to the transcriptional product either enzymatically at the end of the reaction or as a synthetic cap analog in a single-step procedure. Finally, a poly(A) tail is added to form a mature mRNA sequence.

Conventional mRNA vaccines include, in their simplest form, an open reading frame (ORF) for the target antigen, flanked by untranslated regions (UTRs) and with a terminal poly(A) tail. After transfection, they drive transient antigen expression. In addition to conventional vaccines, there is another mRNA vaccine platform based on the genome of positive strand viruses, most commonly alphaviruses. These mRNA vaccines are based on an engineered viral genome that contains the genes encoding the RNA replication machinery, whereas the structural protein sequences are replaced with the gene of interest (GoI); the resulting genomes are referred to as replicons. These vaccines are named self-amplifying mRNA. They are capable of directing their self-replication through synthesis of the RNA-dependent RNA polymerase complex, generating multiple copies of the antigen-encoding mRNA, and they express high levels of the heterologous gene when they are introduced into the cytoplasm of host cells, in a way that mimics production of antigens in vivo by viral pathogens, triggering both humoral and cellular immune responses. Self-amplifying mRNA can be derived from the engineered genomes of Sindbis virus, Semliki Forest virus, and Kunjin virus, among others. Self-amplifying mRNAs (~9–11 kb) are generated from the DNA template with procedures similar to those previously described for conventional mRNAs, and RNA molecules can be produced at a large scale in vitro. After the purified RNA replicon is delivered into host cells, either as viral particles or as synthetically formulated RNA, it is translated extensively and amplified by the RNA-dependent RNA polymerase that it encodes. Compared with the rapid expression of conventional mRNAs, published results have shown that vaccination with self-amplifying mRNA vaccines results in higher antigen expression levels, although delayed in time, which persist for several days in vivo. Equivalent protection is conferred but at a much lower RNA dose. Due to the lack of viral structural proteins, the replicon does not produce infectious viral particles. Additionally, neither conventional mRNAs nor self-amplifying mRNAs can potentially integrate into the host genome, and both will be degraded naturally during the process of antigen expression. These characteristics indicate that mRNA vaccines have the potential to be much safer than other vaccines and are a promising vaccine platform.
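The linear layout of a conventional mRNA vaccine (5′ UTR, ORF, 3′ UTR, poly(A) tail) can be pictured with a small Python data structure. This is a toy sketch only: the class name `ConventionalMrna`, its field names, and all sequences and the tail length in the usage example are hypothetical placeholders, and the 5′ cap is omitted because it is a chemical modification rather than a stretch of sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConventionalMrna:
    """Parts of a conventional mRNA, listed in 5'->3' order."""
    five_prime_utr: str
    orf: str              # open reading frame encoding the target antigen
    three_prime_utr: str
    poly_a_length: int    # number of adenosines in the terminal tail

    def sequence(self) -> str:
        """Concatenate the parts into one 5'->3' RNA sequence."""
        if self.poly_a_length < 0:
            raise ValueError("poly(A) length cannot be negative")
        tail = "A" * self.poly_a_length
        return self.five_prime_utr + self.orf + self.three_prime_utr + tail


if __name__ == "__main__":
    # Placeholder sequences chosen only to keep the example short.
    toy = ConventionalMrna("GG", "AUGUAA", "CC", poly_a_length=4)
    print(toy.sequence())  # prints GGAUGUAACCAAAA
```

A self-amplifying construct would differ mainly in what sits between the UTRs: the replication machinery genes in addition to the gene of interest, which is why those molecules are so much longer.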

Stability and translation of mRNA are crucial for a successful RNA vaccine. In the process of translation, mRNA purity is critical in determining its stability and protein yield. Contamination with double-stranded RNAs (dsRNAs), derived from aberrant RNA polymerase activities, leads to the inhibition of translation and degradation of cellular mRNA and ribosomal RNA, thus decreasing protein expression by interrupting the translation machinery. The removal of dsRNA can increase translation dramatically. Excess components and short or double-stranded RNAs can be removed by purification. Initially, lithium chloride (LiCl) was used for this purpose, but it restricted the industrialization of mRNA vaccines and did not remove dsRNAs. Purification via fast protein liquid chromatography (FPLC) or high-performance liquid chromatography (HPLC) can be used to remove any remaining by-products and to produce mRNA at a large scale and for Good Manufacturing Practice (GMP) processes. The non-coding sequences flanking the 5′ and 3′ termini of the open reading frame (ORF) are crucial for translation. A 5′ untranslated region element, such as the Kozak sequence, or a 5′ cap is required for efficient protein production. The 3′ untranslated region, containing an optimal poly(A) signal, determines the stability of the mRNA and increases protein translation. Additionally, codon optimization is a popular method to avoid rare codons with low utilization and to increase protein production, mRNA abundance and stability.

mRNA vaccines are efficient at antigen expression, but the sequence and secondary structures formed by mRNAs are recognized by a number of innate immune receptors, and this recognition can inhibit protein translation. Thanks to advances in the understanding of RNA biology, several methods can be employed to increase the potency of mRNA vaccines, including sequence optimization and the use of modified nucleosides. Recognition by innate immune sensors can be avoided by incorporating modified nucleosides, such as pseudouridine (Ψ) and 5-methylcytidine (5mC), a cap-1 structure, and optimized codons, which in turn improves translation efficiency. During the in vitro transcription of mRNA, immature mRNA is produced as a contaminant, which inhibits translation by stimulating innate immune activation. FPLC and HPLC purification can address this problem.

Currently, most vaccines in use, with the exception of some animal vaccines, need to be transported and stored in an uninterrupted cold chain, which is prone to failure, especially in poor rural areas of tropical countries; these requirements are not being met by the available effective vaccines for preventing and controlling infectious diseases. Therefore, the development of thermostable vaccines has been gaining interest. Optimization of the formulation of synthetic mRNA vaccines has shown that it is possible to generate thermostable vaccines. The results described by Jones showed that freeze-dried mRNA with trehalose or naked mRNA is stable for at least 10 months at 4°C. After being transfected, these mRNAs expressed high levels of proteins and conferred highly effective and long-lasting immunity in newborn and elderly animal models. Another lyophilized mRNA vaccine was shown to be stable at 5–25°C for 36 months and at 40°C for 6 months. Stitz and colleagues showed that when a protamine-encapsulated conventional mRNA-based rabies virus vaccine was subjected to oscillating temperatures between 4 and 56°C for 20 cycles and to exposure at 70°C, its immunogenicity and protective effects were not compromised. Encapsulation of mRNA with cationic liposomes or cell-penetrating peptides (CPPs) protected mRNA from degradation by RNase. These intriguing approaches are discussed under delivery methods.

During the last two decades, mRNA vaccines have been investigated extensively for infectious disease prevention, and for cancer prophylaxis and therapy. Much progress has been made thus far. Cancer mRNA vaccines were designed to express tumor-associated antigens that stimulate cell-mediated immune responses to clear or inhibit cancer cells. Most cancer vaccines are investigated more as therapeutics than as prophylactics and have been reviewed elsewhere. mRNA vaccines against infectious diseases could be developed as prophylactics or therapeutics. mRNA vaccines expressing an antigen of an infectious pathogen induce strong and potent T cell and humoral immune responses. As previously described, the production procedure to generate mRNA vaccines is entirely cell-free, simple and rapid compared with the production of whole microbe, live attenuated and subunit vaccines. This fast and simple manufacturing process makes mRNA a promising bio-product that can potentially fill the gap between emerging infectious disease and the desperate need for effective vaccines. Producing RNA at a large scale to satisfy commercialization is the first step toward making mRNA vaccines. Currently, all components needed for mRNA production are available at the GMP grade; however, some components are supplied at a limited scale.

A great deal of research was initially conducted on the development of cancer mRNA vaccines and has demonstrated the feasibility of producing clinical-grade in vitro transcribed RNA. Several projects on mRNA vaccines against infectious disease have also been conducted, although clinical evaluation is still limited. For example, several RNA-based vaccine platforms have been utilized for the development of influenza vaccines. Several published results showed that RNA-based influenza vaccines induce a broadly protective immune response against not only homologous but also hetero-subtypic influenza viruses. Influenza mRNA vaccines hold great promise, being an egg-free platform and leading to production of antigen with high fidelity in mammalian cells. Recently published results demonstrated that the loss of a glycosylation site by a mutation in the hemagglutinin (HA) of the egg-adapted H3N2 vaccine strain resulted in poor neutralization of circulating H3N2 viruses in vaccinated humans and ferrets. In contrast, the process of mRNA vaccine production is egg-free, and mRNA-encoded proteins are properly folded and glycosylated in host cells after vaccine administration, thus avoiding the risk of producing incorrect antigens.

mRNA has also been used in the veterinary field to prevent animal infectious diseases. Pulido et al. demonstrated that immunization with in vitro transcribed mRNA induced protection against foot-and-mouth disease virus in mice.

Saxena and colleagues demonstrated that a self-amplifying mRNA vaccine encoding rabies virus glycoprotein induced an immune response and provided protection in mice and could potentially be used to prevent rabies in canines. Recently, VanBlargan et al. developed a lipid nanoparticle (LNP)-encapsulated modified mRNA vaccine encoding prM and E genes of deer powassan virus (POWV). This mRNA vaccine induced a robust humoral immune response not only against POWV strains but also against the distantly related Langat virus. As described previously, modification of nucleosides and optimization of codons can avoid recognition by innate immune sensors and so improve translation efficiency.


DNA, or deoxyribonucleic acid, is the hereditary material in humans and almost all other organisms. Nearly every cell in a person’s body has the same DNA. Most DNA is located in the cell nucleus (where it is called nuclear DNA), but a small amount of DNA can also be found in the mitochondria (where it is called mitochondrial DNA or mtDNA). Mitochondria are structures within cells that convert the energy from food into a form that cells can use.

The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The order, or sequence, of these bases determines the information available for building and maintaining an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences.

DNA bases pair up with each other, A with T and C with G, to form units called base pairs. Each base is also attached to a sugar molecule and a phosphate molecule. Together, a base, sugar, and phosphate are called a nucleotide. Nucleotides are arranged in two long strands that form a spiral called a double helix. The structure of the double helix is somewhat like a ladder, with the base pairs forming the ladder’s rungs and the sugar and phosphate molecules forming the vertical sidepieces of the ladder.

An important property of DNA is that it can replicate, or make copies of itself. Each strand of DNA in the double helix can serve as a pattern for duplicating the sequence of bases. This is critical when cells divide because each new cell needs to have an exact copy of the DNA present in the old cell.

DNA is a long polymer made from repeating units called nucleotides, each of which is usually symbolized by a single letter: either A, T, C, or G. The structure of DNA is dynamic along its length, being capable of coiling into tight loops and other shapes. In all species it is composed of two helical chains, bound to each other by hydrogen bonds. Both chains are coiled around the same axis, and have the same pitch of 34 ångströms (3.4 nm). The pair of chains have a radius of 10 Å (1.0 nm). According to another study, when measured in a different solution, the DNA chain measured 22–26 Å (2.2–2.6 nm) wide, and one nucleotide unit measured 3.3 Å (0.33 nm) long. Although each individual nucleotide is very small, a DNA polymer can be very large and may contain hundreds of millions of nucleotides, such as in chromosome 1. Chromosome 1 is the largest human chromosome with approximately 220 million base pairs, and would be 85 mm long if straightened.

DNA does not usually exist as a single strand, but instead as a pair of strands that are held tightly together. These two long strands coil around each other, in the shape of a double helix. The nucleotide contains both a segment of the backbone of the molecule (which holds the chain together) and a nucleobase (which interacts with the other DNA strand in the helix). A nucleobase linked to a sugar is called a nucleoside, and a base linked to a sugar and to one or more phosphate groups is called a nucleotide. A biopolymer comprising multiple linked nucleotides (as in DNA) is called a polynucleotide.

The backbone of the DNA strand is made from alternating phosphate and sugar groups. The sugar in DNA is 2-deoxyribose, which is a pentose (five-carbon) sugar. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings. These are known as the 3′-end (three prime end), and 5′-end (five prime end) carbons, the prime symbol being used to distinguish these carbon atoms from those of the base to which the deoxyribose forms a glycosidic bond. Therefore, any DNA strand normally has one end at which there is a phosphate group attached to the 5′ carbon of a ribose (the 5′ phosphoryl) and another end at which there is a free hydroxyl group attached to the 3′ carbon of a ribose (the 3′ hydroxyl). The orientation of the 3′ and 5′ carbons along the sugar-phosphate backbone confers directionality (sometimes called polarity) to each DNA strand. In a nucleic acid double helix, the direction of the nucleotides in one strand is opposite to their direction in the other strand: the strands are antiparallel. The asymmetric ends of DNA strands are said to have a directionality of five prime end (5′), and three prime end (3′), with the 5′ end having a terminal phosphate group and the 3′ end a terminal hydroxyl group. One major difference between DNA and RNA is the sugar, with the 2-deoxyribose in DNA being replaced by the alternative pentose sugar ribose in RNA.

Figure: A section of DNA. The bases lie horizontally between the two spiraling strands.

The DNA double helix is stabilized primarily by two forces: hydrogen bonds between nucleotides and base-stacking interactions among aromatic nucleobases. The four bases found in DNA are adenine (A), cytosine (C), guanine (G) and thymine (T). These four bases are attached to the sugar-phosphate to form the complete nucleotide, as shown for adenosine monophosphate. Adenine pairs with thymine and guanine pairs with cytosine, forming A-T and G-C base pairs.
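The pairing rules and the antiparallel arrangement described above can be illustrated with a short sketch. It is only an illustration: the function names and the example sequence are hypothetical, and the code models strands as plain strings written in the 5′-to-3′ direction.

```python
# Watson-Crick pairing rules for DNA, as stated in the text: A-T and G-C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Pair each base with its partner, keeping the same left-to-right order.

    If the input is written 5'->3', the result is the partner strand
    written 3'->5'.
    """
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5'->3'.

    Because the two strands are antiparallel, the complement has to be
    reversed to read it in the conventional 5'->3' direction.
    """
    return complement(strand)[::-1]


top = "ATGCGTTA"  # hypothetical example sequence, written 5'->3'
print(complement(top))          # TACGCAAT (partner, read 3'->5')
print(reverse_complement(top))  # TAACGCAT (partner, read 5'->3')
```

Applying `reverse_complement` twice returns the original sequence, which reflects the fact that each strand fully determines the other.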

The chemical DNA was first discovered in 1869, but its role in genetic inheritance was not demonstrated until 1943. In 1953 James Watson and Francis Crick, aided by the work of biophysicists Rosalind Franklin and Maurice Wilkins, determined that the structure of DNA is a double-helix polymer, a spiral consisting of two DNA strands wound around each other. The breakthrough led to significant advances in scientists’ understanding of DNA replication and hereditary control of cellular activities.

DNA was discovered in 1869 by a Swiss biochemist, Friedrich Miescher. He wanted to determine the chemical composition of leucocytes (white blood cells); his source of leucocytes was pus from fresh surgical bandages. Although initially interested in all the components of the cell, Miescher quickly focussed on the nucleus because he observed that, when it was treated with acid, a precipitate was formed, which he called ‘nuclein’. Almost all molecular bioscience graduates would have repeated a form of this experiment in laboratory classes where DNA is isolated from cells. Miescher, Richard Altmann and Albrecht Kossel further characterised ‘nuclein’, and the name was changed to nucleic acid by Altmann. Kossel went on to show that nucleic acid contained purine and pyrimidine bases, a sugar and phosphate. Work in the 1930s from many scientists further characterised nucleic acids, including the identification of the four bases and the presence of deoxyribose, hence the name deoxyribonucleic acid (DNA). Erwin Chargaff had found that DNA molecules from a particular species always contained the same amount of the bases cytosine (C) and guanine (G) and the same amount of adenine (A) and thymine (T). So, for example, the human genome contains 20% C, 20% G, 30% A and 30% T.
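Chargaff's observation follows directly from base pairing in a double-stranded molecule, and this can be checked with a small sketch. The helper names and the example sequence below are hypothetical; the code builds both strands of a duplex from one strand and counts the bases across the whole molecule.

```python
from collections import Counter

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_bases(top: str) -> str:
    """Return all bases of a double-stranded molecule given one strand."""
    bottom = "".join(PAIRS[base] for base in reversed(top))
    return top + bottom


def base_fractions(bases: str) -> dict:
    """Fraction of A, C, G and T among the given bases."""
    counts = Counter(bases)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


# Hypothetical single strand; any sequence gives A == T and C == G
# once the complementary strand is included.
fractions = base_fractions(duplex_bases("AAGCTTTAGA"))
print(fractions)
```

In this model the equalities hold for every input sequence, while the actual percentages (such as the 20% and 30% quoted for the human genome) depend on the sequence itself.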

DNA is a polymer made of monomeric units called nucleotides. A nucleotide comprises a 5-carbon sugar (deoxyribose), a nitrogenous base and one or more phosphate groups. The building blocks for DNA synthesis contain three phosphate groups; two are lost during this process, so the DNA strand contains one phosphate group per nucleotide.

There are four different bases in DNA: the double-ring purine bases, adenine and guanine, and the single-ring pyrimidine bases, cytosine and thymine. The carbons within the deoxyribose ring are numbered 1′ to 5′. Within each monomer the phosphate is linked to the 5′ carbon of deoxyribose and the nitrogenous base is linked to the 1′ carbon; the latter link is called an N-glycosidic bond. The phosphate group is acidic, hence the name nucleic acid.

In the DNA chain, the phosphate residue forms a link between the 3′-hydroxyl of one deoxyribose and the 5′-hydroxyl of the next. This linkage is called a phosphodiester bond. DNA strands have a ‘sense of direction’. The deoxyribose at the top of the diagram in Figure 1B is not linked to another deoxyribose; it terminates with a 5′ phosphate group. At the other end the chain terminates with a 3′ hydroxyl.

Each strand of a DNA molecule is composed of a long chain of monomer nucleotides. The nucleotides of DNA consist of a deoxyribose sugar molecule to which is attached a phosphate group and one of four nitrogenous bases: two purines (adenine and guanine) and two pyrimidines (cytosine and thymine). The nucleotides are joined together by covalent bonds between the phosphate of one nucleotide and the sugar of the next, forming a phosphate-sugar backbone from which the nitrogenous bases protrude. One strand is held to another by hydrogen bonds between the bases; the sequencing of this bonding is specific—i.e., adenine bonds only with thymine, and cytosine only with guanine.

The configuration of the DNA molecule is highly stable, allowing it to act as a template for the replication of new DNA molecules, as well as for the production (transcription) of the related RNA (ribonucleic acid) molecule. A segment of DNA that codes for the cell’s synthesis of a specific protein is called a gene.

DNA replicates by separating into two single strands, each of which serves as a template for a new strand. The new strands are copied by the same principle of hydrogen-bond pairing between bases that exists in the double helix. Two new double-stranded molecules of DNA are produced, each containing one of the original strands and one new strand. This “semiconservative” replication is the key to the stable inheritance of genetic traits.

Within a cell, DNA is organized into dense protein-DNA complexes called chromosomes. In eukaryotes, the chromosomes are located in the nucleus, although DNA also is found in mitochondria and chloroplasts. In prokaryotes, which do not have a membrane-bound nucleus, the DNA is found as a single circular chromosome in the cytoplasm. Some prokaryotes, such as bacteria, and a few eukaryotes have extrachromosomal DNA known as plasmids, which are autonomous, self-replicating genetic material. Plasmids have been used extensively in recombinant DNA technology to study gene expression.

The genetic material of viruses may be single- or double-stranded DNA or RNA. Retroviruses carry their genetic material as single-stranded RNA and produce the enzyme reverse transcriptase, which can generate DNA from the RNA strand. Four-stranded DNA complexes known as G-quadruplexes have been observed in guanine-rich areas of the human genome.

Stands For
  • DNA: Deoxyribonucleic acid.
  • RNA: Ribonucleic acid.

Definition
  • DNA: A nucleic acid that contains the genetic instructions used in the development and functioning of all modern living organisms. DNA’s genes are expressed, or manifested, through the proteins that its nucleotides produce with the help of RNA.
  • RNA: The information found in DNA determines which traits are to be created, activated, or deactivated, while the various forms of RNA do the work.

Function
  • DNA: The blueprint of biological guidelines that a living organism must follow to exist and remain functional. Medium of long-term, stable storage and transmission of genetic information.
  • RNA: Helps carry out DNA’s blueprint guidelines. Transfers genetic code needed for the creation of proteins from the nucleus to the ribosome.

Structure
  • DNA: Double-stranded. It has two nucleotide strands which consist of its phosphate group, five-carbon sugar (the stable 2-deoxyribose), and four nitrogen-containing nucleobases: adenine, thymine, cytosine, and guanine.
  • RNA: Single-stranded. Like DNA, RNA is composed of its phosphate group, five-carbon sugar (the less stable ribose), and four nitrogen-containing nucleobases: adenine, uracil (not thymine), guanine, and cytosine.

Base Pairing
  • DNA: Adenine links to thymine (A-T) and cytosine links to guanine (C-G).
  • RNA: Adenine links to uracil (A-U) and cytosine links to guanine (C-G).

Location
  • DNA: Found in the nucleus of a cell and in mitochondria.
  • RNA: Depending on the type of RNA, this molecule is found in a cell’s nucleus, its cytoplasm, and its ribosome.

Stability
  • DNA: Deoxyribose sugar in DNA is less reactive because of C-H bonds. Stable in alkaline conditions. DNA has smaller grooves, which makes it harder for enzymes to “attack.”
  • RNA: Ribose sugar is more reactive because of C-OH (hydroxyl) bonds. Not stable in alkaline conditions. RNA has larger grooves, which makes it easier to be “attacked” by enzymes.

Propagation
  • DNA: Self-replicating.
  • RNA: Synthesized from DNA when needed.

Unique Features
  • DNA: The helix geometry of DNA is of B-form. DNA is protected in the nucleus, as it is tightly packed. DNA can be damaged by exposure to ultraviolet rays.
  • RNA: The helix geometry of RNA is of A-form. RNA strands are continually made, broken down and reused. RNA is more resistant to damage by ultraviolet rays.


In the case of DNA vaccines, a piece of DNA encoding the antigen is first inserted into a bacterial plasmid. This is a circular piece of DNA used by a bacterium to store and share genes which may benefit its survival – a bit like a computer flash drive. Plasmids can replicate independently of the main chromosomal DNA and provide a simple tool for transferring genes between cells. Because of this, they are already widely used within the field of genetic engineering.

DNA plasmids carrying the antigen are usually injected into the muscle, but a key challenge is getting them to cross into people’s cells. This is an essential step, because the machinery which enables the antigen to be translated into protein is located inside cells. Various technologies are being developed to aid this process – such as electroporation, where short pulses of electric current are used to create temporary pores in patients’ cell membranes; a ‘gene gun’ which uses helium to propel DNA into skin cells; and encapsulating the DNA in nanoparticles which are designed to fuse with the cell membrane.

RNA vaccines encode the antigen of interest in messenger RNA (mRNA) or self-amplifying RNA (saRNA) – molecular templates used by cellular factories to produce proteins. Because RNA is transitory in nature, there is zero risk of it integrating with our own genetic material. The RNA can be injected by itself, encapsulated within nanoparticles (as Pfizer’s mRNA-based Covid vaccine is), or driven into cells using some of the same techniques being developed for DNA vaccines.

Once the DNA or RNA is inside the cell and it starts producing antigens, these are then displayed on its surface, where they can be detected by the immune system, triggering a response. This response includes killer T cells, which seek out and destroy infected cells, as well as antibody-producing B cells and helper T cells which support antibody production.

Once a pathogen’s genome has been sequenced, it is relatively quick and easy to design a vaccine against any of its proteins. For instance, Moderna’s RNA vaccine against COVID-19 entered clinical trials within two months of the SARS-CoV-2 genome being sequenced. This speed could be particularly important in the face of newly emerging epidemic or pandemic pathogens, or pathogens that are rapidly mutating.

Both DNA and RNA vaccines are relatively easy to produce, but the manufacturing process differs slightly between them. Once DNA encoding the antigen has been chemically synthesised, it is inserted into a bacterial plasmid with the help of specific enzymes – a relatively straightforward procedure. Multiple copies of the plasmid are then produced within giant vats of rapidly dividing bacteria, before being isolated and purified. RNA vaccines are easier to synthesise because this can be done chemically, from a template in the lab, without the need for any bacteria or cells.

In both cases, vaccines for different antigens could be manufactured within the same facilities, further reducing costs. This is not possible for most conventional vaccines.

Vaccination consists of stimulating the immune system with an infectious agent, or components of an infectious agent, modified in such a manner that no harm or disease is caused, but ensuring that when the host is confronted with that infectious agent, the immune system can adequately neutralize it before it causes any ill effect. For over a hundred years vaccination has been effected by one of two approaches: either introducing specific antigens against which the immune system reacts directly; or introducing live attenuated infectious agents that replicate within the host without causing disease and synthesize the antigens that subsequently prime the immune system.

Recently, a radically new approach to vaccination has been developed. It involves the direct introduction into appropriate tissues of a plasmid containing the DNA sequence encoding the antigen(s) against which an immune response is sought, and relies on the in situ production of the target antigen. This approach offers a number of potential advantages over traditional approaches, including the stimulation of both B- and T-cell responses, improved vaccine stability, the absence of any infectious agent and the relative ease of large-scale manufacture. As proof of the principle of DNA vaccination, immune responses in animals have been obtained using genes from a variety of infectious agents, including influenza virus, hepatitis B virus, human immunodeficiency virus, rabies virus, lymphocytic chorio-meningitis virus, malarial parasites and mycoplasmas. In some cases, protection from disease in animals has also been obtained. However, the value and advantages of DNA vaccines must be assessed on a case-by-case basis and their applicability will depend on the nature of the agent being immunized against, the nature of the antigen and the type of immune response required for protection.

The field of DNA vaccination is developing rapidly. Vaccines currently being developed use not only DNA, but also include adjuncts that assist DNA to enter cells, target it towards specific cells, or that may act as adjuvants in stimulating or directing the immune response. Ultimately, the distinction between a sophisticated DNA vaccine and a simple viral vector may not be clear. Many aspects of the immune response generated by DNA vaccines are not understood. However, this has not impeded significant progress towards the use of this type of vaccine in humans, and clinical trials have begun.

The first such vaccines licensed for marketing are likely to use plasmid DNA derived from bacterial cells. In future, others may use RNA or may use complexes of nucleic acid molecules and other entities. These guidelines address the production and control of vaccines based on plasmid DNA intended for use in humans. The purpose of these guidelines is to indicate:

  • appropriate methods for the production and control of plasmid DNA vaccines; and
  • specific information that should be included in submissions by manufacturers to national control authorities in support of applications for the authorization of clinical trials and marketing.

It is recognized that the development and application of nucleic acid vaccines are evolving rapidly. Thus, their control should be approached in a flexible manner so that it can be modified as experience is gained in production and use. The intention of these guidelines is to provide a scientifically sound basis for the production and control of DNA vaccines intended for use in humans, and to assure their consistent safety and efficacy. Individual countries may wish to use these guidelines to develop their own national guidelines for DNA vaccines.


The process of RNA biosynthesis from a DNA template is called transcription. Transcription requires nucleoside 5′-triphosphates as precursors and is catalyzed by enzymes called RNA polymerases. Unlike DNA replication, RNA transcription is not semiconservative; that is, only one DNA strand (the “sense” strand) is transcribed for any given gene, and the resulting RNA transcript is dissociated from the parental DNA strand. As is true of DNA synthesis, RNA synthesis always proceeds in a 5′-to-3′ direction. Because RNA synthesis, unlike DNA synthesis, requires no primer, the first nucleotide in any primary transcript retains its 5′-triphosphate group. Therefore, an important test to determine whether any RNA molecule is a primary transcript is to see whether it possesses an intact 5′-triphosphate terminus. As in DNA replication, base pairing orders the sequence of nucleotides during transcription. In RNA synthesis, uridine (rather than thymidine) base-pairs with adenine.

We have seen that a gene can encode either an RNA product or a protein sequence. The production of both requires the gene to be transcribed into RNA, either because the RNA is the final product or because the RNA will need to act as a template for protein synthesis. RNA synthesis is very similar in prokaryotes and eukaryotes, being catalysed by the enzyme RNA polymerase. However, of the processes discussed in this article it is arguably the one that differs most between prokaryotes and eukaryotes. One difference is that in eukaryotes the whole process needs to occur in a chromatin context, so access to the DNA template is limited. Regulation of gene expression is a major facilitator of cell differentiation, homoeostasis and speciation. Different cell types turn on transcription of different genes, giving rise to their differentiated phenotypes. If we look at mammals as an example of speciation, they all have roughly the same gene content; it is how transcription is regulated that has changed as mammals have evolved. For example, if you compare humans and mice, the important changes to the human and mouse genome sequences that have occurred since they diverged from a common ancestor are predominantly in the sequences that control transcription rather than in protein coding sequences.
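The base-pairing rule for transcription described above, with uridine pairing with adenine and the transcript growing 5′ to 3′, can be sketched in a few lines. This is a simplified illustration only: the function name and the template sequence are hypothetical, and the enzyme, promoter and chromatin context are not modelled.

```python
# Pairing between a DNA template base and the RNA base placed opposite it.
# Adenine in the template pairs with uracil (U) rather than thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA transcript, written 5'->3'.

    The template is given in the order the polymerase reads it (3'->5'),
    so the transcript is built base by base in the 5'->3' direction.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
print(transcribe(template))  # AUGCCGUAA
```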

DNA replication

Whenever a cell divides there is a need to synthesise two copies of each chromosome present within the cell. For example, in a human, prior to cell division, all 23 pairs of chromosomes need to be replicated to form 46 pairs, so that following cell division each daughter cell has a full complement (23 pairs) of chromosomes. The structure of DNA gives us a clue to how it is replicated; this was eloquently postulated by Watson and Crick in their 1953 paper: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material”. Each strand can act as a template for the synthesis of the complementary strand, so the replication machinery would ‘unzip’ the double helix and read along the two existing ‘parent’ strands, synthesising a complementary new ‘daughter’ strand with A opposite T, C opposite G, and so on. This is described as semi-conservative, since each ‘new’ double-stranded DNA molecule has one original parent strand and one newly made ‘daughter’ strand.

The evidence that DNA replication was semi-conservative came from an elegant experiment completed by Matthew Meselson and Franklin Stahl. They labelled the parental DNA with a heavy isotope of nitrogen (15N) by growing bacteria in a growth medium that contained 15NH4Cl. They then grew the bacteria, in a medium that contained 14NH4Cl, in conditions such that any newly synthesised DNA would contain 14N. Since DNA replication is semi-conservative, after one round of DNA replication, each cell would have a DNA molecule that contains one ‘old’ parental strand labelled with 15N and one ‘new’ daughter strand labelled with 14N. This was shown by analysing the density of the DNA using density-gradient centrifugation. As predicted, they observed that the new daughter DNA molecule had a density consistent with the fact that it contained both 15N and 14N and that this daughter DNA contained one strand with 15N and another strand with 14N.
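The logic of the Meselson and Stahl experiment can be followed with a small simulation. This is a sketch under simple assumptions: each DNA molecule is modelled as a pair of strand labels, every new strand is made from 14N, and the names `replicate` and `classify` are hypothetical. The later generations shown are what this semi-conservative model predicts, not results quoted in the text.

```python
from collections import Counter


def replicate(molecules):
    """One round of semi-conservative replication in 14N medium.

    Each duplex separates, and each old strand is paired with a newly
    synthesised strand that contains only 14N.
    """
    next_generation = []
    for strand_a, strand_b in molecules:
        next_generation.append((strand_a, "14N"))
        next_generation.append((strand_b, "14N"))
    return next_generation


def classify(molecules):
    """Count molecules by density class: heavy, hybrid or light."""
    labels = {2: "heavy", 1: "hybrid", 0: "light"}
    return Counter(labels[[a, b].count("15N")] for a, b in molecules)


# Start with one fully 15N-labelled parental duplex.
population = [("15N", "15N")]
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(classify(population)))
# 1 {'hybrid': 2}
# 2 {'hybrid': 2, 'light': 2}
# 3 {'hybrid': 2, 'light': 6}
```

After one round every molecule is a hybrid of one 15N strand and one 14N strand, which matches the intermediate density described above.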

DNA polymerase and DNA synthesis

The enzyme DNA polymerase is responsible for DNA synthesis. DNA polymerase is a template-driven enzyme, so it will use the parental DNA strand as a template. It cannot synthesise DNA in the absence of a template. In addition, it will only add nucleotides on to the 3′ end of an existing nucleic acid chain. The building blocks for DNA synthesis are deoxynucleoside triphosphates (dATP, dTTP, dCTP and dGTP). During DNA synthesis, the base within the incoming deoxynucleoside triphosphate pairs with the complementary base on the template strand, a phosphodiester bond is formed between the 5′ phosphate on the incoming nucleotide and the free 3′ hydroxyl on the existing nucleic acid chain, and pyrophosphate is released.

Pyrophosphate is the two phosphate residues within the deoxynucleoside triphosphate building block that are not incorporated into the DNA chain. DNA polymerase synthesises DNA in the 5′ to 3′ direction, because it can only add nucleotides on to the 3′ end of the chain. DNA polymerase has proofreading activity, so after the phosphodiester bond has been formed, the base pairing is checked and if a nucleotide with an incorrect base has been added, DNA polymerase will remove the nucleotide using a 3′ to 5′ exonuclease activity. Exonucleases are enzymes that can remove nucleotides from the ends of a DNA molecule, 3′ to 5′ exonucleases remove nucleotides from the 3′ end of a DNA molecule and therefore can remove the last nucleotide that was added during DNA replication. This is analogous to using the delete key to remove a letter that you have typed incorrectly before adding the correct one and continuing typing.
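The add, check and remove cycle described above can be sketched as a toy model. Everything here is an assumption made for illustration: the function name, the sequences, and the idea of feeding the polymerase a fixed list of candidate nucleotides (some of them wrong) rather than modelling real enzyme kinetics.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_with_proofreading(template_3_to_5, existing_chain, incoming):
    """Extend an existing chain 5'->3' along a template, with proofreading.

    template_3_to_5: template strand in the order it is read (3'->5').
    existing_chain:  chain already paired with the start of the template,
                     written 5'->3'; new bases are added to its 3' end.
    incoming:        candidate bases offered to the polymerase, in order.
    Returns the finished chain and the number of bases removed.
    """
    chain = list(existing_chain)
    removed = 0
    for base in incoming:
        if len(chain) == len(template_3_to_5):
            break  # the template has been copied completely
        chain.append(base)  # bond formed at the 3' end of the chain
        expected = PAIRS[template_3_to_5[len(chain) - 1]]
        if base != expected:
            chain.pop()  # 3'->5' exonuclease removes the last base added
            removed += 1
    return "".join(chain), removed


# Hypothetical example: the second candidate (A) mispairs and is removed.
product, corrections = extend_with_proofreading("TACGGA", "AT", "GACCT")
print(product, corrections)  # ATGCCT 1
```

Because bases are only ever appended to or removed from the end of the list, the sketch mirrors the point that both synthesis and proofreading act at the 3′ end of the growing chain.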

DNA polymerase requires a short double-stranded region with a free 3′ hydroxyl in order to start making a copy of the template; this ensures that DNA is synthesised in a controlled way. Initiation of DNA synthesis uses a small RNA primer (8–12 bases) made by the enzyme primase. DNA polymerase will then extend from the primer, copying the template and synthesising the daughter DNA strand. This means that when DNA synthesis first starts each DNA molecule actually contains a small piece of RNA at its 5′ end. This RNA will ultimately be replaced with DNA; how this is done is discussed below.

The genetic code and the concept of a gene

As we have seen in the previous two sections, the genetic material in a cell is made of DNA and can be copied and passed on to progeny through DNA replication, allowing for inheritance of the information that it carries. A large proportion of the information on the DNA is first transcribed into mRNA and then translated into proteins. However, there are some RNAs that are never translated into proteins, and these have important functions too. Phrases like ‘it is in my genes’ or ‘in my DNA’ are used in common speech to refer to an important part of who someone is.

The term gene was coined in the early 1900s to describe the basic unit of heredity. Genes were thought of as distinct loci arranged linearly on chromosomes. Breeding experiments with the fruit fly Drosophila supported this view and showed that if two genes are close together on a chromosome they are more likely to be inherited together. The observation that mutations in genes could give rise to altered phenotypes gave rise to the ‘one gene one polypeptide’ hypothesis. Once it became clear that genes were made of DNA, what is referred to as the central dogma of molecular biology was coined. This describes a two-step process in which the genes on the DNA are transcribed into RNA and then translated into a sequence of amino acids that makes up a protein. The information flow is from DNA to RNA and then to protein.

The genetic code

The genetic code is the set of rules used by living cells to translate the information encoded within genetic material into proteins. When DNA and RNA were first discovered, the relative simplicity of nucleic acids led many scientists to doubt that they carried the genetic information. DNA only has four different kinds of bases; the question was how it could code for 20 amino acids. If there were a 1:1 correlation between bases and amino acids, DNA could only encode four amino acids. Pairs of bases would give 16 possible combinations, which is still not enough. However, if you consider a triplet code you have 64 possibilities, which is more than enough. This is the code that we are familiar with, where each codon, a sequence of three nucleotides, specifies a particular amino acid. This triplet code still did not seem logical because now you have far more codons than you need.
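The counting argument above can be reproduced directly by enumerating every possible code word of a given length. The function name `code_words` is hypothetical, and the sketch uses the RNA alphabet, although the counts are the same for DNA.

```python
from itertools import product

BASES = "ACGU"
AMINO_ACIDS = 20  # number of amino acids the code has to specify


def code_words(length: int) -> list:
    """All possible words of the given length over the four bases."""
    return ["".join(word) for word in product(BASES, repeat=length)]


for length in (1, 2, 3):
    count = len(code_words(length))
    print(length, count, count >= AMINO_ACIDS)
# 1 4 False
# 2 16 False
# 3 64 True
```

The count is simply 4 raised to the power of the word length, so three is the shortest length that provides at least 20 distinct words.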

The experiments that allowed scientists to decipher the genetic code were carried out long before we were able to determine the sequence of DNA. While it was possible at that time to determine the proportions of each different amino acid in a protein, it was not yet possible to work out the order in which they occurred. Francis Crick and Sydney Brenner answered some key questions with an experiment using mutants of a bacteriophage, a virus that infects bacteria. The normal or wild-type phage will infect E. coli and grow. Crick and Brenner investigated mutants that would not grow on some strains of E. coli.

Mutations that are insertions or deletions cause what are called frameshifts. Inserting a single adenine base into the DNA sequence changes not only the amino acid at the position of the insertion but also all subsequent amino acids translated from that sequence; the reading frame has been shifted by one base, and the result is a non-functional protein. However, if you insert three nucleotides you often get a wild-type or near wild-type phenotype. This is because you have inserted a whole triplet codon: you will get one or two amino acids that were not in the original sequence, but the reading frame is not shifted and the rest of the sequence is normal.

Crick and Brenner were looking for what they called suppressor mutations that would rescue the mutant and allow it to grow normally. They showed that their suppressor mutants did not simply reverse the original mutation; they often added or subtracted one or more bases. They worked out that if you insert or delete one, two or four nucleotides then you see a mutant phenotype. However, if you insert or delete three nucleotides, this has little or no effect. This was strong supporting evidence for a triplet code. It is also evidence for a redundant code, in which the same amino acid can be coded in more than one way. If the code were non-redundant there would be 20 codons that code for amino acids and 44 that are ‘nonsense’ codons. In this case inserting three nucleotides would be most likely to introduce a nonsense codon and not restore the wild-type. Crick and Brenner proposed correctly that the genetic code is read from a fixed starting point and the bases are read in groups of three.
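The effect of these insertions and deletions on the reading frame can be illustrated with a short sketch. The mRNA sequence below is hypothetical and the positions of the changes are arbitrary; the translation table is the standard genetic code, and "*" marks a stop codon.

```python
from itertools import product

# Standard genetic code in the conventional UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Read complete codons from the first base; leftover bases are ignored."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))


def insert(seq, position, bases):
    return seq[:position] + bases + seq[position:]


def delete(seq, position):
    return seq[:position] + seq[position + 1:]


wild_type = "AUGGCUGAAACCGGUUUA"            # hypothetical message
plus_one = insert(wild_type, 3, "A")        # single-base insertion: frameshift
plus_three = insert(wild_type, 3, "AAA")    # a whole extra codon: frame kept
suppressed = delete(plus_one, 6)            # nearby deletion restores the frame

print(translate(wild_type))   # MAETGL
print(translate(plus_one))    # MS*NRF  (everything after the insertion changes)
print(translate(plus_three))  # MKAETGL (one extra amino acid, rest unchanged)
print(translate(suppressed))  # MSETGL  (one altered amino acid, rest restored)
```

The last case mirrors the suppressor mutations described above: a second change that does not undo the first but brings the reading frame back into register.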

At about the same time, Marshall Nirenberg and Heinrich Matthaei, working in the United States, had developed a cell-free system which could synthesise proteins in a test tube when provided with an RNA molecule. They showed that when provided with an artificial RNA chain composed only of uracil (poly-U) the system made a polypeptide composed entirely of phenylalanine residues. They now had a tool that they could use to crack the genetic code. RNA composed of cytosine (C) residues directed the synthesis of polyproline, and RNA composed of adenine (A) residues made polylysine. Experiments with combinations of nucleotides demonstrated that, for example, if you make RNA from A and C you produce proteins containing only six amino acids: asparagine, glutamine, histidine, lysine, proline and threonine. There are eight possible triplet codons that can be made from A and C; two of these, CCC and AAA, we already know encode proline and lysine. The remaining four amino acids must be encoded by the other six combinations of A and C. Since eight codons specify only six amino acids, this provides additional evidence for the redundancy of the genetic code.
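The A-and-C experiment can be checked against the genetic code as it is known today. The sketch below lists the eight triplets that can be built from A and C and looks each one up in the standard codon table; the variable names are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Full names for the one-letter codes that occur among the A/C codons.
NAMES = {"K": "lysine", "N": "asparagine", "T": "threonine",
         "Q": "glutamine", "H": "histidine", "P": "proline"}

ac_codons = ["".join(c) for c in product("AC", repeat=3)]
encoded = {codon: NAMES[CODON_TABLE[codon]] for codon in ac_codons}

for codon, amino_acid in encoded.items():
    print(codon, amino_acid)

# Eight codons, but only six distinct amino acids: the code is redundant.
print(len(ac_codons), "codons ->", len(set(encoded.values())), "amino acids")
```

Threonine (ACA, ACC) and proline (CCA, CCC) each appear twice, which is why eight codons yield only six amino acids.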

These experiments using RNA molecules composed of random combinations of two or three bases were not enough to fully crack the genetic code. The use of chemically synthesised RNA molecules of known repeating sequence added some more important information. For example, a synthetic RNA of alternating A and C residues (ACACACACAC…) can be read as two alternating codons, CAC and ACA. It encodes a protein of alternating histidine and threonine residues.
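A repeating dinucleotide is informative because it gives the same pair of alternating codons whichever base translation starts from. As a rough sketch, assuming a repeating ACAC sequence (whose triplets are ACA and CAC) and the standard codon table, each of the three reading frames can be translated in turn:

```python
from itertools import product

# Standard genetic code in the conventional UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna, frame=0):
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    seq = mrna[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


copolymer = "AC" * 6  # ACACACACACAC

for frame in range(3):
    print(frame, translate(copolymer, frame))
# 0 THTH
# 1 HTH
# 2 THT
```

Every frame gives alternating threonine (T) and histidine (H), so the product is the same regardless of where reading begins.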

In the last section, we will discuss how tRNAs and ribosomes decode the genetic code and synthesise proteins. The final detail of the genetic code was determined by a technique using ribosome-bound tRNAs. Pieces of RNA as short as a single codon will bind to ribosomes, and if amino acids attached to tRNAs are added, the tRNA that is complementary to the RNA will associate with it. If you then filter the solution you trap only the tRNAs that are bound to the ribosome; these are the ones specified by the codon in your RNA.


The primary biological role of RNA is to direct the process of protein synthesis. The three major RNA classes perform different specialized functions toward this end. The 18S and 28S ribosomal RNAs of eukaryotes are organized with proteins and other smaller RNAs into the 40S and 60S ribosomal subunits, respectively. The completed ribosome serves as a minifactory where all the components of protein synthesis are brought together during translation of the messenger RNA. The messenger RNA binds to the ribosome at a point near the initiation codon for protein synthesis. Through codon-anticodon base pairing between messenger and transfer RNA sequences, the transfer RNA molecules bearing amino acids are juxtaposed to allow formation of the first peptide bond between amino acids. Then, the ribosome moves along the messenger RNA strand as more amino acids are added to the peptide chain.

RNA of certain bacterial viruses serves a dual function. In certain bacteriophages (viruses that infect bacterial cells), the RNA serves as a message to direct synthesis of viral-coat proteins and of enzymes needed for viral replication. The RNA also serves as a template for viral replication. Viral RNA polymerases that copy RNA rather than DNA are made after infection. These enzymes first produce an intermediate replicative form of the viral RNA that consists of complementary RNA strands. One of these strands then serves as the template for synthesis of multiple copies of the original viral RNA.

RNA also serves as the actively transmitted genomic agent of certain viruses that infect cells of higher organisms. For example, Rous sarcoma virus, which is an avian tumor virus, contains RNA as its nucleic acid component. In this case, the RNA is copied to make DNA by an enzyme called reverse transcriptase. The viral DNA is then incorporated into the host cell genome, where it codes for enzymes that are involved in altering normal cell processes. These enzymes, as well as the site at which the virus integrates, regulate the drastic transformation of cell functions, inducing cell division and the ultimate formation of a tumor. Transcription of the viral DNA results in replication of the original viral RNA.

DNA provides living organisms with guidelines—genetic information in chromosomal DNA—that help determine the nature of an organism’s biology, how it will look and function, based on information passed down from former generations through reproduction. The slow, steady changes found in DNA over time, known as mutations, which can be destructive, neutral, or beneficial to an organism, are at the core of the theory of evolution.

Genes are small segments of long DNA strands; humans have around 19,000 genes. The detailed instructions found in genes, determined by how nucleobases in DNA are ordered, are responsible for both the big and small differences between different living organisms and even among similar living organisms. The genetic information in DNA is what makes plants look like plants, dogs look like dogs, and humans look like humans; it is also what prevents different species from producing offspring (their DNA will not match up to form new, healthy life). Genetic information is what causes some people to have curly, black hair and others to have straight, blond hair, and what makes identical twins look so similar.

RNA has several different functions that, though all interconnected, vary slightly depending on the type. There are three main types of RNA:

  • Messenger RNA (mRNA) is transcribed from the DNA found in a cell’s nucleus and carries this genetic information to the ribosomes in the cell’s cytoplasm.
  • Transfer RNA (tRNA) is found in a cell’s cytoplasm and works closely with mRNA. tRNA transfers amino acids, the building blocks of proteins, to the ribosome, where its anticodon pairs with the matching codon on the mRNA.
  • Ribosomal RNA (rRNA) is found in a cell’s cytoplasm as a core component of the ribosome. In the ribosome, it helps position the mRNA and tRNAs and catalyzes the formation of peptide bonds as the polypeptide is synthesized.

DNA’s genes are expressed, or manifested, through the proteins that are produced from its nucleotide sequence with the help of RNA. Traits (phenotypes) come from which proteins are made and which are switched on or off. The information found in DNA determines which traits are to be created, activated, or deactivated, while the various forms of RNA do the work.

One hypothesis suggests that RNA existed before DNA and that DNA was a mutation of RNA.

Recent Discovery and approach

Genomic sequencing was found to be an effective way to identify and evaluate the risk of pathogenic and likely pathogenic variants detected in pediatric patients with a range of cancer types.

DNA and RNA genomic sequencing was found to be a useful tool for identifying and characterizing genetic drivers of relapsed or refractory (R/R) cases of several pediatric cancers, according to a recent study published in Cancer Discovery.

The prospective nontherapeutic study analyzed the general usefulness of next-generation sequencing (NGS) using a 3-platform approach consisting of combined whole genome (WGS), exome, and RNA sequencing of tumor and paired normal tissues, an approach that had previously only been evaluated for specific tumor types or high-risk disease.

“This study demonstrates the power of a 3-platform approach that incorporates WGS to interrogate and interpret the full range of genomic variants across newly diagnosed as well as relapsed/refractory pediatric cancers,” wrote the investigators.

Evidence has proved NGS to be helpful to providers as they refine or change cancer diagnoses, provide prognostic information, identify therapeutic targets or markers of therapy resistance, detect variants showing pharmacogenetic significance, and reveal genetic predisposition.

However, most of these studies assess NGS in patients with difficult-to-treat or R/R cancers and have rarely addressed patients with new diagnoses or standard risk cancers. Additionally, more research needs to be dedicated to investigating new diagnostic and prognostic subgroups and the full list of genetic drivers for many rare pediatric cancers.

The 3-platform approach developed by the investigators was validated using a retrospective cohort of patients with high-risk pediatric cancer. Data were collected from the Genomes for Kids (G4K) study and included 309 pediatric patients with cancer who were selected without regard to tumor type or stage and were treated at St. Jude Children’s Research Hospital from August 2015 to March 2017.

Among the 309 patients, 82% (n = 253) had their tumors examined using all 3 platforms, 166 were male, and the average age at cancer diagnosis was 7.4 years (range, 4 days to 25.7 years). At the time of study enrollment, 85% (n = 262) of patients had newly diagnosed cancers and 15% (n = 47) had R/R cancers.

By cancer type, 128 (41%) patients had hematological malignancies of 28 subtypes, 97 (31%) had brain tumors of 27 subtypes, and 84 (27%) had tumors of 26 types that were not associated with the central nervous system. Rare tumor types were identified in 45 (15%) patients. Hodgkin and non-Hodgkin lymphoma were underrepresented, and leukemia and retinoblastoma were overrepresented.
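The cohort percentages reported above follow directly from the patient counts. As a quick arithmetic check, using only the numbers quoted in the text (the dictionary keys are labels chosen here for illustration):

```python
COHORT = 309  # total patients enrolled

counts = {
    "all 3 platforms": 253,
    "newly diagnosed": 262,
    "relapsed/refractory": 47,
    "hematological malignancies": 128,
    "brain tumors": 97,
    "non-CNS tumors": 84,
    "rare tumor types": 45,
}


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


for label, n in counts.items():
    print(f"{label}: {n}/{COHORT} = {percent(n, COHORT)}%")

# The three cancer-type groups account for every patient in the cohort.
by_type = ["hematological malignancies", "brain tumors", "non-CNS tumors"]
print(sum(counts[k] for k in by_type) == COHORT)  # True
```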

The 3-platform approach applied to paired tumor and normal tissue revealed that 86% (n = 218/253) of patients had at least 1 diagnostic (53%), prognostic (57%), therapeutically relevant (25%), and/or cancer-predisposing (18%) genetic variant.

Additionally, the use of WGS allowed providers to detect activating gene fusions (36% of tumors), enhancer hijacks (8% of tumors), and mutational signatures that showed pathogenic variant effects, of which 55% were discovered to be relevant to tumor development when evaluating paired tumor-normal data.

Of the patients whose tumors were sequenced, 78 had or developed metastatic, R/R disease, and 41% of the tumors displayed potentially targetable lesions. Therapy regimens were changed for 12 patients based on tumor genomic data, 5 of whom responded to the newly directed therapy.

“Our screening through G4K uncovered germline [pathogenic or likely pathogenic] variants in 55 patients, of whom almost two-thirds would not have been detected based on routine clinical indications for genetic testing,” said the investigators.

They listed the inability to source fresh frozen tissue for all patients, the inability for comprehensive genomic profiling to become standard of care due to cost, and the long turnaround time to receive results as study limitations.

“As genomic sequencing technologies become less expensive and more widely available, their use will be an important adjunct to gene panels in the evaluation and management of children with newly diagnosed as well as relapsed or refractory cancers,” the investigators noted.


The study of nucleic acids, from their first identification as the genetic material, is littered with landmarks in molecular biosciences, many of them marked with Nobel Prizes. Since Watson and Crick proposed their structure of DNA, our knowledge about DNA and how it works has expanded almost exponentially. The topics introduced in this article are covered in all bioscience programmes; understanding them is key to all areas of biosciences, from evolution and animal diversity to health and disease. Recent developments in the techniques that we can use to study DNA, often in living cells, mean that new and exciting advances in our understanding of the way nucleic acids work are occurring all the time.




The success and final outcome of this project required a lot of guidance and assistance from many people, and I am extremely privileged to have received this throughout the completion of my project. All that I have done is due to such supervision and assistance, and I would not forget to thank them.

I respect and thank Mr./Ms. [NAME 1] for providing me an opportunity to do the project work in [VENUE] and for giving me all the support and guidance that made me complete the project duly. I am extremely thankful to [her/him] for providing such support and guidance despite [her/his] busy schedule managing the corporate affairs.

I owe my deep gratitude to our project guide [NAME 2], who took a keen interest in our project work and guided us all along, till the completion of our project work, by providing all the necessary information for developing a good system.

I would not forget to remember [NAME 3 AND NAME 4] of [COMPANY NAME] for their encouragement and, moreover, for their timely support and guidance till the completion of our project work.

I heartily thank our internal project guide, [Name 5], [Position], [Department], for her/his guidance and suggestions during this project work.

I am thankful for and fortunate enough to have received constant encouragement, support and guidance from all the teaching staff of [Department name], which helped us in successfully completing our project work. Also, I would like to extend our sincere esteem to all the laboratory staff for their timely support.


1. Make the project as presentable as possible.
2. Prefer colourful images.


3. Paste downloaded images or draw images on the left side of the project copy.

4. Prefer good handwriting.

5. You may add statistical graphs, or sample in plastic pouches if required.


6. You may add the INDEX page before the INTRODUCTION page.

7. Colour the headings and subheadings.

8. Avoid using whitener.

9. Cover your project file with colorful chart paper and write your name, class, section, roll number, topic prominently.