GENOME: complete set of genetic information (nuclear + mitochondrial DNA [0.5%]) of a cell
Or
Total mass of cellular DNA (usually the more complex the individual the greater the
amount of genetic information)
Haploid genome: 3.2 billion base pairs (chromosome 1 is the biggest, chromosome 21 the smallest)
32%: genes for proteins (only about 1% of the genome is actually translated)
15%: genes for non-coding RNAs
7%: tandem repeated sequences
46%: interspersed repeat sequences (transposons = LINE, SINE, LTR, DNA TRANSPOSONS)
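To make the proportions above concrete, the following minimal sketch converts each percentage into an approximate number of base pairs. The figures come straight from these notes; the names `HAPLOID_GENOME_BP`, `COMPOSITION_PERCENT` and `base_pairs_per_category` are illustrative assumptions.

```python
# Approximate haploid genome size and composition, as given in the notes.
HAPLOID_GENOME_BP = 3_200_000_000

COMPOSITION_PERCENT = {
    "genes for proteins": 32,
    "genes for non-coding RNAs": 15,
    "tandem repeated sequences": 7,
    "interspersed repeat sequences": 46,
}


def base_pairs_per_category(genome_bp, composition):
    """Convert percentage shares into approximate base-pair counts."""
    return {name: genome_bp * pct // 100 for name, pct in composition.items()}


sizes = base_pairs_per_category(HAPLOID_GENOME_BP, COMPOSITION_PERCENT)
for name, bp in sizes.items():
    # Report each share in megabases (Mb) for readability.
    print(f"{name}: ~{bp / 1e6:.0f} Mb")
```

The four shares add up to 100%, so the per-category counts sum back to the full 3.2 billion base pairs.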
Genetic variability is of 3 types:
1) single nucleotide polymorphisms (SNPs): differences of single base pairs
2) tandem repetitive sequences: sequences of bases repeated multiple times
3) submicroscopic structural variants: variants that modify the structure of the chromosome → most
common are copy number variants (CNVs) → gain or loss of segments; inversions (differences in
sequence orientation) also occur
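As an illustration of the first type of variability, the sketch below lists single-base differences between two sequences. It assumes the sequences are already aligned and of equal length, and the function name `find_single_base_differences` is hypothetical. A real SNP is also defined by its frequency in a population, which this toy comparison does not consider.

```python
def find_single_base_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are aligned and of equal length.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample), start=1)
        if r != s
    ]


reference = "ATGCCGTA"
sample = "ATGACGTT"
differences = find_single_base_differences(reference, sample)
print(differences)  # [(4, 'C', 'A'), (8, 'A', 'T')]
```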
GENE: a unit of transcription (because non-coding RNAs exist, genes don’t code only
for functional proteins)
REGULATORY SEQUENCE (recognized by proteins that can bind to DNA):
• PROMOTER: signals where transcription starts (it is not transcribed itself) and establishes
which strand is transcribed; immediately upstream of the transcription start site
• ENHANCER: can increase promoter efficiency by binding transcription activators; can be
found distant from transcribed site
• SILENCER: can bind transcription inhibitors
TRANSCRIBED SEQUENCE (region that is copied into RNA):
• EXON (on average 11 in a human gene): sequence found in mature RNA (not excised during
splicing)
• INTRON (on average 10 in a human gene): present in primary transcript (between 2 exons)
and excised during splicing; almost always begins with GT (GU in pre-mRNA) and ends with
AG (these are referred to as classical/canonical introns)
Introns can modulate other genes and allow for great product variability (because of alternative
splicing)
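The GT...AG rule and the removal of introns during splicing can be sketched as follows. This is a simplified illustration rather than a model of the spliceosome: intron positions are supplied by hand as hypothetical 0-based, half-open intervals, and the function names are assumptions.

```python
def is_canonical_intron(intron):
    """True if the intron starts with GT (GU in RNA) and ends with AG."""
    seq = intron.upper().replace("U", "T")  # accept DNA or RNA alphabet
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def splice(pre_mrna, introns):
    """Remove introns and join the remaining exons in their original order.

    `introns` is a list of (start, end) 0-based half-open intervals.
    """
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])  # exon preceding this intron
        pos = end
    exons.append(pre_mrna[pos:])  # last exon
    return "".join(exons)


# Toy primary transcript: exon 1 + intron + exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUCAG" + "GAAUAA"
intron_span = (6, 17)

print(is_canonical_intron(pre_mrna[6:17]))  # True
print(splice(pre_mrna, [intron_span]))      # AUGGCCGAAUAA
```

Passing a different set of intervals to `splice` yields a different mature product from the same primary transcript, which is the idea behind alternative splicing.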
[transcription] → Primary transcript: rough product consisting of a pre-mRNA strand
[maturation: 5’ capping, 3’ polyadenylation, splicing] → mRNA (series of exons joined together,
average size 3500 nucleotides)
mRNA is made up of:
• 5’ UNTRANSLATED REGION (5’ UTR): sequence that extends from the 5’ cap to the base that
precedes the START codon (AUG)
• CODING SEQUENCE (CDS): includes START codon (AUG) and STOP codon
(UAA/UAG/UGA), it is a succession of codons that code for an amino acid sequence
(polypeptide chain)
• 3’ UNTRANSLATED REGION (3’ UTR): sequence that extends from the first base after the STOP
codon up to the poly-A tail
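The three regions listed above can be located in a sequence with a short sketch like the one below. It makes simplifying assumptions that are not part of the notes: the input has no cap or poly-A tail, the first AUG is treated as the START codon, and the CDS ends at the first in-frame STOP codon. The name `split_mrna` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA into (5' UTR, CDS, 3' UTR).

    Simplification: the first AUG is taken as START and the CDS ends
    at the first in-frame STOP codon (STOP included in the CDS).
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no START codon found")
    # Walk codon by codon from the START codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame STOP codon found")


utr5, cds, utr3 = split_mrna("GGACCAUGGCUUGGUAACUUAG")
print(utr5)  # GGACC
print(cds)   # AUGGCUUGGUAA
print(utr3)  # CUUAG
```

Note that the UAG inside the 3’ UTR is ignored because the scan stops at the first in-frame STOP codon (UAA).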
Non-coding genes:
- Ribosomal RNAs (rRNAs): rDNA for 28S, 5.8S and 18S is found on chromosomes
13, 14, 15, 21, 22; rDNA for 5S is found on chromosome 1; nucleolus assembles around rDNA; 85% of
RNA is rRNA
- Transfer RNAs (tRNAs): translator, able to bind a specific amino acid and recognize its
complementary mRNA codon; 10% of RNA
- Non-coding RNAs (ncRNAs): spliceosomal RNA (uRNA), small nuclear RNAs (snRNAs),
small nucleolar RNAs (snoRNAs), microRNAs (miRNAs), small interfering RNAs
(siRNAs); mRNA + ncRNA is 5% of all RNA
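The codon recognition performed by tRNAs, described in the list above, relies on base complementarity. The sketch below derives the anticodon that pairs with a given mRNA codon, with both written 5’→3’. It assumes strict Watson-Crick pairing and ignores wobble pairing at the third position; the function name `anticodon` is illustrative.

```python
# Watson-Crick complements in the RNA alphabet.
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def anticodon(codon):
    """Return the anticodon (5'->3') complementary to an mRNA codon (5'->3')."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly 3 bases")
    # Complement each base, then reverse because the strands are antiparallel.
    return codon.upper().translate(RNA_COMPLEMENT)[::-1]


print(anticodon("AUG"))  # CAU
```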