SUMMARY Bioinformatics and Systems Biology: Sequence, Structure and Evolution (ALL THEORY + WORKED EXERCISES!)

Uploaded on: 18-05-2025

Written in: 2024/2025

This summary clearly presents all the theory covered in class, along with the written-out steps for each exercise at the end.

The structure of proteins

What’s on the menu today

1. Why are we interested in protein structure?
2. The four levels of protein structure
• Protein sequence
• Protein secondary structure
• Protein tertiary structure
• Protein complexes and disorder
3. Forces driving protein structure and the protein folding problem
4. Protein structure determination and prediction methods
5. The applications and limitations of structure models
Why proteins

Proteins as machines of the cell

• Proteins are present everywhere in living organisms: When you look at an organism and zoom in further and further, you eventually see cells, and within those cells are many complex structures, many of which are proteins.
• Proteins are essential machinery of the cell: They carry out most of the functions that make life possible.
• DNA plays a central role, but proteins do most of the work: DNA contains the instructions, but proteins execute the functions.
• Many proteins work together in complex structures: Some proteins form larger machines by interacting with other proteins, and this collaboration determines their biological function.
• Proteins have highly diverse shapes (structures): Their structure is often closely linked to their specific function in the cell.
• Examples of protein functions include:
• Proteins involved in translation (protein synthesis), such as ribosomal proteins.
• Proteins that help other proteins fold (chaperones).
• Proteins that break down old or damaged proteins (e.g., proteasomes).
• Structural proteins, like tubulin (the building block of microtubules) and actin, that support cell structure.
• Antibodies involved in the immune system.
• Transport proteins that move substances across membranes.
• Proteins are incredibly diverse: They are found throughout the cell, have a wide variety of shapes and functions, and that variety is strongly tied to
their structure.
• Proteins are encoded by genes in DNA.
Machinery diversity
– Protein structures are highly diverse: Their 3D shape can vary greatly and this shape is usually linked directly to their
function.
– Structure determines function: The way a protein folds and its final form often defines what it can do in the cell.
– Proteins are involved in many cellular processes.
• Some interact directly with DNA, e.g. transcription factors.
• Others assist with translation, such as ribosomal proteins.
• Some help other proteins fold properly (chaperones).
• There are proteins that degrade misfolded or damaged proteins (e.g. proteasomes).
• Structural proteins, like tubulin (the building block of microtubules) and actin, provide support to the cell.
• Enzymes catalyze chemical reactions.
• Antibodies are proteins involved in the immune system.
• Transport proteins move substances across membranes.
• Even some toxins are proteins.
– Proteins often work together as machines: Many cellular processes rely on complexes of proteins that function in coordination.
– Proteins are encoded by genes: The DNA sequence determines the amino acid sequence, which in turn determines structure and function.
Central Dogma of Molecular Biology
The information to produce proteins is encoded in the DNA
• DNA replication: the copying of DNA to DNA, an essential step for passing on genetic information
• DNA transcription: the information contained in a section of DNA is copied into a newly assembled piece of mRNA
• mRNA translation: the information in mRNA is read by the ribosome, which produces a linear polypeptide chain

• Proteins are incredibly diverse: They exist everywhere in the cell and carry out a huge variety of essential functions.
• Structure relates to function: A protein’s 3D shape is typically closely related to what it does in the cell.
• Proteins are encoded by genes: Genes (stretches of DNA) contain the instructions to make proteins.
• The Central Dogma of Molecular Biology: This explains the flow of genetic information:
1. DNA replication – DNA copies itself.
2. Transcription – A gene (DNA) is transcribed into messenger RNA (mRNA).
3. Translation – The mRNA is read by a ribosome, which builds the corresponding protein.
• Sometimes information flows in reverse: In some viruses (like retroviruses), RNA can be reverse-transcribed back into DNA → this is an exception
to the usual flow.
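
The transcription step described above can be sketched in a few lines of Python. This is only an illustration of the information flow, not a model of the real enzymatic process. The function names `transcribe` and `reverse_complement` are chosen here for illustration, and the input is assumed to be the coding (sense) strand written 5'→3'.

```python
# Watson-Crick partner of each DNA base, used to derive the template strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(dna: str) -> str:
    """Return the opposite (template) strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))

def transcribe(coding_strand: str) -> str:
    """The mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")

gene = "ATGGCCTAA"                 # hypothetical toy gene, coding strand
print(transcribe(gene))            # mRNA sequence
print(reverse_complement(gene))    # template strand that the polymerase reads
```

Translation of the resulting mRNA additionally needs the codon table, which is covered in the genetic code section.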

Basics of protein 3D structure

From linear chains to 3D structures

Protein Folding and Structure
• Proteins are linear polymers of amino acids.
o They are made up of a sequence (chain) of amino acids.
o This sequence is called the primary structure.
o The protein is synthesized by the ribosome, which reads mRNA and builds the chain like a strand of spaghetti.
• Folding begins as the protein exits the ribosome.
o Local interactions (like hydrogen bonds) begin to form between atoms in the chain.
o These lead to recognizable local patterns, known as secondary structures:
▪ Alpha helices
▪ Beta sheets
• As more of the protein is synthesized, secondary structures start interacting.
o They fold and collapse into a more complex, compact shape — the tertiary structure.
o This is the overall 3D structure of a single protein chain.
• Some proteins interact with other proteins to form larger complexes.
o When multiple protein subunits (folded chains) assemble into one functional unit, we call this the quaternary structure.
o These complexes can function as molecular machines in the cell.
Amino acids – the building blocks of proteins
Proteins are linear polymers of amino acids
• 20 (+2 special) amino acids are commonly found in proteins
• All proteinogenic amino acids are α-amino acids
• They have a carboxyl group and an amino group bonded to the same carbon atom (the α carbon, Cα)
• They differ from each other in their side chains, or R groups
The R group (side chain) defines each amino acid’s chemical behavior
• It can be:
• Nonpolar (hydrophobic)
• Polar (hydrophilic)
• Charged (acidic or basic)
• These properties influence how proteins fold and function
• The α carbon is a chiral center (except in glycine, whose side chain is a single H), so amino acids have two stereoisomers/enantiomers – L and D
• Natural proteins are made of L-amino acids
Amino acids exist as stereoisomers (L and D forms)
• The alpha carbon is chiral (has four different groups attached)
• This creates two mirror-image forms (enantiomers):
• L-amino acids ("left-handed")
• D-amino acids ("right-handed")
• Nature almost exclusively uses L-amino acids in proteins
• Why this is the case is still an open question in biochemistry and the origins of life

• Amino acids are small organic molecules
• They are the monomers (building blocks) that link together to form proteins.
• Each amino acid has a common structure:
• A central alpha carbon (Cα)
• An amino group (-NH₂)
• A carboxyl group (-COOH) → the "acid" part
• A hydrogen atom
• A variable side chain (R group) → this determines the unique properties of each amino acid
• There are 20 standard amino acids used in natural proteins
• Plus two special cases (selenocysteine and pyrrolysine), used in very specific situations
• While in theory there could be infinite amino acids, life on Earth mostly uses this fixed set
• Fun fact on language and memorization:
• The letters come from Latin: L = laevus (“left”) and D = dexter (“right”); compare dextra / direita (Portuguese) for “right” in the Romance languages
• You can use this to help remember L (left-handed) vs D (right-handed) amino acids!

There are 20 standard amino acids encoded by the genetic code
• These are incorporated into proteins during translation via tRNA and ribosomes.

• There are also 2 'special' amino acids:
• Selenocysteine (Sec) – similar to cysteine, but contains a selenium atom instead of sulfur.
▪ Has no dedicated codon: it is inserted at a UGA codon (normally a stop codon) that is recoded by a special mechanism.
• Pyrrolysine (Pyl) – a lysine derivative, found in some archaea and bacteria.
▪ Also inserted by a special mechanism, at a UAG codon (normally a stop codon).

• Amino acids are classified by the properties of their side chain (R group):
• Nonpolar (hydrophobic, uncharged):
▪ Tend to avoid water, usually found buried inside proteins.
▪ Examples: alanine, valine, leucine, isoleucine, methionine, phenylalanine, tryptophan, proline, glycine.
• Polar, uncharged:
▪ Often contain oxygen or nitrogen → can form hydrogen bonds.
▪ Examples: serine, threonine, asparagine, glutamine, cysteine, tyrosine.
• Charged (at physiological pH):
o Positively charged (basic): lysine, arginine, histidine
o Negatively charged (acidic): aspartate, glutamate

• Cysteine is special
• Contains a sulfur group (-SH) that can form disulfide bonds with another cysteine. → These bonds are important for protein folding and
stability.
• Its sulfur can be replaced by selenium → forms selenocysteine.

• Post-translational modifications can alter amino acids.
• Occur after translation to fine-tune a protein’s function or structure.
• Examples: methylation, phosphorylation, or the addition of larger groups (e.g. glycosylation).

• Not all amino acids are directly encoded in DNA.
• Special cases like selenocysteine have no dedicated codon, but are inserted by non-standard translation mechanisms.
The genetic code (from DNA to protein)
The 20 common amino acids are encoded by our genetic code:
DNA is composed of 4 bases (nucleotides):
• Adenine (A), Thymine (T), Cytosine (C), Guanine (G)
• In RNA, Uracil (U) replaces Thymine (T)
Proteins are made of 20 natural amino acids.
• Each amino acid must be encoded in DNA/RNA in some way.
The genetic code uses “words” of 3 bases—called codons—to encode one amino acid.
• This is made possible through the translation process in ribosomes.
• 1 codon = 3 nucleotides = 1 amino acid
Why 3 nucleotides per amino acid?
• 1 base = 4 combinations → 4¹ = 4 (too few)
• 2 bases = 16 combinations → 4² = 16 (still not enough)
• 3 bases = 64 combinations → 4³ = 64 (more than enough for 20 amino acids)
Because 64 codons code for only 20 amino acids:
• The genetic code is redundant (degenerate): some amino acids are coded by multiple codons
▪ Example: Proline is coded by CCU, CCC, CCA, CCG
▪ Some like Tryptophan are coded by a single codon: UGG
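
The counting argument and the redundancy of the code can be checked with a short script. This is a sketch that assumes the standard genetic code, written in the usual compact form (bases in the order U, C, A, G, with `*` marking a stop codon). The variable names are chosen here for illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; one letter per codon, codons in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of possible "words" for word lengths 1, 2 and 3.
for length in (1, 2, 3):
    print(length, "base(s):", 4 ** length, "combinations")

# Degeneracy: how many codons encode each amino acid.
codons_per_aa = Counter(CODON_TABLE.values())
print("Proline codons:", sorted(c for c, aa in CODON_TABLE.items() if aa == "P"))
print("Tryptophan codons:", [c for c, aa in CODON_TABLE.items() if aa == "W"])
```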

• The set of rules used to translate information encoded within genetic material into proteins
• Each amino acid is encoded by a sequence of three nucleotides – the codons
• There are 64 codons (4³ combinations of nucleotides)
• Most encode amino acids
• A few others encode special signals, such as the start and end of the protein
The genetic code is redundant
!! Some amino acids are encoded by multiple codons

Start and stop codons:
• AUG codes for methionine and also serves as the start codon for translation.
• There are 3 stop codons: UAA, UAG, UGA, which signal the end of translation.
Consequences of this redundancy:
• Some amino acids occur more frequently in proteins.
• Codon bias: different organisms may prefer certain codons over others for the same amino acid.
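
Start and stop codons can be illustrated with a minimal translation routine. The sketch assumes the standard genetic code and a simplified rule: translation starts at the first AUG found in the string and stops at the first in-frame stop codon. Real initiation is more involved. The function name `translate` is chosen here for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in compact form; "*" marks the stop codons UAA, UAG, UGA.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical toy mRNA: GG | AUG CCU UGG UAA | GC
print(translate("GGAUGCCUUGGUAAGC"))
```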
From linear chains to 3D structures
1. Amino acids are linked together by ribosomes:
o Ribosomes read the mRNA codon by codon (3 bases at a time).
o Each codon codes for a specific amino acid.
o The correct amino acid is delivered by tRNA.
2. Formation of the peptide chain:
o The ribosome connects amino acids like beads on a string.
o The chemical reaction is a condensation reaction:
▪ The amino group (–NH₂) of the new amino acid attacks the carboxyl group (–COOH) of the previous one.
▪ A water molecule (H₂O) is released → hence "condensation."
3. The bond between amino acids = peptide bond:
o A covalent bond between the carbon atom of the carboxyl group and the nitrogen atom of the amino group.
o This creates a strong, stable chain: the primary structure of the protein.
4. Peptide bonds have unique properties:
o They show resonance (electron delocalization between single and double bonds).
o This makes the bond planar: the six atoms of the peptide group (Cα, C, O, N, H and the next Cα) lie in the same plane.
o Limited rotation, which influences how the protein can fold.
5. Cis and trans configuration:
o Refers to how the two neighbouring Cα atoms (and with them the R-groups) are positioned relative to the peptide bond:
▪ Trans: on opposite sides, so the R-groups point away from each other → preferred (more stable).
▪ Cis: on the same side, so the R-groups point toward each other → rare (more steric hindrance).
o Small side chains (e.g., glycine, alanine) cause less steric hindrance in the cis configuration; in practice, cis peptide bonds occur mostly before proline.
o Large side chains (e.g., tryptophan) make the cis configuration unfavorable.
6. N- and C-terminus:
o The chain starts at the N-terminus (free amino group) and ends at the C-terminus (free carboxyl group).
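
The bookkeeping of the condensation reaction can be sketched as follows: a chain of n residues contains n − 1 peptide bonds, and each bond formed releases one water molecule. The masses below are approximate average molar masses of the free amino acids for a small illustrative subset only, and the function names are chosen here for illustration.

```python
WATER_MASS = 18.015  # g/mol, approximate

# Approximate average molar masses of free amino acids (g/mol); illustrative subset.
FREE_AA_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}

def peptide_bonds(n_residues: int) -> int:
    """A linear chain of n residues is held together by n - 1 peptide bonds."""
    return max(n_residues - 1, 0)

def peptide_mass(sequence: str) -> float:
    """Sum the free amino acid masses and subtract one water per peptide bond."""
    total = sum(FREE_AA_MASS[aa] for aa in sequence)
    return total - peptide_bonds(len(sequence)) * WATER_MASS

print(peptide_bonds(3))               # bonds in a tripeptide
print(round(peptide_mass("GA"), 2))   # dipeptide Gly-Ala, written N- to C-terminus
```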

The protein sequence
The order by which each amino acid is connected is encoded in the genetic material
• The linear sequence of non-overlapping codons defines the sequence by which each amino acid is linked
• The single-letter code of each amino acid can be used to write the protein sequence

• Each individual protein has a unique amino acid sequence
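
Writing a sequence in the single-letter code amounts to a simple lookup. The sketch below converts a three-letter sequence, written from N- to C-terminus, into one-letter codes. The table holds the standard codes (plus U and O for selenocysteine and pyrrolysine), and the function name `to_one_letter` is chosen here for illustration.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Sec": "U", "Pyl": "O",  # the two special amino acids
}

def to_one_letter(three_letter_sequence: str, separator: str = "-") -> str:
    """Convert e.g. 'Met-Pro-Trp' (N- to C-terminus) into 'MPW'."""
    residues = three_letter_sequence.split(separator)
    return "".join(THREE_TO_ONE[residue.capitalize()] for residue in residues)

print(to_one_letter("Met-Pro-Trp"))
```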
The polypeptide chain
During protein synthesis, amino acids are joined end-to-end by the formation of peptide bonds to form the polypeptide chain
The peptide bond
Has a partial double-bonded character due to resonance
• Rotation around the peptide bond is restricted
• The six atoms involved tend to be co-planar, which makes the peptide bond planar
• Two planar configurations are possible:
• Cis: unfavored energetically due to repulsion between side chain atoms
• Trans: favored energetically due to less repulsion between side chain atoms
• Nitrogen as an electron donor: The amide nitrogen has a lone pair of electrons that it can donate toward the neighboring carbonyl carbon.
• Electron movement in resonance: The lone pair is delocalized over the O, C and N atoms of the peptide group, which stabilizes it.
• Bonding: Nitrogen can form both single and double bonds, and its lone pair affects the bond character.
• Effect on bonding: In resonance, the C-N bond behaves as a blend of a single and a double bond rather than one or the other, which is why rotation around it is restricted.
From linear chains to 3D structures
What is secondary structure?
• Local interactions between amino acids in a single polypeptide chain.
• Formed through hydrogen bonds between atoms of the backbone, not the side chains (R-groups).
• The two most common structures:
1. α-helix (spiral shape)
2. β-sheet (flat, folded structures)
The backbone degrees of freedom
Since the peptide bond is mostly rigid, there are basically 2 degrees of freedom:
Each peptide unit can rotate around two dihedral angles
• Phi angle (Φ): angle of rotation around the N-Cα bond
• Psi angle (Ψ): angle of rotation around the Cα-C’ bond
Other bonds and angles are more restricted due to unfavorable energetics
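
A dihedral angle such as Φ or Ψ is defined by four consecutive atoms. For residue i, Φ uses C’(i−1), N(i), Cα(i), C’(i), and Ψ uses N(i), Cα(i), C’(i), N(i+1). The sketch below computes such an angle from four 3D coordinates with plain Python. The coordinates in the example are made-up points, not real atom positions, and the function name `dihedral` is chosen here for illustration.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)              # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Made-up planar arrangements: outer atoms on the same side (cis) or opposite sides (trans).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))    # cis-like
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))   # trans-like
```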
The Ramachandran plot
Each amino acid residue in a protein chain is associated with two conformational angles, Φ and Ψ
• The angle pairs for a whole protein are usually plotted against each other in a diagram = the Ramachandran plot
• Most combinations of Φ and Ψ are not allowed due to steric collisions between backbone and side-chain atoms
• Regions in the diagram are named after the local structural motifs that the repetition of those specific angle combinations causes

Ramachandran Plot = Shows allowed combinations of Φ (phi) and Ψ (psi) angles in amino acids.

• Not all combinations are allowed:
o Some cause steric clashes (atomic collisions).
o Others are energetically favorable and common in real proteins.
• The plot is asymmetric due to the use of L-amino acids:
o Only certain regions are preferred.
o Right-handed regions are typically favored (due to chirality).
• Dark blue areas represent the most energetically favorable combinations:
o One region corresponds to α-helices.
o Another to β-strands.
• Loops and turns occur in less populated but allowed regions.
• α-helices and β-sheets dominate because their angles:
o Are energetically stable.
o Allow optimal hydrogen bonding.
o Cause minimal steric hindrance.
• Ramachandran angles are computed around Cα atoms: In theory, continuous rotations are possible, but proteins favor specific regions.
• L-amino acids limit certain rotations due to their chirality.
• Left-handed regions exist but are less populated.
• Dark blue regions correlate with stable secondary structures.
• Planar regions reflect flatter structural conformations.
• Disallowed regions involve steric clashes and are avoided.
• In-between regions are allowed but less energetically optimal: Often found in flexible parts like loops.
• Different areas of the plot correspond to different secondary structures:
o Favored regions lead to stable α-helices and β-sheets.
o Less favored regions are associated with loops and turns.
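
The link between regions of the plot and secondary structure can be sketched as a rough classifier. The rectangular boundaries below are illustrative assumptions built around commonly quoted typical values (about Φ = −60°, Ψ = −45° for the right-handed α-helix and about Φ = −120°, Ψ = +130° for β-strands). Real Ramachandran regions have irregular shapes, so this is not a validated definition. The function name `ramachandran_region` is chosen here for illustration.

```python
def ramachandran_region(phi: float, psi: float) -> str:
    """Assign a (phi, psi) pair in degrees to a coarse, illustrative region."""
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "alpha-helix region"
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "beta-strand region"
    if 20 <= phi <= 120 and -30 <= psi <= 100:
        return "left-handed region"
    return "other / rarely populated"

for angles in [(-60, -45), (-120, 130), (60, 45), (60, -150)]:
    print(angles, "->", ramachandran_region(*angles))
```

Glycine would populate the "left-handed" and "other" boxes far more often than residues with larger side chains, which is one reason amino acid-specific plots are useful.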
Amino acid-specific Ramachandran plots
As different sidechains have different properties, they have different preferred allowed areas
With only an H as sidechain, glycine can adopt a much larger range of conformations
Sequence defines structure!

Proline and Rotation
• Proline’s Rotation Restrictions:
o Proline prefers right-handed combinations, not left-handed.
o Its side chain forms a ring with the backbone nitrogen, restricting rotation around the N-Cα bond (the Φ angle).
Side Chain Influence on Backbone Flexibility
• Each amino acid’s side chain (R-group):
o Affects the Ramachandran angles (Φ and Ψ), dictating which angles are energetically favorable.
o Influences the local structure the amino acid prefers.

Written for

Institution: Katholieke Universiteit Leuven (KU Leuven)
Study: Biomedical Sciences
Course: Bioinformatics and Systems Biology (E02N3A)

Document information

Uploaded on: 18 May 2025
Number of pages: 87
Written in: 2024/2025
Type: SUMMARY

Topics

alphamissense
uniprot
alpha helix
beta sheet
coils
hydrophobic
hydrophilic
hydrogen bonds

Seller: Lorejansens123
