Summary Omics (NWI-MOL410)
2024/2025 Q2
Content
Week 1 - Introduction
Lecture: Introduction Pruijn
Lecture: Introduction Jansen
Week 2 - Transcriptomics
Week 3 - Proteomics, Interactomics, PTMs
Week 4 - Metabolomics
Week 5 - Univariate Analysis, Clustering
Lecture univariate analysis
Lecture clustering
Week 6 - Multivariate Analysis
Lecture Unsupervised multivariate analysis
Lecture supervised multivariate analysis
Week 7 - Guest lecture and IPOP
Week 1 - Introduction
Lecture: Introduction Pruijn
Omics = collective characterization and quantification of pools of biological molecules that translate
into the structure, function, and dynamics of cells, tissue(s), an organism or organisms
Omics = comprehensive analysis of biological systems at the molecular level
Systems biology = biology that focuses on complex systems of life
History of omics
Modern uses of the term ‘omics’ derive from the term genome (hence genomics), a term
invented by Hans Winkler in 1920, although the use of –ome is older, signifying the
‘collectivity’ of a set of things.
The first genome was completely sequenced by Sanger in Cambridge, UK, in the 1970s
(bacteriophage φX174; 5,386 bp).
The word genomics appeared in the 1980s and became widely used in the 1990s.
The genome is the most fundamental part of many omics.
Genomics may be described as the comprehensive analysis of DNA structure and function.
Understanding biological diversity at the whole genome level yields insight into the origins of
individual traits and disease susceptibility.
The word ‘omics’ refers to a field of study in biology ending in the suffix –omics, such as genomics,
proteomics, or metabolomics. The related ‘ome’ addresses the objects of study of such fields, such as
the genome, proteome, or metabolome, respectively. The term omics is derived from the Latin suffix
‘-ome’ meaning mass or many. Thus, omics involve a mass (large number) of measurements per
endpoint.
Classical approach: Analysis of one, or a few, molecules
Omics approach: Analysis of many molecules simultaneously
Types of omics
Genomics – genome (DNA)
Transcriptomics – transcriptome (RNA)
Proteomics – proteome (protein)
Metabolomics – metabolome (metabolites)
Interactomics – interactome
Antibodyomics – antibodyome
Epigenomics – epigenome
Lipidomics – lipidome
Etc.
Genomics implies some hidden network among genetic elements. This network is regulated by many
other omics such as transcriptomics, proteomics, metabolomics and interactomics.
Genomic variation; genotyping
Although humans are very similar to one another at the genetic level, differences occur at a
frequency of about 1 in every 1000 nucleotide bases. This translates into
approximately 3 million base differences between any two individuals. Such changes are referred to as
single nucleotide polymorphisms (SNPs). A polymorphism is distinct from a mutation. A mutation is
considered rare, affecting less than one percent of the species, whereas a polymorphism is relatively
common and its prevalence falls within what is considered normal.
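The arithmetic above can be checked with a short Python sketch. The genome size of roughly 3 billion base pairs is an assumption (the notes give only the 1-in-1000 rate and the resulting 3 million differences), and the function names are hypothetical helpers for illustration only.

```python
# Assumption (not stated in the notes): a human genome of roughly 3 billion base pairs.
GENOME_SIZE_BP = 3_000_000_000
# From the notes: about 1 difference in every 1000 nucleotide bases.
BASES_PER_DIFFERENCE = 1000


def expected_differences(genome_size_bp: int = GENOME_SIZE_BP,
                         bases_per_difference: int = BASES_PER_DIFFERENCE) -> int:
    """Estimate the number of base differences between two individuals."""
    return genome_size_bp // bases_per_difference


def classify_variant(population_frequency: float, threshold: float = 0.01) -> str:
    """Label a variant using the one-percent rule from the notes.

    Variants found in less than 1% of the species are called mutations;
    more common variants are called polymorphisms.
    """
    if not 0.0 <= population_frequency <= 1.0:
        raise ValueError("population_frequency must be between 0 and 1")
    return "polymorphism" if population_frequency >= threshold else "mutation"


if __name__ == "__main__":
    print(expected_differences())      # estimated differences between two people
    print(classify_variant(0.05))      # variant carried by 5% of the population
    print(classify_variant(0.001))     # variant carried by 0.1% of the population
```

With the assumed genome size, the estimate reproduces the figure of about 3 million differences quoted in the notes.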
Applications of genome analysis
Diagnose a disease, Confirm a diagnosis, Confirm the existence of a disease in individuals, Predict the
risk of future disease in healthy individuals
• Carrier screening, or the identification of unaffected individuals who carry one copy of a
gene for a disease that requires two copies for the disease to manifest
• Prenatal diagnostic screening
• Newborn screening
• Presymptomatic testing for predicting adult-onset disorders
• Presymptomatic testing for estimating the risk of developing adult-onset cancers
• Confirmatory diagnosis of symptomatic individuals
• Forensic/identity testing
Epigenomics
Study of epigenetic processes on a large (ultimately genome-wide) scale, e.g. to assess the effect on
disease.
Epigenetic processes: mechanisms other than changes in DNA sequence that affect gene
transcription and gene silencing. Main epigenetic mechanisms: DNA methylation, histone
modification and RNA-mediated mechanisms (RNAi).
Hypermethylation of CpG islands located in promoter regions of genes is related to gene silencing.
Histone proteins are involved in the structural packaging of DNA in chromatin. Post-translational
histone modifications such as acetylation and methylation regulate chromatin structure and therefore
gene expression.
Transcriptomics
The transcriptome is the set of all mRNA molecules, or ‘transcripts’, produced in one or a population
of cells. The term can be applied to the total set of transcripts in a given organism, or to the specific
subset of transcripts present in a particular cell type. The transcriptome, in contrast to the genome, is
highly variable over time, between cell types and as a result of environmental changes.
The study of transcriptomics, also referred to as Expression Profiling, examines the expression level of
mRNAs in a given cell population, often using high-throughput techniques based on DNA microarray
technology and next-generation sequencing (NGS). The use of NGS technology to study the
transcriptome at the nucleotide level is known as RNA-Seq.
Proteomics
The proteome is dynamic, defined as the set of proteins expressed in a specific cell, given a particular
set of conditions. Within a given human proteome, the number of proteins can be as large as 2
million.
The proteome is highly variable over time and between cell types, and will change in response to
environmental changes.
Although all proteins are translated from mRNA (the transcriptome), translational silencing, the
regulation of protein degradation, post-translational modifications and environmental interactions
make it difficult to predict the proteome from transcriptome analysis alone.
Interactomics
Interactomics comprises the study of interactions and their consequences between various proteins
as well as other cellular components. The network of all such interactions, known as the
‘interactome’, aims to provide a better understanding of genome and proteome functions and can
give valuable insights and information about biological functions in cells.
Metabolomics
Metabolomics is the systematic study of the unique chemical fingerprints that specific cellular
processes leave behind – specifically, the study of their small-molecule metabolite profiles.
The metabolome refers to the complete set of small-molecule metabolites (such as metabolic
intermediates, hormones and other signaling molecules, and secondary metabolites) to be found
within a biological sample, such as a single organism.
Metabolites are the intermediates and products of metabolism. The term metabolite is usually
restricted to small molecules.
The broad idea behind omics: The functional state of a cell can be explained by the (integrated set of
different) omics data, called the molecular signature.
The same idea can be exploited to find the difference between diseased and normal states. For
the diagnosis of diseases in the future, personal omics profiling (POP) is indispensable. POP
also makes it possible to develop personalized drugs.
Omics technologies
DNA and protein microarrays
2-dimensional gel electrophoresis
Mass spectrometry
Next-generation sequencing
NMR
Bioinformatics
Analysis of large datasets
Technology-based omics are based on technologies developed for understanding the “central
dogma” and can be further divided into three groups: the “four big omics” (genomics,
transcriptomics, proteomics, and metabolomics), epiomics (epigenomics, epitranscriptomics, and
epiproteomics), and their interactomics (DNA-RNA interactomics, RNA-RNA interactomics,
DNA-protein interactomics, RNA-protein interactomics, protein-protein interactomics, and
protein-metabolite interactomics).
[Figure: for each interactomic term, the omics in the horizontal (above) and vertical (right-hand
side) pink boxes are its two interacting omics.]
Knowledge-based omics are developed to understand a particular knowledge domain in a
systematic way by integrating information from multiple omics.
Pipeline omics experiment: cell/tissue/organism → biological sample → pretreatment: isolation of
molecules of interest (or labeling) → measurements + data acquisition → data preprocessing → data
analysis → interpretation (→ integration with other data sets)