✔✔Compute the Skew dictionary for the following string: CATCATGCGC - ✔✔0, -1, -1, -1, -2, -2, -2, -1, -2, -1, -2
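
The skew values above can be reproduced with a short sketch. It assumes the usual definition, where entry i is the number of G's minus the number of C's in the first i nucleotides; the function name `skew_array` is illustrative, not taken from the course material.

```python
def skew_array(genome):
    """Return [Skew_0, ..., Skew_n], where Skew_i = #G - #C in genome[:i]."""
    skew = [0]  # the empty prefix has skew 0
    for nucleotide in genome:
        if nucleotide == "G":
            skew.append(skew[-1] + 1)
        elif nucleotide == "C":
            skew.append(skew[-1] - 1)
        else:
            # A and T leave the skew unchanged
            skew.append(skew[-1])
    return skew


print(skew_array("CATCATGCGC"))
# prints [0, -1, -1, -1, -2, -2, -2, -1, -2, -1, -2]
```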
✔✔Compute the Hamming distance between the following two strings:
CAGAAAGGAAGGTCCCCATAC
CACGCCGTATGCATAAACGAG - ✔✔15
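
A minimal sketch for checking this answer, assuming the Hamming distance is only defined for strings of equal length (the function name is illustrative):

```python
def hamming_distance(p, q):
    """Count the positions at which two equal-length strings differ."""
    if len(p) != len(q):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(p, q))


print(hamming_distance("CAGAAAGGAAGGTCCCCATAC", "CACGCCGTATGCATAAACGAG"))
# prints 15
```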
✔✔What nucleotide substitution occurs most often because of deamination in
single-stranded DNA? - ✔✔C to T
✔✔Define Count_d(Text, Pattern) as the total number of occurrences of
Pattern in Text with at most d mismatches.
Compute Count_1(CGTGACAGTGTATGGGCATCTTT, TGT). - ✔✔8
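
One way to check a count like this is to slide a window of length |Pattern| across Text and count the windows within Hamming distance d of Pattern. The sketch below assumes overlapping occurrences are counted; both function names are illustrative.

```python
def hamming_distance(p, q):
    """Count mismatching positions between two equal-length strings."""
    return sum(a != b for a, b in zip(p, q))


def approximate_pattern_count(text, pattern, d):
    """Count the k-mers of text that differ from pattern in at most d positions."""
    k = len(pattern)
    return sum(
        1
        for i in range(len(text) - k + 1)
        if hamming_distance(text[i:i + k], pattern) <= d
    )


print(approximate_pattern_count("CGTGACAGTGTATGGGCATCTTT", "TGT", 1))
```

Setting d to 0 reduces this to exact pattern counting.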
✔✔Given the following two functions:
f(n) = n^2 + n + 2
g(n) = 0.5n^2 + 1
In Big-O notation, what is the complexity of f(n) + g(n)? - ✔✔O(n^2)
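
A short worked bound, treating both functions as functions of n with quadratic leading terms, shows why the lower-order terms and constant factors drop out:

```latex
\begin{align*}
f(n) + g(n) &= (n^2 + n + 2) + (0.5\,n^2 + 1) \\
            &= 1.5\,n^2 + n + 3 \\
            &\le 1.5\,n^2 + n^2 + 3\,n^2 = 5.5\,n^2 \qquad \text{for all } n \ge 1,
\end{align*}
```

so the sum is bounded by a constant multiple of n^2 for n ≥ 1, which is the definition of O(n^2).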
✔✔What is the correct model of DNA replication? - ✔✔Semi-conservative model
✔✔(Fill in the blank) Proteins acting as "switches" to turn other genes on or off
are called _______________________. - ✔✔regulatory proteins
✔✔What is the largest possible value of Score(Motifs) for 3 motifs of length k? - ✔✔2k
✔✔Compute Score(Motifs) of the following list of k-mers Motifs.
AACGTA
CACGTT
CACCTT
GGATTT
TTCCGG - ✔✔12
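
The score can be checked column by column: each column contributes the number of symbols that differ from its most common symbol. This is a minimal sketch under that definition; the function name `score` is illustrative.

```python
from collections import Counter


def score(motifs):
    """Sum, over all columns, of the symbols that differ from the column's most common symbol."""
    total = 0
    for column in zip(*motifs):  # zip(*motifs) yields the columns of the motif matrix
        most_common_count = Counter(column).most_common(1)[0][1]
        total += len(column) - most_common_count
    return total


motifs = ["AACGTA", "CACGTT", "CACCTT", "GGATTT", "TTCCGG"]
print(score(motifs))
# prints 12
```

The same function illustrates the 2k upper bound for three motifs: `score(["AAA", "CCC", "GGG"])` is 6, since every column of length 3 can contribute at most 2.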
✔✔Fill in the profile matrix for A for the motifs below
AACGTA
CACGTT
CACCTT
GGATTT
TTCCGG - ✔✔0.2, 0.6, 0.2, 0.0, 0.0, 0.2
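
A sketch of how the profile rows can be computed, assuming plain frequencies with no pseudocounts (the function name `profile` is illustrative):

```python
def profile(motifs):
    """Return, for each nucleotide, its frequency in every column of motifs."""
    t = len(motifs)
    columns = list(zip(*motifs))
    return {n: [column.count(n) / t for column in columns] for n in "ACGT"}


motifs = ["AACGTA", "CACGTT", "CACCTT", "GGATTT", "TTCCGG"]
print(profile(motifs)["A"])
# prints [0.2, 0.6, 0.2, 0.0, 0.0, 0.2]
```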
✔✔What is the consensus string of the following profile matrix?
A: 0.4 0.2 0.0 0.1 0.0 0.9
C: 0.2 0.4 0.0 0.3 0.0 0.1
G: 0.1 0.3 1.0 0.1 0.4 0.0
T: 0.3 0.1 0.0 0.5 0.6 0.0 - ✔✔ACGTTA
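
The consensus string takes the most probable nucleotide in each column. A minimal sketch, assuming the profile is stored as a dict of rows and that ties (none occur here) are broken in A, C, G, T order:

```python
def consensus_from_profile(profile):
    """Pick the highest-probability nucleotide in each column of a profile matrix."""
    length = len(profile["A"])
    return "".join(
        max("ACGT", key=lambda n: profile[n][j])  # ties go to the earlier letter
        for j in range(length)
    )


profile = {
    "A": [0.4, 0.2, 0.0, 0.1, 0.0, 0.9],
    "C": [0.2, 0.4, 0.0, 0.3, 0.0, 0.1],
    "G": [0.1, 0.3, 1.0, 0.1, 0.4, 0.0],
    "T": [0.3, 0.1, 0.0, 0.5, 0.6, 0.0],
}
print(consensus_from_profile(profile))
# prints ACGTTA
```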
✔✔Describe the idea of a "Brute Force" algorithm approach - ✔✔Checks every possible
candidate solution; also known as exhaustive search
✔✔Describe the idea of a "Greedy" algorithm approach - ✔✔Makes the locally optimal
choice at each stage
✔✔For a given list of k-mers Motifs, define Score(Motifs) as the sum of
Hamming distances between Motifs[i] and Consensus(Motifs) for all i.
For a given list of strings Dna, let BestMotifs(k, Dna) denote the list of
k-mers Motifs (where Motifs[i] is a substring of Dna[i] for all i)
minimizing Score(Motifs).
True or False: Consensus(BestMotifs(k, Dna)) must appear as a substring
of at least one of the strings in Dna. - ✔✔False
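
A small counterexample makes the "False" concrete. In this sketch the three strings are hypothetical and each has length exactly k = 3, so the only possible choice of motifs (and hence BestMotifs) is the strings themselves:

```python
from collections import Counter


def consensus(motifs):
    """Most common symbol in each column of the motif matrix."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*motifs))


# Hypothetical input: each string has length k, so the motifs are forced.
dna = ["AAT", "ACA", "GAA"]
cons = consensus(dna)
appears = any(cons in text for text in dna)

print(cons, appears)
# prints AAA False
```

Every column has A as its majority symbol, yet no single string is AAA.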
✔✔Compute Pr(AAGTTC|Profile) of the following profile matrix Profile:
A: 0.4 0.3 0.0 0.1 0.0 0.9
C: 0.2 0.3 0.0 0.4 0.0 0.1
G: 0.1 0.3 1.0 0.1 0.5 0.0
T: 0.3 0.1 0.0 0.4 0.5 0.0 - ✔✔0.0024
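
The probability is the product of the profile entries selected by each nucleotide of the k-mer: 0.4 × 0.3 × 1.0 × 0.4 × 0.5 × 0.1. A minimal sketch, assuming the profile is stored as a dict of rows (the function name is illustrative):

```python
def kmer_probability(kmer, profile):
    """Multiply the profile entry for each nucleotide at its column position."""
    probability = 1.0
    for j, nucleotide in enumerate(kmer):
        probability *= profile[nucleotide][j]
    return probability


profile = {
    "A": [0.4, 0.3, 0.0, 0.1, 0.0, 0.9],
    "C": [0.2, 0.3, 0.0, 0.4, 0.0, 0.1],
    "G": [0.1, 0.3, 1.0, 0.1, 0.5, 0.0],
    "T": [0.3, 0.1, 0.0, 0.4, 0.5, 0.0],
}
# Rounded for display; the exact product is 0.0024 up to floating-point error.
print(round(kmer_probability("AAGTTC", profile), 6))
```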
✔✔Finding Origin of Replication Problem:
Input: A DNA string Genome.
Output: The location of oriC in Genome.
STOP and Think: Does this biological problem represent a clearly stated computational
problem? - ✔✔False
✔✔Hidden Message Problem: Find a "hidden message" in the replication origin.
Input: A string Text.
Output: A hidden message in Text.
STOP and Think: Does the Hidden Message Problem represent a clearly stated
computational problem? - ✔✔False
✔✔STOP and Think: Can a string have multiple most frequent k-mers? - ✔✔True
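
A quick way to see this is to count every k-mer and keep all those that reach the maximum count. The sketch below uses a hypothetical example string; the function name is illustrative.

```python
from collections import Counter


def most_frequent_kmers(text, k):
    """Return every k-mer that attains the maximum count in text, sorted alphabetically."""
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:  # text is shorter than k
        return []
    best = max(counts.values())
    return sorted(kmer for kmer, count in counts.items() if count == best)


print(most_frequent_kmers("ACGTACGT", 2))
# prints ['AC', 'CG', 'GT']
```

Here AC, CG, and GT each occur twice, so all three are most frequent 2-mers.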
✔✔STOP and Think: Do any of the counts in 1.3.14 seem surprisingly large? - ✔✔True