Questions with 100% Verified Correct Answers
Collobert and Weston Vector Idea - CORRECT ANSWER: a word together with its context
is a positive training sample; replacing that word with a random word in the same
context gives a negative training sample
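A minimal sketch of how such positive/negative pairs could be built. It assumes the corrupted word is the middle word of a fixed-size window; the function name `make_training_pair`, the toy sentence, and the extra vocabulary words are hypothetical and only for illustration.

```python
import random

def make_training_pair(tokens, center, window, vocab, rng):
    """Return (positive, negative) windows around position `center`.

    The positive sample is the observed window; the negative sample is the
    same window with the center word replaced by a random vocabulary word.
    """
    lo, hi = center - window, center + window + 1
    if lo < 0 or hi > len(tokens):
        raise ValueError("window does not fit inside the sentence")
    positive = list(tokens[lo:hi])
    # Draw a replacement that differs from the true center word.
    candidates = [w for w in vocab if w != tokens[center]]
    negative = list(positive)
    negative[window] = rng.choice(candidates)
    return positive, negative

tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = sorted(set(tokens) | {"dog", "ran"})  # toy vocabulary (assumption)
pos, neg = make_training_pair(tokens, center=2, window=1,
                              vocab=vocab, rng=random.Random(0))
```

A model trained on these pairs would be asked to score `pos` higher than `neg`.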
Conditional language models and how to train them (teacher/student forcing), language
metrics (how to calculate them) - CORRECT ANSWER:
Debiasing word2vec - CORRECT ANSWER: - identify the gender subspace using gendered
words
- project all word vectors onto this subspace
- subtract those projections from the original word vectors
Problem: Not very effective, because bias pervades the whole word embedding space
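The three steps above can be sketched as follows. This is an illustration under simplifying assumptions: the gender subspace is reduced to a single direction estimated from one gendered pair, and the 3-d embedding values are made up.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def remove_direction(vec, direction):
    """Subtract the projection of `vec` onto the unit vector `direction`."""
    scale = dot(vec, direction)
    return [a - scale * d for a, d in zip(vec, direction)]

# Toy 3-d embeddings (made-up values, for illustration only).
emb = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "doctor": [0.4, 0.5, 0.7],
}

# Step 1: estimate a one-dimensional gender "subspace" from a gendered pair.
gender = unit([a - b for a, b in zip(emb["he"], emb["she"])])

# Steps 2-3: project onto that direction and subtract the projection.
debiased = remove_direction(emb["doctor"], gender)
```

After this step `debiased` has no component along `gender`, but any bias encoded in the remaining dimensions is untouched, which matches the problem noted above.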
Embedding - CORRECT ANSWER: A learned map from entities to vectors that encodes
similarity
Evaluating Word Embeddings Extrinsic - CORRECT ANSWER: - Evaluation on real task
- Can take a long time to compute
- Unclear if the subsystem is the problem or its interaction with other subsystems
- If replacing exactly one subsystem with another improves accuracy, the new subsystem wins
Evaluating Word Embeddings Intrinsic - CORRECT ANSWER: - Evaluation on a
specific/intermediate subtask
- Fast to compute
- Helps to understand the system
- Not clear if really helpful unless correlation to real task is established
Example: Evaluate word vectors by how well their cosine distance after addition
captures intuitive semantic and syntactic analogy questions
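A small sketch of the analogy test described in the example above: answer "a is to b as c is to ?" with the word whose vector has the highest cosine similarity to b - a + c. The embedding values and the single test question are made up for illustration, and `solve_analogy` is a hypothetical helper name.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' via the nearest neighbor of b - a + c."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    # The three question words are excluded from the candidate answers.
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Toy embeddings (made-up values, for illustration only).
emb = {
    "man": [1.0, 0.0, 0.1],
    "woman": [1.0, 1.0, 0.1],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.1, 0.0],
}
questions = [("man", "woman", "king", "queen")]
correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in questions)
accuracy = correct / len(questions)
```

The intrinsic score is then simply `accuracy` over a list of analogy questions.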
Graph Embedding - CORRECT ANSWER: Optimize the objective that connected nodes
have more similar embeddings than unconnected nodes.
Task: convert nodes to vectors
- effectively unsupervised learning where nearest neighbors are similar
- these learned vectors are useful for downstream tasks
Graph Embedding is Slow: Reason and Solution - CORRECT ANSWER: - Training time
dominated by computing scores for "fake edges"
- Corrupt a sub-batch of edges with the same set of random nodes
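A rough sketch of the sub-batch idea, assuming destinations are the corrupted side. The function name `corrupt_sub_batches` and all sizes are hypothetical. For brevity the sketch does not filter out a sampled node that happens to form a real edge.

```python
import random

def corrupt_sub_batches(edges, num_nodes, sub_batch_size, num_negatives, rng):
    """Yield (sub_batch, shared_negatives) pairs.

    Every edge in a sub-batch is corrupted with the same sampled nodes, so
    the negative nodes are drawn once per sub-batch instead of once per edge.
    """
    for start in range(0, len(edges), sub_batch_size):
        sub_batch = edges[start:start + sub_batch_size]
        shared = [rng.randrange(num_nodes) for _ in range(num_negatives)]
        yield sub_batch, shared

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # toy (source, destination) pairs
chunks = list(corrupt_sub_batches(edges, num_nodes=10, sub_batch_size=2,
                                  num_negatives=3, rng=random.Random(0)))

# Fake edges per sub-batch: each source paired with every shared random node.
fake_edges = [
    [(s, n) for (s, _d) in sub for n in shared]
    for sub, shared in chunks
]
```

Because the same candidate nodes are reused across a sub-batch, their embeddings only need to be looked up once, and the scores for the whole sub-batch can be computed together.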
Graph Embeddings Loss Function - CORRECT ANSWER: - Margin loss between the
score of an edge f(e) and a negative sampled edge f(e')
- Negative sampled edges are constructed by taking real edge and replacing either the
source or destination vertex with a random node
- the score of an edge f(e) is a similarity (e.g. dot product or cosine) between the
source embedding and a transformed version of the destination embedding
- for example, f(e) = cos( theta(s) , theta(d) + theta(r) )
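A minimal sketch of this loss, using the cosine score from the last bullet. The margin value of 0.1, the hinge form max(0, margin - f(e) + f(e')), and all embedding values are assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def score(src, rel, dst):
    """f(e) = cos(theta(s), theta(d) + theta(r))."""
    return cosine(src, [d + r for d, r in zip(dst, rel)])

def margin_loss(pos_score, neg_scores, margin=0.1):
    """Sum of hinge terms max(0, margin - f(e) + f(e')) over negatives e'."""
    return sum(max(0.0, margin - pos_score + n) for n in neg_scores)

# Toy node and relation embeddings (made-up values).
theta = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [-1.0, 0.0]}
theta_r = [0.5, -0.5]

pos = score(theta[0], theta_r, theta[1])  # real edge 0 -> 1
neg = score(theta[0], theta_r, theta[2])  # destination replaced by node 2
loss = margin_loss(pos, [neg])
```

Here the real edge already outscores the corrupted one by more than the margin, so `loss` is zero; a negative edge that scores too close to the real one would contribute a positive term.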