Exam (worked answers)

NLP TEST 1 QUESTIONS AND ANSWERS 2024

Grade: A+

Uploaded on: 31-10-2024

Written in: 2024/2025

Exam of 5 pages for the course NLP at NLP (NLP TEST 1)

NLP TEST 1

classic parsing methods - answer 1. parsing as search: top-down or bottom-up
2. shift-reduce
3. CKY
4. Earley

CKY parser - answer bottom-up dynamic-programming (chart) parser; requires a binarized grammar (Chomsky normal form)
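
To make the bottom-up table filling concrete, here is a minimal sketch of a CKY recognizer in Python. It only answers whether a sentence is in the language and does not build trees. The function name `cky_recognize`, the two dictionary formats, and the toy grammar are assumptions made for this illustration, not part of the original notes.

```python
from itertools import product

def cky_recognize(words, lexical, binary, start="S"):
    """Return True if `words` can be derived from `start`.

    lexical: dict mapping a word to the set of nonterminals A with A -> word
    binary:  dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(words)
    # table[i][j] holds the nonterminals that can span words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical.get(word, ()))
    for span in range(2, n + 1):          # bottom-up: shorter spans first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # try every split point
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((b, c), set())
    return start in table[0][n]

# Hypothetical toy grammar, already binarized (Chomsky normal form)
lexical = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
binary = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

print(cky_recognize(["she", "eats", "fish"], lexical, binary))  # True
print(cky_recognize(["eats", "she"], lexical, binary))          # False
```

Because every rule has exactly two nonterminals on its right-hand side, each cell only has to consider one split point at a time, which is why the grammar must be binarized first.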

Earley parser - answer top-down chart parser; more complex than CKY, but works with any context-free grammar without binarization

generative classifier - answer e.g. Naive Bayes. Builds a model of each class; given an
observation, it returns the class most likely to have generated the observation.
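
As an illustration of "the class most likely to have generated the observation", the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The function names and the four training documents are made up for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (list_of_words, label). Returns log-priors, log-likelihoods, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    log_prior = {c: math.log(n / len(docs)) for c, n in class_counts.items()}
    log_lik = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        # add-one smoothing keeps unseen (word, class) pairs from getting probability zero
        log_lik[c] = {w: math.log((word_counts[c][w] + 1) / (total + len(vocab)))
                      for w in vocab}
    return log_prior, log_lik, vocab

def classify(words, log_prior, log_lik, vocab):
    """Return the class with the highest log P(class) + sum of log P(word | class)."""
    scores = {c: log_prior[c] + sum(log_lik[c][w] for w in words if w in vocab)
              for c in log_prior}
    return max(scores, key=scores.get)

# Hypothetical toy training set
docs = [
    ("great fun film".split(), "pos"),
    ("fun and powerful".split(), "pos"),
    ("boring and predictable".split(), "neg"),
    ("no fun just boring".split(), "neg"),
]
model = train_naive_bayes(docs)
print(classify("predictable boring film".split(), *model))  # neg
print(classify("great fun".split(), *model))                # pos
```

Words that never occurred in training are simply skipped at classification time in this sketch.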

discriminative classifier - answer e.g. logistic regression (MaxEnt). Learns which features of
the input are most useful for discriminating between the classes.
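
For contrast with the generative case, here is a small sketch of binary logistic regression trained with stochastic gradient ascent on the log-likelihood. The two features (counts of positive and negative words), the learning rate, and the number of epochs are assumptions chosen for the illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(data, lr=0.5, epochs=100):
    """data: list of (feature_vector, label) with label 0 or 1."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = y - p  # gradient of the log-likelihood with respect to the score
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(x, weights, bias):
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical features: [count of positive words, count of negative words]
data = [([3, 0], 1), ([2, 1], 1), ([0, 2], 0), ([1, 3], 0)]
weights, bias = train_logistic_regression(data)
print(weights)  # first weight positive, second weight negative
print([predict(x, weights, bias) for x, _ in data])
```

The learned weights show directly which feature pushes toward which class, which is the sense in which the model learns the features that discriminate between classes.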

10-fold cross-validation - answer If too much data is kept for training, the test set is small and
may not be representative. Cross-validation therefore uses all the data for both training and testing.
1. Randomly split the data into a training set and a test set, train the classifier, and compute
the error rate on the test set.
2. Repeat with a different training set and test set. In standard 10-fold cross-validation the data
is partitioned into 10 disjoint folds, and each fold serves as the test set exactly once.
3. Do this 10 times.
4. Average the 10 runs to get an average error rate.
Because all the data ends up being used for testing, we cannot look at the data to analyze
which features to use. To avoid this:
create a fixed training set and test set, do 10-fold cross-validation inside the
training set, and compute the error rate the normal way on the test set.
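
The procedure above can be sketched as follows, using the disjoint-fold variant. `train_fn` and `error_fn` are placeholder callables supplied by the caller, and the function names and the fixed random seed are assumptions made for this example.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validate(items, train_fn, error_fn, k=10):
    """Average the held-out error over k folds.

    train_fn(train_items) -> model
    error_fn(model, test_items) -> error rate as a float
    """
    errors = []
    for train_idx, test_idx in k_fold_splits(len(items), k):
        model = train_fn([items[i] for i in train_idx])
        errors.append(error_fn(model, [items[i] for i in test_idx]))
    return sum(errors) / len(errors)
```

To keep a truly unseen test set, call `cross_validate` only on the fixed training portion and evaluate once on the held-out test portion at the end.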

overfitting - answer A model that has learned the noise instead of the signal is overfit:
it fits the training data well but generalizes poorly to new data.

two common architectures for corpus-based chatbots - answer 1. information retrieval
2. machine-learned sequence transduction
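
A rough sketch of the information-retrieval approach: find the stored turn most similar to the user's input and return the response that followed it. The bag-of-words cosine similarity, the function names, and the two-entry corpus are assumptions for this illustration; real systems typically use weighted or learned representations.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def respond(query, corpus):
    """corpus: list of (turn, response) pairs. Return the response to the most similar turn."""
    q = Counter(query.lower().split())
    _, best_response = max(
        corpus, key=lambda pair: cosine(q, Counter(pair[0].lower().split())))
    return best_response

# Hypothetical conversation corpus
corpus = [
    ("what is your name", "I am a demo bot."),
    ("how is the weather today", "I cannot see outside, sorry."),
]
print(respond("tell me your name", corpus))  # I am a demo bot.
```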

types of chatbots - answer rule-based, corpus-based, frame-based (task-based)

domain ontology - answer Modern frame-based dialogue systems are based on a domain
ontology.
The ontology defines one or more frames, each a collection of slots, and
defines the values that each slot can take.
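
One possible way to picture a frame with typed slots is as a small data structure. The class names, the `flight_booking` frame, and its slots are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    name: str
    value_type: str        # the kind of value the slot accepts, e.g. "city" or "date"
    value: object = None   # None until the user supplies a value

@dataclass
class Frame:
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, name, value_type):
        self.slots[name] = Slot(name, value_type)

    def unfilled(self):
        """Names of slots the system still has to ask about."""
        return [s.name for s in self.slots.values() if s.value is None]

flight = Frame("flight_booking")
for slot_name, slot_type in [("origin", "city"), ("destination", "city"),
                             ("departure_date", "date")]:
    flight.add_slot(slot_name, slot_type)

flight.slots["origin"].value = "Boston"
print(flight.unfilled())  # ['destination', 'departure_date']
```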

frame-based chatbot/GUS architecture - answer based on a hand-designed finite-state automaton (FSA).

NLU goals for filling frame-based chatbot slots - answer 1. domain classification
2. user intent determination
3. slot filling

language models - answer Models that assign probabilities to sequences of words. The
simplest language model is the N-gram model.

N-gram model - answer Instead of computing the probability of a word given its entire
history, we approximate the history by just the last few words. It is based on the Markov
assumption: the probability of a word depends only on the previous N-1 words (in a bigram
model, only on the previous word).
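
Written out, the approximation reads as follows. The notation $w_{i:j}$ for the word sequence from position $i$ to $j$ is an assumption of this sketch.

```latex
% General N-gram approximation: condition on the previous N-1 words only
P(w_n \mid w_{1:n-1}) \approx P(w_n \mid w_{n-N+1:n-1})

% Bigram case (N = 2), applied to a whole sequence through the chain rule
P(w_{1:n}) \approx \prod_{k=1}^{n} P(w_k \mid w_{k-1})
```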

Markov models - answer The class of probabilistic models that assume we can predict
the probability of some future unit without looking too far into the past.

maximum likelihood estimation - answer Choosing the parameter values that give the training
data the highest likelihood. For an N-gram model the MLE is obtained by counting and
normalizing, e.g. for bigrams P(w_n | w_{n-1}) = C(w_{n-1} w_n) / C(w_{n-1}).
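
For bigrams, maximum likelihood estimation amounts to dividing a bigram count by the count of its first word. A minimal sketch, assuming whitespace tokenization and `<s>` / `</s>` as sentence boundary markers (both are choices made for this example):

```python
from collections import Counter

def bigram_mle(sentences):
    """Return a function prob(word, prev) = C(prev word) / C(prev)."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    def prob(word, prev):
        return bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0
    return prob

p = bigram_mle(["I am Sam", "Sam I am", "I do not like green eggs and ham"])
print(p("I", "<s>"))     # 2/3: two of the three sentences start with "I"
print(p("am", "I"))      # 2/3
print(p("</s>", "Sam"))  # 1/2
```

Any bigram that never occurs in the training sentences gets probability zero under this estimate, which is the problem smoothing addresses.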

evaluate language models - answer 1. extrinsic evaluation: embed the model in an
application and measure how much the application improves. Expensive.
2. intrinsic evaluation: measure the quality of a model independently of any application.
A typical data split is 80% training, 10% development set, 10% test set.

perplexity - answer In practice, we don't use raw probability as our metric for evaluating
language models but a variant called perplexity. It is the inverse probability of the test
set, normalized by the number of words. The lower the perplexity, the higher the probability
the model assigns to the test set.
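
As a formula, for a test set $W = w_1 w_2 \ldots w_N$ (the symbol names are assumptions of this sketch):

```latex
\mathrm{PP}(W) = P(w_1 w_2 \ldots w_N)^{-\frac{1}{N}}
               = \sqrt[N]{\frac{1}{P(w_1 w_2 \ldots w_N)}}
```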

Perplexity can also be thought of as the weighted average branching factor of a language
(not just the branching factor).
The branching factor of a language is the number of possible next words that can follow
any word.
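
A short sketch of why the average is "weighted": with ten equally likely digits the perplexity equals the branching factor of 10, but when one digit is far more likely and dominates the test set, perplexity drops even though ten digits are still possible. The function name and the example probabilities are assumptions for this illustration.

```python
import math

def perplexity(word_probs):
    """word_probs: the probability the model assigned to each word of the test set."""
    n = len(word_probs)
    log_prob = sum(math.log(p) for p in word_probs)  # log-space sum avoids underflow
    return math.exp(-log_prob / n)

# Ten equally likely digits: every test word has probability 1/10
print(perplexity([0.1] * 20))                 # approximately 10

# One digit has probability 0.91, the other nine have 0.01 each,
# and the test set consists mostly of the frequent digit
print(perplexity([0.91] * 18 + [0.01] * 2))   # well below 10
```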

OOV - answer Out of vocabulary: words in the test set that are not in the model's vocabulary,
i.e. words we haven't seen before.
The percentage of OOV words that appear in the test set is called the OOV rate.
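
The OOV rate can be computed by counting test tokens that are missing from the training vocabulary. A minimal sketch, assuming the vocabulary is simply every word seen in training and that the rate is measured over tokens:

```python
def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training data."""
    vocab = set(train_tokens)
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)

train = "the cat sat on the mat".split()
test = "the dog sat on the rug".split()
print(oov_rate(train, test))  # 2 of the 6 test tokens ("dog", "rug") are unseen
```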

Smoothing - answer To keep a language model from assigning zero probability to
unseen events, we shave off a bit of probability mass from some more
frequent events and give it to the events we've never seen. This modification is called
smoothing or discounting.

Common methods: Laplace (add-one) smoothing, add-k smoothing, stupid backoff, and Kneser-Ney
smoothing (the most useful for language modeling).

(Add-one and add-k smoothing perform poorly for language modeling but work well for classification.)
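
As an illustration of moving probability mass to unseen events, here is a minimal sketch of add-k smoothing for bigrams; with `k=1` it is Laplace smoothing. The function name and the six-word corpus are assumptions for this example.

```python
from collections import Counter

def add_k_bigram(tokens, k=1.0):
    """Return prob(word, prev) = (C(prev word) + k) / (C(prev) + k * V)."""
    vocab = set(tokens)
    contexts = Counter(tokens[:-1])               # tokens that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    def prob(word, prev):
        return (bigrams[(prev, word)] + k) / (contexts[prev] + k * len(vocab))
    return prob

tokens = "the cat sat on the mat".split()
p = add_k_bigram(tokens, k=1.0)
print(p("cat", "the"))  # (1 + 1) / (2 + 5) = 2/7, seen bigram
print(p("sat", "the"))  # (0 + 1) / (2 + 5) = 1/7, unseen but no longer zero
```

The seen bigram "the cat" drops from its unsmoothed estimate of 1/2 to 2/7, and the difference is spread over the bigrams that were never observed.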

Seller: julianah420, Phoenix University
