Natural Language Processing: many questions, no answers
Floriana Grasso
May 30, 2000

Introduction
Computational Linguistics, as a subfield of Linguistics, or Natural Language Processing (NLP), as a subfield of
Artificial Intelligence (two research areas that nowadays can be safely considered as merged) concentrate on the
“study of computer systems for understanding and generating natural language” [10], in order to develop “a
computational theory of language, using the notions of algorithms and data structures from Computer Science” [2].
Typical problems for these research fields are [2]: How is the structure of sentences identified? How can
knowledge and reasoning be modelled? How can language be used to accomplish specific tasks?
In more recent times, these problems are being studied not so much from the traditional linguistic viewpoint
(syntax and semantics) but with a focus on problems of belief models, planning processes, and functional
properties of both discourse and text.
I will focus my discussion on three problems in NLP. The choice of the problems to address is admittedly
biased by my personal research interests, but I do believe that these problems in particular can (and should) most
benefit from considerations coming from the field of Argumentation Theory. This paper is merely an attempt to build
up a canvas of issues, on which further discussion is needed.

Problem 1: Natural Language Generation
Natural Language Generation (NLG) is concerned with the “construction of computer systems that can produce
understandable texts in English or other human language from some underlying non-linguistic representation of
information” [30].
Typically, an NLG tool would start with, say, a database of facts on a given subject, for instance illnesses¹:
isa(diabetes, disease)
symptom(diabetes, tiredness)
symptom(diabetes, nausea)
treatment(diabetes, insulin-injection)

and build up a piece of text, possibly personalized to the addressee, which in some way “translates” these facts into
natural language, for instance:

Diabetes is a disease. Its symptoms include tiredness and nausea. It can be treated with insulin injections.
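
As an illustration of the "gluing words together" baseline, the following minimal Python sketch turns the four facts above into the sample text. The function names, the one-template-per-predicate design and the naive pluralization are my own assumptions for the example; they are not taken from any of the systems cited.

```python
import re
from collections import defaultdict

# The facts from the example above, kept in their original notation.
FACTS = """
isa(diabetes, disease)
symptom(diabetes, tiredness)
symptom(diabetes, nausea)
treatment(diabetes, insulin-injection)
"""

def parse_facts(text):
    """Group facts as {subject: {predicate: [values]}}, keeping input order."""
    facts = defaultdict(lambda: defaultdict(list))
    for predicate, subject, value in re.findall(r"(\w+)\((\w+),\s*([\w-]+)\)", text):
        facts[subject][predicate].append(value)
    return facts

def join_words(words):
    """Join words as 'a', 'a and b' or 'a, b and c'."""
    if len(words) <= 1:
        return "".join(words)
    return ", ".join(words[:-1]) + " and " + words[-1]

def describe(subject, facts):
    """Fill one fixed sentence template per predicate (hypothetical templates)."""
    entry = facts[subject]
    sentences = []
    if entry["isa"]:
        sentences.append(f"{subject.capitalize()} is a {entry['isa'][0]}.")
    if entry["symptom"]:
        sentences.append(f"Its symptoms include {join_words(entry['symptom'])}.")
    if entry["treatment"]:
        # Naive plural: append "s" and turn hyphens into spaces.
        treatments = [t.replace("-", " ") + "s" for t in entry["treatment"]]
        sentences.append(f"It can be treated with {join_words(treatments)}.")
    return " ".join(sentences)

print(describe("diabetes", parse_facts(FACTS)))
```

Nothing in this sketch knows why the sentences come in this order, or for whom they are written, which is precisely the knowledge the following list calls for.
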
In order to achieve this, in a more “intelligent” way than just gluing words together, an NLG system needs
knowledge of, at least:

the domain (diabetes, and illnesses in general);
grammar and lexicon;
how to reach the intended purpose (what does “explaining” mean?);
discourse organization (why is a sentence “good”? Why is it “understandable”?);
the addressee (would the text be different if addressed to an audience of physicians? How?).
Let us impudently ignore here the first two elements in the list: we may lightly assume that a computer system
can collect and record data in some structured form (a database, a knowledge base, etc.) and can query such
an information structure. We may also assume that a computer system can process a grammar and keep a dictionary
of terms, and we may leave aside problems related to the choice of words (should I call it “disease”, “illness”,
“pathology” or what?) and/or of grammatical constructs (should I use passive or active form?), although they
undoubtedly give rise to an important series of pragmatic problems.

¹ For some reason, research in NLG has seen a proliferation of systems on medical or epidemiological domains [4, 5, 6, 12, 25, 31, 32].

The first problem to consider on our way to producing our piece of text on diabetes therefore becomes: how can
we structure the text? What pieces of data should be included, and why? And how should they be put together?
All computer systems which, more or less automatically, have produced natural language text have approached
this problem on the basis of a common assumption: text has a structure that can be determined, at least partially,
by the speaker’s goal. The main differences between these systems lie in deciding what this structure is, and how
it can be modelled.
Early systems are based on the concept of “recipe”, or schema: if most explanations (e.g. in an
encyclopedia) have the same structure (start with the identification, “X is a Y”, then pass to listing peculiar attributes
of X etc.) then we can replicate the same structure in all texts having the same purpose (to explain). We can go
a step further, and think of other, different goals the speaker might have, beyond the mere explanation: instruct,
provide evidence, justify, compare etc., and create a schema for each of them.
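
The schema idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the step functions, the two-entry schema library and the wording of the "instruct" sentence are hypothetical, and the knowledge base simply reuses the diabetes facts given earlier.

```python
# Knowledge base: the diabetes facts, as (predicate, subject, value) triples.
KB = [
    ("isa", "diabetes", "disease"),
    ("symptom", "diabetes", "tiredness"),
    ("symptom", "diabetes", "nausea"),
    ("treatment", "diabetes", "insulin-injection"),
]

def values(predicate, subject):
    """Return all values stored for a predicate and subject, in KB order."""
    return [v for p, s, v in KB if p == predicate and s == subject]

# Each schema step returns a sentence, or None if the KB has nothing to say.
def identification(x):
    kinds = values("isa", x)
    return f"{x.capitalize()} is a {kinds[0]}." if kinds else None

def attributes(x):
    symptoms = values("symptom", x)
    return f"Its symptoms include {' and '.join(symptoms)}." if symptoms else None

def remedy(x):
    treatments = values("treatment", x)
    if not treatments:
        return None
    return f"To treat {x}, use {treatments[0].replace('-', ' ')}."

# Hypothetical schema library: one fixed, ordered recipe per speaker goal.
SCHEMAS = {
    "explain": [identification, attributes],
    "instruct": [remedy],
}

def generate(goal, subject):
    """Run the steps of the schema for `goal` in order, skipping empty ones."""
    sentences = (step(subject) for step in SCHEMAS[goal])
    return " ".join(s for s in sentences if s is not None)

print(generate("explain", "diabetes"))
print(generate("instruct", "diabetes"))
```

The rigidity is visible in `SCHEMAS`: the order and choice of steps are fixed in advance for each goal, whatever the addressee or the context.
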
More recent systems tend to be based on a different, less rigid approach. Following Austin’s intuition that
utterances are performatives just as physical actions are [3], many researchers in NLP have acknowledged that the
most flexible way to produce natural language text is to use a planning approach. Just as a robot may have the goal
of building a brick tower, and can decompose this problem into smaller and smaller tasks, until easily executable
basic actions can be performed (lift arm, pick up block etc.), similarly a natural language tool may decompose
a communicative goal into its steps, and use a plan-based mechanism to achieve them. We can then benefit
from huge achievements in the planning research community, and implement a system capable of processing
“communicative operators”, such as (adapted from [27]):

NAME: persuade-by-motivation
EFFECT: (PERSUADED ?hearer (DO ?hearer ?act))
CONSTRAINTS: (STEP ?act ?goal) AND (GOAL ?hearer ?goal) AND (MOST-SPECIFIC ?goal)
DECOMPOSITION: (FORALL ?goal (MOTIVATION ?act ?goal))
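
To make the reading of this operator concrete, here is a minimal Python sketch of how a planner might expand it. It is an illustration under stated assumptions, not the mechanism of [27]: the `Beliefs` structure, the function name and the example beliefs about a patient are all invented, and the MOST-SPECIFIC constraint is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Beliefs:
    steps: frozenset  # (act, goal) pairs: doing `act` is a step towards `goal`
    goals: frozenset  # (hearer, goal) pairs: `hearer` is believed to hold `goal`

def persuade_by_motivation(hearer, act, beliefs):
    """Expand (PERSUADED hearer (DO hearer act)) into MOTIVATION subgoals.

    Returns None when no binding of ?goal satisfies the STEP and GOAL
    constraints, so that a planner could try another operator instead.
    MOST-SPECIFIC is not modelled in this sketch.
    """
    bindings = sorted(
        goal
        for (step_act, goal) in beliefs.steps
        if step_act == act and (hearer, goal) in beliefs.goals
    )
    if not bindings:
        return None
    # DECOMPOSITION: one MOTIVATION subgoal for every admissible ?goal.
    return [("MOTIVATION", act, goal) for goal in bindings]

# Hypothetical beliefs, invented for the example.
beliefs = Beliefs(
    steps=frozenset({
        ("inject-insulin", "control-blood-sugar"),
        ("inject-insulin", "avoid-complications"),
        ("inject-insulin", "please-doctor"),
    }),
    goals=frozenset({
        ("patient", "control-blood-sugar"),
        ("patient", "avoid-complications"),
    }),
)

print(persuade_by_motivation("patient", "inject-insulin", beliefs))
```

Only goals the hearer is believed to hold survive the constraints, so "please-doctor" produces no MOTIVATION subgoal here; each surviving subgoal would in turn be handed to further operators.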

The “only” problem we are left with, therefore, is to decide how such communicative goals can be defined
in the first place, and how they can be achieved, that is, decomposed into smaller, more manageable problems,
so that we can feed our planner with a suitable library of operators similar to the one above. Guidance on these
issues comes typically from discourse theories, the most widely used of which is perhaps Rhetorical Structure
Theory (RST) [21]. RST has been used in many applications (see for example [15]), and, despite some criticisms
[28], it is generally seen as “the” theory for generating text. This is most surprising, as, while extremely useful
for generating descriptive texts, RST deals very poorly with different genres of text: for instance, it says very little
about how persuasive text can be generated. Moreover, it assumes texts have a hierarchical structure, which is in
many circumstances too strong a constraint, if not an artificial imposition.

Generation from Predefined Text: Summarization
An important subfield of NLG, which is rapidly becoming a field in its own right, summarization can be seen, at
a shallower level, as the extraction of excerpts from a text that can convey the main message(s) of the text without
too many details. At a deeper level, it can be seen as the extraction of the meaning of a text, and the reproduction
of a shorter version of the text itself. In both cases, a discourse structure is needed to understand what to keep and
what to throw away, a decision which again should be based on the satisfaction of a given communicative goal.
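
The shallower view can be illustrated with a frequency-based extractive baseline, sketched below in Python. This is a stand-in technique of my own choosing, not a discourse-based method: it scores each sentence by the average document frequency of its content words and keeps the best ones, ignoring any communicative goal. The stopword list and the function name are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "are", "be", "can", "in", "is", "it", "its",
             "of", "the", "to", "with"}

def summarize(text, max_sentences=1):
    """Keep the `max_sentences` highest-scoring sentences, in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokenized = [
        [w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    # Document-wide frequency of every content word.
    freq = Counter(w for words in tokenized for w in words)

    def score(i):
        # Average frequency, so that long sentences are not favoured.
        words = tokenized[i]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(range(len(sentences)), key=score, reverse=True)[:max_sentences]
    return " ".join(sentences[i] for i in sorted(top))

# Hypothetical input text, invented for the example.
text = "Insulin lowers blood sugar. Diabetes is treated with insulin. The weather was nice."
print(summarize(text, max_sentences=2))
```

Such a baseline has no notion of why a sentence matters to the reader, which is the gap a discourse structure is meant to fill.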

Problem 2: Intelligent Dialogue Agents
A natural extension of the ability to create text for a purpose would be to use this text in a conversation with
another partner, whether human or not. Most of the problems expressed before for the generation of a piece of
text appear here again, for the production of the single sentence needed for the dialogue move. They are
accompanied by many others, though, as a consequence of having the audience directly intervening in the generation.
Such problems may involve shallower aspects of the dialogic activity, such as repairing conversation failures and
keeping the conversation “on focus”, or architectural problems of how the generation of messages should be
interleaved with the management of the dialogue, or of how to organize turn taking (e.g. dialogue games). But
perhaps most importantly, an “intelligent” dialogic agent needs to be able to understand the implications of the
single sentence the partner communicates.
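
The architectural side (turn taking and dialogue games) can be pictured with a very small Python sketch. The move types and the table of legal replies are hypothetical, chosen only for illustration; the sketch checks the form of a dialogue and says nothing about understanding what a move implies.

```python
# Hypothetical dialogue game: which move types may legally follow which.
LEGAL_REPLIES = {
    "question": {"answer", "clarification-request"},
    "clarification-request": {"clarification"},
    "clarification": {"answer", "clarification-request"},
    "answer": {"question", "acknowledge"},
    "acknowledge": {"question"},
}

def check_dialogue(moves):
    """Return the index of the first move that breaks the game, or None.

    `moves` is a list of (speaker, move_type) pairs. Two rules are checked:
    speakers must alternate (turn taking), and each move must be a legal
    reply to the previous one.
    """
    for i in range(1, len(moves)):
        (prev_speaker, prev_type), (speaker, move_type) = moves[i - 1], moves[i]
        if speaker == prev_speaker:
            return i
        if move_type not in LEGAL_REPLIES.get(prev_type, set()):
            return i
    return None

dialogue = [
    ("user", "question"),
    ("system", "clarification-request"),
    ("user", "clarification"),
    ("system", "answer"),
    ("user", "acknowledge"),
]
print(check_dialogue(dialogue))
```

The index returned for an ill-formed exchange marks the point at which a dialogue manager would have to attempt a repair.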
