Natural Language Processing: many questions, no answers
Floriana Grasso
May 30, 2000


Introduction
Computational Linguistics, as a subfield of Linguistics, or Natural Language Processing (NLP), as a subfield of
Artificial Intelligence (two research areas that nowadays can be safely considered as merged) concentrate on the
“study of computer systems for understanding and generating natural language” [10], in order to develop “a
computational theory of language, using the notions of algorithms and data structures from Computer Science” [2].
Typical problems for these research fields are [2]: How is the structure of sentences identified? How can
knowledge and reasoning be modelled? How can language be used to accomplish specific tasks?
In more recent times, these problems are being studied not so much from the traditional linguistic viewpoint
(syntax and semantics) but with a focus on problems of belief models, planning processes, and functional
properties of both discourse and text.
I will focus my discussion on three problems in NLP. The choice of the problems to address is admittedly
biased by my personal research interests, but I do believe that these problems in particular can (and should) most
benefit from considerations coming from the Argumentation theory field. This paper is merely an attempt to build
up a canvas of issues, on which further discussion is needed.


Problem 1: Natural Language Generation
Natural Language Generation (NLG) is concerned with the “construction of computer systems that can produce
understandable texts in English or other human language from some underlying non-linguistic representation of
information” [30].
Typically, an NLG tool would start with, say, a database of facts on a given subject, for instance illnesses¹:
isa(diabetes, disease)
symptom(diabetes, tiredness)
symptom(diabetes, nausea)
treatment(diabetes, insulin-injection)

and build up a piece of text, possibly personalized to the addressee, which in some way “translates” these facts into
natural language, for instance:

Diabetes is a disease. Its symptoms include tiredness and nausea. It can be treated with insulin injections.
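
As a point of comparison, the fact database above can be turned into the sample text by a deliberately naive sketch that does little more than glue words together. The function names, the fixed sentence templates, and the plural rule are assumptions made for illustration only; they are not taken from any system cited here.

```python
# Facts mirror the example database above.
FACTS = [
    ("isa", "diabetes", "disease"),
    ("symptom", "diabetes", "tiredness"),
    ("symptom", "diabetes", "nausea"),
    ("treatment", "diabetes", "insulin-injection"),
]


def join_items(items):
    """Join words as 'a', 'a and b', or 'a, b and c'."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def values(facts, predicate, subject):
    """Return every value recorded for (predicate, subject), in database order."""
    return [v for (p, s, v) in facts if p == predicate and s == subject]


def describe(facts, subject):
    """Produce a short description using hard-coded sentence templates."""
    sentences = []
    kinds = values(facts, "isa", subject)
    if kinds:
        sentences.append(f"{subject.capitalize()} is a {kinds[0]}.")
    symptoms = values(facts, "symptom", subject)
    if symptoms:
        sentences.append(f"Its symptoms include {join_items(symptoms)}.")
    treatments = values(facts, "treatment", subject)
    if treatments:
        # Hypothetical surface rule: hyphen becomes a space, plural by adding "s".
        phrases = [t.replace("-", " ") + "s" for t in treatments]
        sentences.append(f"It can be treated with {join_items(phrases)}.")
    return " ".join(sentences)


print(describe(FACTS, "diabetes"))
```

Everything that makes the text acceptable (sentence order, pronoun choice, the plural) is frozen into the templates, which is exactly the kind of knowledge the following list asks the system to hold explicitly.
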
In order to achieve this, in a more “intelligent” way than just gluing words together, an NLG system needs
knowledge of, at least:

- the domain (diabetes, and illnesses in general);
- grammar and lexicon;
- how to reach the intended purpose (what does “explaining” mean?);
- discourse organization (why is a sentence “good”? Why is it “understandable”?);
- the addressee (would the text be different if addressed to an audience of physicians? How?).

Let us impudently ignore here the first two elements in the list: we may lightly assume that a computer system
can collect and record data in some structured form (a database, a knowledge base, etc.) and can query such an
information structure. We may also assume that a computer system can process a grammar and keep a dictionary
of terms, and we may leave aside problems related to the choice of words (should I call it “disease”, “illness”,
“pathology” or what?) and/or of grammatical constructs (should I use passive or active form?), even though they
undoubtedly give rise to an important series of pragmatic problems.

¹ For some reason, research in NLG has seen a proliferation of systems on medical or epidemiological domains [4, 5, 6, 12, 25, 31, 32].

The first problem to consider on our way to producing our piece of text on diabetes therefore becomes: how can
we structure the text? What pieces of data should be included, and why? And how should they be put together?
All computer systems which, more or less automatically, have produced natural language text have approached
this problem on the basis of a common assumption: text has a structure that can be determined, at least partially,
by the speaker’s goal. The main differences between these systems lie in deciding what this structure is, and how
it can be modelled.
Early systems are based on the concept of “recipe”, or schema: if most explanations (e.g. in an
encyclopedia) have the same structure (start with the identification, “X is a Y”, then pass to listing peculiar attributes
of X etc.) then we can replicate the same structure in all texts having the same purpose (to explain). We can go
a step further, and think of other, different goals the speaker might have, beyond the mere explanation: instruct,
provide evidence, justify, compare etc., and create a schema for each of them.
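
The schema idea can be sketched as a fixed, ordered recipe per speaker goal, where each step picks one kind of fact and realizes it with a template. This is only an illustration under stated assumptions: the `SCHEMATA` table, the goal names `"explain"` and `"instruct"`, and the templates are hypothetical, not taken from the early systems referred to above.

```python
FACTS = [
    ("isa", "diabetes", "disease"),
    ("symptom", "diabetes", "tiredness"),
    ("symptom", "diabetes", "nausea"),
    ("treatment", "diabetes", "insulin-injection"),
]

# One schema per communicative goal: an ordered list of
# (predicate to look up, sentence template) steps.
SCHEMATA = {
    "explain": [
        ("isa", "{subject} is a {values}."),              # identification: "X is a Y"
        ("symptom", "Symptoms of {subject}: {values}."),  # attributes of X
        ("treatment", "Treatments for {subject}: {values}."),
    ],
    "instruct": [
        ("treatment", "To treat {subject}, use: {values}."),
    ],
}


def generate(goal, subject, facts):
    """Walk the schema for `goal` in order and fill each step from the facts."""
    sentences = []
    for predicate, template in SCHEMATA[goal]:
        found = [v for (p, s, v) in facts if p == predicate and s == subject]
        if found:  # a step with no matching facts is silently skipped
            sentences.append(
                template.format(subject=subject, values=", ".join(found))
            )
    return sentences


for line in generate("explain", "diabetes", FACTS):
    print(line)
```

The rigidity is visible in the data structure: the order of steps never changes, whatever the addressee or the context, which is the limitation the planning approach tries to remove.
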
More recent systems tend to be based on a different, less rigid approach. Following Austin’s intuition that
utterances are performatives just as physical actions are [3], many researchers in NLP have acknowledged that the
most flexible way to produce natural language text is to use a planning approach. Just as a robot may have the goal
of building a brick tower, and can decompose this problem into smaller and smaller tasks, until easily executable
basic actions can be performed (lift arm, pick up block etc.), similarly a natural language tool may decompose
a communicative goal into its steps, and use a plan-based mechanism to achieve them. We can then benefit
from huge achievements in the planning research community, and implement a system capable of processing
“communicative operators”, such as (adapted from [27]):

NAME: persuade-by-motivation
EFFECT: (PERSUADED ?hearer (DO ?hearer ?act))
CONSTRAINTS: (STEP ?act ?goal) AND (GOAL ?hearer ?goal) AND (MOST-SPECIFIC ?goal)
DECOMPOSITION: (FORALL ?goal (MOTIVATION ?act ?goal))
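
To make the operator above concrete, here is a minimal sketch of how a planner might expand it. The `KnowledgeBase` class, the function names, and the English template are assumptions for illustration. The MOST-SPECIFIC constraint is not modelled: every matching goal is simply assumed to be most specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBase:
    steps: frozenset  # (act, goal) pairs: doing `act` is a step towards `goal`
    goals: frozenset  # (hearer, goal) pairs: `hearer` is believed to have `goal`


def persuade_by_motivation(kb, hearer, act):
    """Expand (PERSUADED hearer (DO hearer act)) into MOTIVATION subgoals."""
    # CONSTRAINTS: (STEP ?act ?goal) AND (GOAL ?hearer ?goal).
    # MOST-SPECIFIC is assumed to hold for every matching goal (not modelled).
    matching = sorted(
        goal for (a, goal) in kb.steps if a == act and (hearer, goal) in kb.goals
    )
    # DECOMPOSITION: (FORALL ?goal (MOTIVATION ?act ?goal)).
    return [("MOTIVATION", act, goal) for goal in matching]


def realize(subgoal):
    """Hypothetical primitive action: turn one MOTIVATION subgoal into a sentence."""
    _, act, goal = subgoal
    return f"You should {act}, because it helps you {goal}."


kb = KnowledgeBase(
    steps=frozenset({
        ("exercise", "stay healthy"),
        ("exercise", "lose weight"),
        ("exercise", "win races"),
    }),
    goals=frozenset({("patient", "stay healthy"), ("patient", "lose weight")}),
)

for subgoal in persuade_by_motivation(kb, "patient", "exercise"):
    print(realize(subgoal))
```

Only goals the hearer is believed to hold are used as motivations, so "win races" is dropped here. A full planner would expand each MOTIVATION subgoal further with other operators instead of realizing it directly.
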

The “only” problem we are left with, therefore, is to decide how such communicative goals can be defined
in the first place, and how they can be achieved, that is, decomposed into smaller, more manageable problems,
so that we can feed our planner with a suitable library of operators similar to the one above. Guidance on these
issues comes typically from discourse theories, the most widely used of which is perhaps Rhetorical Structure
Theory (RST) [21]. RST has been used in many applications (see for example [15]), and, despite some criticisms
[28], it is generally seen as “the” theory for generating text. This is most surprising: while extremely useful
for generating descriptive texts, RST deals very poorly with other genres of text; for instance, it says very little
about how persuasive text can be generated. Moreover, it assumes texts have a hierarchical structure, which is in
many circumstances too strong a constraint, if not an artificial imposition.


Generation from Predefined Text: Summarization
Summarization, an important subfield of NLG which is rapidly becoming a field in its own right, can be seen, at
a shallower level, as the extraction of excerpts from a text that can convey the main message(s) of the text without
too many details. At a deeper level, it can be seen as the extraction of the meaning of a text, and the reproduction
of a shorter version of the text itself. In both cases, a discourse structure is needed to understand what to keep and
what to throw away, a decision which again should be based on the satisfaction of a given communicative goal.
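
The shallower, extractive view can be illustrated with a small sketch that scores each sentence by the average frequency of its content words and keeps the top ones. The frequency heuristic, the stopword list, and the function names are assumptions chosen for brevity. The sketch uses no discourse structure and no communicative goal, which is precisely what the paragraph above argues is needed.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "it", "its", "of", "and", "with",
             "can", "be", "in", "to"}


def split_sentences(text):
    """Split on whitespace that follows sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased alphabetic tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, n=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        ws = content_words(sentence)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:n]
    return [sentences[i] for i in sorted(ranked)]


text = "Diabetes is a disease. Diabetes symptoms include tiredness. Cats are nice."
print(summarize(text, n=1))
```
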


Problem 2: Intelligent Dialogue Agents
A natural extension of the ability to create text for a purpose would be to use this text in a conversation with
another partner, whether human or not. Most of the problems described above for the generation of a piece of
text appear here again, for the production of the single sentence needed for the dialogue move. They are
accompanied by many others, though, as a consequence of having the audience directly intervening in the generation.
Such problems may involve shallower aspects of the dialogic activity, such as repairing conversation failures and
keeping the conversation “on focus”, or architectural problems of how the generation of messages should be
interleaved with the management of the dialogue, or of how to organize turn taking (e.g. dialogue games). But
perhaps most importantly, an “intelligent” dialogic agent needs to be able to understand the implications of the
single sentence the partner communicates.
