BUAL 5660 FINAL – KALGOTRA EXAM QUESTIONS
AND ANSWERS WITH COMPLETE SOLUTIONS
Practice question
The process of breaking text documents apart into individual pieces (tokens).
1) Stemming
2) Clustering
3) Tokenizing
4) Stop Words
Answer: 3) Tokenizing
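A minimal sketch of tokenizing, assuming that a "piece" is a lowercase run of letters, digits, or apostrophes. The function name `tokenize` and the sample sentence are illustrative only; real tokenizers handle punctuation, hyphens, and languages in more careful ways.

```python
import re

def tokenize(text):
    """Break a text document into lowercase word tokens."""
    # Assumption: a token is any run of letters, digits, or apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("Text mining extracts knowledge from unstructured data.")
print(tokens)
```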
Terms in this set (32)
API
A set of programming instructions and standards for accessing a Web-based software application or Web tool.
REST API
REST (REpresentational State Transfer) is an architectural style and an approach to communications that is often used in the development of Web services.
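As a rough illustration of the REST style, the sketch below builds an HTTP GET request for a resource identified by a URL. The host `api.example.com`, the `documents` resource, and the helper names are hypothetical; the request is only constructed here, and `fetch_json` would need a real service to call.

```python
import json
import urllib.request

def build_get_request(base_url, resource, resource_id):
    """Build a GET request for one resource, addressed by its URL."""
    url = f"{base_url}/{resource}/{resource_id}"
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )

def fetch_json(request):
    """Send the request and decode a JSON response (needs a live service)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

# Hypothetical endpoint, used only to show the shape of a REST call.
req = build_get_request("https://api.example.com/v1", "documents", 42)
print(req.get_method(), req.full_url)
```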
Web Crawling or Spidering
To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling.
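The idea of a spider building word lists can be sketched as follows. To keep the example runnable without network access, it crawls a hypothetical in-memory set of pages (`PAGES`) rather than real Web sites; the class and function names are illustrative assumptions, not part of any search engine.

```python
from collections import deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "Web" so the sketch runs offline.
PAGES = {
    "/home": '<p>Text mining basics</p><a href="/about">About</a>',
    "/about": '<p>About data mining</p><a href="/home">Home</a>',
}

class PageParser(HTMLParser):
    """Collect the links and the words found on one page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z]+", data.lower()))

def crawl(start, pages):
    """Visit pages breadth-first, building a word list for each one."""
    index, queue, seen = {}, deque([start]), {start}
    while queue:
        url = queue.popleft()
        parser = PageParser()
        parser.feed(pages[url])
        parser.close()
        index[url] = parser.words
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

word_lists = crawl("/home", PAGES)
print(word_lists)
```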
Text Mining
A semi-automated process of extracting knowledge from unstructured data sources.
Data Mining vs Text Mining
1) Both seek novel and useful patterns
2) Both are semi-automated processes
3) The difference is the nature of the data: structured versus unstructured
4) To perform text mining, first impose structure on the data, then mine the structured data
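Point 4 can be sketched as a two-step process: first turn raw text into a term-by-document count table, then run a simple analysis on that table. The two sample documents and the helper name `to_term_counts` are made up for illustration, and a bag-of-words table is only one of several ways to impose structure on text.

```python
from collections import Counter
import re

# Hypothetical unstructured input: two short text documents.
docs = [
    "Text mining finds patterns in text",
    "Data mining finds patterns in tables",
]

def to_term_counts(doc):
    """Count how often each word occurs in one document."""
    return Counter(re.findall(r"[a-z]+", doc.lower()))

# Step 1: impose structure (rows = documents, columns = terms).
counts = [to_term_counts(d) for d in docs]
vocabulary = sorted(set().union(*counts))
matrix = [[c[term] for term in vocabulary] for c in counts]

# Step 2: mine the structured data, here by totaling each term's frequency.
totals = {term: sum(row[i] for row in matrix) for i, term in enumerate(vocabulary)}

print(vocabulary)
print(matrix)
print(totals)
```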
Structured data vs Unstructured data
- Structured data: in databases
- Unstructured data: Word documents, PDF files, text excerpts, XML files, and so on