Lecture notes

DATA ACQUISITION

Pages: 62
Uploaded on: 08-11-2025
Written in: 2025/2026

Brief notes on collecting data from sensors or external sources and converting it into digital form for further processing and analysis.

UNIT 1 DATA ACQUISITION



Data Acquisition – Sources of acquiring the data – Internal Systems and External Systems,
Web APIs, Data Preprocessing – Exploratory Data Analysis (EDA) – Basic tools (plots,
graphs and summary statistics) of EDA, Open Data Sources, Data APIs, Web Scraping –
Relational Database access (queries) to process/access data

Introduction

Data

• Raw information, facts, or numbers collected to be examined or analysed in order to
make decisions.

• Should be represented in a formalized manner suitable for communication,
interpretation and processing.

Information

• Result of analysing data

Data versus Information




Types of Data

• Structured – Data that is organized and formatted according to a well-defined
schema, e.g., rows and columns in a relational database table.

• Unstructured – Data with no predefined organization; its content is context-specific
or varying, e.g., e-mail.

• Natural language – A special type of unstructured data; it is challenging to
process because it requires knowledge of specific data science techniques and
linguistics.

• Machine-generated – Data that is automatically created by a computer, process,
application, or other machine without human intervention.

• Graph-based – Data that focuses on the relationship or adjacency of objects.

• Audio, video, and images – Data captured as sound, pictures, and video, which must
be recognized before it can be analysed.

• Streaming – The data flows into the system when an event happens instead of being
loaded into a data store in a batch.
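
The difference between batch and streaming data can be illustrated with a minimal Python sketch. The sensor readings and both function names are hypothetical; the point is only that the batch version needs the whole dataset up front, while the streaming version updates its result as each event arrives.

```python
from typing import Iterable, Iterator


def batch_mean(readings: list[float]) -> float:
    """Batch style: the whole dataset is loaded first, then processed once."""
    return sum(readings) / len(readings)


def streaming_mean(events: Iterable[float]) -> Iterator[float]:
    """Streaming style: update the running mean each time an event arrives."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count


readings = [20.0, 22.0, 24.0]  # hypothetical sensor readings

print(batch_mean(readings))            # a single result after loading everything
print(list(streaming_mean(readings)))  # one intermediate result per event
```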

Data file formats

• Tabular (e.g., .csv, .tsv, .xlsx)

• Non-tabular (e.g., .txt, .rtf, .xml)

• Image (e.g., .png, .jpg, .tif)

• Agnostic (e.g., .dat)

➢ Some file formats are proprietary and can only be opened by software developed by
a particular company.

➢ There are also other file formats that store metadata, such as SPSS and Stata files,
which contain information on data labels.
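
As a rough illustration of how tabular and non-tabular formats are read differently, the sketch below parses a small CSV snippet and a small XML snippet using only the Python standard library. The sample data, field names, and helper function names are invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; in practice these would be read from .csv and .xml files.
csv_text = "id,name,score\n1,Asha,81\n2,Ravi,74\n"
xml_text = (
    "<students>"
    "<student id='1'><name>Asha</name></student>"
    "<student id='2'><name>Ravi</name></student>"
    "</students>"
)


def read_tabular(text: str) -> list[dict[str, str]]:
    """Tabular data: every row has the same columns, named by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def read_xml_names(text: str) -> list[str]:
    """Non-tabular data: walk the element tree and pick out the values needed."""
    root = ET.fromstring(text)
    return [student.findtext("name", default="") for student in root.iter("student")]


rows = read_tabular(csv_text)
names = read_xml_names(xml_text)
print(rows)
print(names)
```

Note that the CSV reader returns every value as a string, so numeric columns still need to be converted before analysis.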

1. Data Acquisition

Data Acquisition:

• The process of gathering various data from different relevant sources is referred to
as Data Acquisition

• In practice, it means collecting data and ingesting it into a system for further use.

Importance of Data Acquisition:

• It makes it easier for businesses to analyze data and formulate strategies around it.

• Having data in one place makes it easier to detect any discrepancy and solve it faster.

• It also decreases human error and improves data security.

• In the long run, it proves to be cost-efficient.

• It helps in building recommendation systems.

Things to consider when acquiring data are:

• What data is needed to achieve the goal?
• How much data is needed?
• Where and how can this data be found?
• What legal and privacy concerns should be considered?

Data acquisition comprises two steps: Data Harvest and Data Ingestion.

Data Harvest:

The process by which a source generates data; it is concerned with what data is
acquired.

Data Ingestion:

• Focuses on bringing the produced data into a given system.

• Data ingestion consists of three stages – discover, connect and synchronize.
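
The three ingestion stages could be sketched as follows. The notes only name the stages, so this reading is an assumption: discover lists the available sources, connect reads the records of one source, and synchronize copies them into the target system. The sources, table layout, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical harvested sources: source name -> CSV text it produced.
SOURCES = {
    "store_a": "order_id,amount\n1,250\n2,90\n",
    "store_b": "order_id,amount\n3,120\n",
}


def discover() -> list[str]:
    """Discover (assumed meaning): find out which sources are available."""
    return sorted(SOURCES)


def connect(name: str) -> list[dict[str, str]]:
    """Connect (assumed meaning): open one source and read its records."""
    return list(csv.DictReader(io.StringIO(SOURCES[name])))


def synchronize(db: sqlite3.Connection, name: str, rows: list[dict[str, str]]) -> None:
    """Synchronize (assumed meaning): copy records into the target store.

    INSERT OR REPLACE keeps the step repeatable: re-running it does not
    create duplicate rows for the same order_id.
    """
    db.executemany(
        "INSERT OR REPLACE INTO orders (order_id, source, amount) VALUES (?, ?, ?)",
        [(int(r["order_id"]), name, float(r["amount"])) for r in rows],
    )
    db.commit()


def ingest() -> sqlite3.Connection:
    """Run all three stages against an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, source TEXT, amount REAL)"
    )
    for name in discover():
        synchronize(db, name, connect(name))
    return db


db = ingest()
print(db.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
```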

Data Acquisition Methods

Data can be obtained from many different sources, such as websites, apps, IoT protocols, or
even physical notes, and new data sources appear every day.

Methods of acquiring data:

• Collecting new data.

• Converting and/or transforming legacy data.

• Sharing or exchanging data.

• Purchasing data.

Challenges and Characteristics to be considered for Data Acquisition

Before using these methods for data acquisition, the U.S. Geological Survey (USGS) suggests
considering certain business goals and data characteristics.

First, think about the business goal (why is this data required and what will it bring?).

Next, consider the costs, time restrictions and format.

For a specific domain, such as a heavily regulated industry like banking or a
government-controlled entity, additional restrictions may also apply, for instance data
standard thresholds or business rule limitations.

Every data acquisition method comes with additional challenges and characteristics to be
considered. For example, when transforming legacy data, first assess the quality of the
legacy data. When purchasing data, all licensing issues need to be analysed.
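
One possible way to assess the quality of legacy data before transforming it is to count simple defects. The sketch below is only an illustration: the legacy export, its column names, and the three checks (missing values, exact duplicate rows, unparseable dates) are assumptions, not a prescribed procedure.

```python
import csv
import io
from datetime import datetime

# Hypothetical legacy export; the column names are illustrative only.
legacy_csv = """customer_id,email,joined
101,asha@example.com,2011-04-02
102,,2012-13-01
101,asha@example.com,2011-04-02
103,ravi@example.com,
"""


def assess_quality(text: str) -> dict[str, int]:
    """Count basic quality problems in a CSV export."""
    rows = list(csv.DictReader(io.StringIO(text)))
    report = {"rows": len(rows), "missing_values": 0, "duplicates": 0, "bad_dates": 0}
    seen = set()
    for row in rows:
        key = tuple(row.values())
        if key in seen:  # exact repeat of an earlier row
            report["duplicates"] += 1
        seen.add(key)
        report["missing_values"] += sum(1 for value in row.values() if value == "")
        if row["joined"]:  # only validate dates that are present
            try:
                datetime.strptime(row["joined"], "%Y-%m-%d")
            except ValueError:
                report["bad_dates"] += 1
    return report


report = assess_quality(legacy_csv)
print(report)
```

A report like this helps decide whether the legacy data can be converted directly or needs cleaning first.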




Data Acquisition in Machine Learning

“Data acquisition is the process of sampling signals that measure real-world physical
conditions and converting the resulting samples into digital numeric values that a computer
can manipulate.”

• Collection and Integration of the data: The data is extracted from various sources,
and multiple datasets need to be combined based upon the requirement.

• Formatting: Prepare or organize the datasets as per the analysis requirements.

• Labeling: After gathering the data, it must be labeled (named).
Data Acquisition Process

The process of data acquisition involves searching for the datasets that can be used to train
the Machine Learning models.

The main segments are:

1. Data Discovery

2. Data Augmentation

3. Data Generation
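
The notes name these three segments without detailing them here, so the following is only a loose sketch under assumed meanings: discovery as searching a catalog for a suitable dataset, augmentation as enlarging an existing dataset with slightly perturbed copies, and generation as creating synthetic samples. The catalog, function names, and parameters are all hypothetical.

```python
import random

# Hypothetical dataset catalog: dataset name -> short description.
CATALOG = {
    "city_temps": "daily temperature readings per city",
    "store_sales": "weekly sales per store",
}


def discover(keyword: str) -> list[str]:
    """Data discovery (assumed meaning): find datasets matching a keyword."""
    return [name for name, desc in CATALOG.items() if keyword in desc]


def augment(samples: list[float], rng: random.Random, jitter: float = 0.5) -> list[float]:
    """Data augmentation (assumed meaning): add noisy copies of existing samples."""
    return samples + [x + rng.uniform(-jitter, jitter) for x in samples]


def generate(n: int, rng: random.Random, low: float, high: float) -> list[float]:
    """Data generation (assumed meaning): create synthetic samples in a range."""
    return [rng.uniform(low, high) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
found = discover("temperature")
augmented = augment([20.0, 21.0], rng)
synthetic = generate(3, rng, low=15.0, high=30.0)
print(found, len(augmented), len(synthetic))
```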

Document information

Type: Lecture notes
Instructor: Abirami
Contains: All lectures

Uploaded by: lsharan, Sathyabama Institute of Science and Technology