
Summary of all lectures Interactive Data Transformation 2021

Pages: 44
Uploaded on: 26-05-2021
Written in: 2020/2021

Summary of all lectures of the course 'Interactive Data Transformation'. Table of contents: Lecture 1: Intro + data management systems + SQL Topics: database management systems; single table queries using SQL. Lecture 2: Entity-relationship model Topics: Entity-relationship model Lecture 3: Advanced SQL operators and functions Topics: Processing multiple tables: relationship and joins; aggregation, views, and functions; table creation and population. Lecture 4: Data types + corresponding management systems Topics: Transforming entity-relationship diagram to relational schema; data normalization; data types and evolution. Lecture 5: Popular Intensive Systems I Topics: Big data and analytics; Map Reduce in Hadoop. Lecture 6: Popular Intensive Systems II + Database normalization Topics: The Spark Ecosystem; resilient distributed datasets, programming model. Good luck with studying for the exam! :)


Table of contents:
Lecture 1: Intro + Data Management Systems + SQL
Topics: Overview for the course; Database Management Systems; Single table queries using SQL.

Lecture 2: Entity-Relationship model
Topics: Entity-Relationship model.

Lecture 3: Advanced SQL operators and functions
Topics: Processing multiple tables: relationship and joins; aggregation, views, and functions; table creation
and population.

Lecture 4: Data Types + Corresponding Management Systems
Topics: Transforming entity relationship diagram to relational schema; data normalization; data types and
evolution.

Lecture 5: Popular Intensive Systems I
Topics: Big Data and Analytics; Map Reduce in Hadoop.

Lecture 6: Popular Intensive Systems II + Database Normalization
Topics: The Spark Ecosystem; Resilient distributed datasets; Programming model.

Lecture 1: Intro + Data Management Systems + SQL

Database Management Systems
In the early days, database applications were built directly on top of file systems, which had
several drawbacks. Each of these drawbacks is a reason to use a DBMS (Database Management
System) instead:
- Data redundancy and inconsistency: multiple file formats, duplication in different files
(same information in different files = redundancy, changed information in related file =
inconsistency).
- Difficulty in accessing data: need to write a new program to carry out each new task.
- Data isolation: multiple files and formats (date = MM/DD/YY or DD/MM/YY?).
- Integrity problems: it’s hard to verify constraints or change existing ones.
- Atomicity of updates: Failures of updates may leave data in an inconsistent state with
partial updates carried out.
- Concurrent access by multiple users: uncontrolled concurrent accesses can lead to
inconsistencies. Example: two people reading a balance at the same time ($100) and then
withdrawing money ($50 each) at the same time.
- Security problems: hard to provide users access to some, but not all, data.
Database Management Systems offer solutions to all the above mentioned problems!
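
To make the atomicity point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the CHECK constraint, and the transfer helper are assumptions made up for this illustration and are not taken from the lecture. The transfer credits one account first and debits the other second; when the debit would make a balance negative, the whole transaction is rolled back, so no partial update remains.

```python
import sqlite3

# Hypothetical schema for illustration: two accounts, balances may not go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to account `dst` as one atomic unit."""
    try:
        # As a context manager, the connection commits if the block succeeds
        # and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


print(transfer(conn, 1, 2, 50), balances(conn))  # enough money: both updates apply
print(transfer(conn, 1, 2, 80), balances(conn))  # not enough: neither update applies
```

Without the transaction, the failed second transfer would leave account 2 credited while account 1 was never debited, which is exactly the inconsistent state described in the list above.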

Database management system architecture:
- Database (DB):
o Collection of data with the same structure.
o Including correlations and relationships.
o Common purpose (defined for a particular use).
o Shared (used by several users).
- Database Management System (DBMS):
o Collection of programs that operate on the DB.
o Define and specify the data types, structure, and constraints.
o Build and manipulate (store on disk, retrieve, update).
o Administer (manage access rights).
DBMS = black box mediating between users/applications and the database.
- Applications:
o Access to DB for performing queries.
o Mobile apps, web applications, etc.



Ultimate goal of a DBMS: separate the data from the application.
- Provide an interface that the application programmer must follow.
- Allow system administrator to make modifications without having an impact on the user.
- Users can change their view of the data without having to worry about how it is stored.
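
One common way to get this separation is a view. The sketch below uses Python's built-in sqlite3 module; the table names, the staff_directory view, and the sample rows are hypothetical and only serve the illustration. The application always queries the view, so the administrator can reorganize the underlying tables without the application's query changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Initial storage layout (hypothetical): everything in one table,
# with a view acting as the interface the application must follow.
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Ann', 4000), (2, 'Bob', 3500);
CREATE VIEW staff_directory AS SELECT id, name FROM employee;
""")


def directory(conn):
    """The application's query: it only knows about the view."""
    return conn.execute(
        "SELECT id, name FROM staff_directory ORDER BY id"
    ).fetchall()


before = directory(conn)

# The administrator changes how the data is stored: the single table is
# split in two and the view is redefined on top of the new layout.
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payroll (person_id INTEGER REFERENCES person(id), salary INTEGER);
INSERT INTO person SELECT id, name FROM employee;
INSERT INTO payroll SELECT id, salary FROM employee;
DROP VIEW staff_directory;
DROP TABLE employee;
CREATE VIEW staff_directory AS SELECT id, name FROM person;
""")

after = directory(conn)
print(before == after)  # the application sees the same data either way
```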





Three layers of a DBMS:
- External layer: communication with users (analysis of user requests (queries), access
control, answer presentation).
- Logical layer: optimization of queries, resolving conflicting accesses (multiple users),
guaranteeing constant availability even in case of failures.
- Internal layer: storing the data, software for structuring the data, efficient access
methods (keys, indices, etc.).




Development Process (DBMS Lifecycle)
[Figure: lifecycle diagram with the stages Planning, Analysis, Logical Design, Physical Design,
Implementation, Maintenance]
- Planning: develop a preliminary understanding of the business situation and how information
systems might help solve the problem. Includes: analyzing current data processing and analyzing
general business functions and needs.
- Analysis: Analyze the business situation thoroughly to determine requirements and to structure
those requirements. Output = conceptual schema (e.g. ER-model). Corresponds to a detailed,
technology independent specification of the overall organizational data structure.
- Logical design: representation of the DB. Transform the conceptual schema (outcome of
previous step) in terms of the data management system.
- Physical design: the set of specifications that describe how data is stored in a
computer’s secondary memory by a specific DBMS.
- Implementation: Build database implementation, populate with data, install application(s)
and test, and complete documentation and training materials.
- Maintenance: Monitor the operation and usefulness of the system, repair by fixing
errors in database and applications, and enhance by analyzing the database and
applications to ensure that evolving information requirements are met.

Different types of DBMS:
- Traditional database management systems: text and numerical data.
- Multimedia database management systems: multimedia data (movies, music, etc.)
- Spatial database management systems: Geographic and geometric data.
- Data Warehouses.



Relational Data Model
Relational Model:
- An approach to manage data by representing it grouped into relations.
- Theory developed by Edgar F. (Ted) Codd in 1970 at IBM.
- Relational Database Management System (RDBMS): a database management system
that manages data as a collection of tables in which all relationships are represented by
common values in related tables.
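
As a small illustration of "relationships are represented by common values in related tables", the sketch below uses Python's built-in sqlite3 module. The customer and orders tables and their rows are invented for this example. The only thing linking an order to its customer is the shared customer_id value, and a join uses that value to combine the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: orders.customer_id holds the same values as
# customer.customer_id, and that shared value is the relationship.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    total       REAL
);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Follow the relationship by matching the common column in both tables.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.total
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for name, order_id, total in rows:
    print(name, order_id, total)
```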

Structured Query Language (SQL):
- Language for creating and querying relational databases.
- Simple, expressive, with efficient implementations.
- Used by many database systems: Oracle, MySQL, MS Access, SQLite.
- Standard for RDBMS: reduced training costs and cross-system communication.

SQL environment:
- Catalog: metadata about all the databases in the environment (Which databases exist?
Who has access to them? When was each database created?)
- Schema: structure of one database (tables, views).
- Data Definition Language (DDL): commands that define a database, including creating,
altering, and dropping tables and establishing constraints.
- Data Manipulation Language (DML): commands that maintain and query a database
(including SELECT statements).
- Data Control Language (DCL): commands that control a database, including
administering privileges and committing data.
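
The command categories above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the student table and its values are invented, and sqlite_master is SQLite's own schema table, used here as a stand-in for the catalog/schema idea. SQLite has no user accounts and does not support GRANT, so the DCL statement is shown as a string only and is never executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter the structure of the database.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("ALTER TABLE student ADD COLUMN programme TEXT")

# DML: maintain and query the data stored in that structure.
conn.execute(
    "INSERT INTO student (name, programme) VALUES (?, ?)",
    ("Ann", "Data Science"),
)
conn.execute(
    "UPDATE student SET programme = ? WHERE name = ?",
    ("Information Management", "Ann"),
)
conn.commit()
students = conn.execute("SELECT name, programme FROM student").fetchall()

# Schema information: SQLite records its tables and views in sqlite_master.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# DCL: typical syntax in systems with user accounts (not run here,
# because SQLite does not implement GRANT). `some_user` is hypothetical.
DCL_EXAMPLE = "GRANT SELECT ON student TO some_user;"

print(students)
print(tables)
```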





Written by: TTilburg (Tilburg University)