Database - Answers a structured data set that can be accessed by authorized users (access controls
make it a more secure way of storing data than loose files or spreadsheets)
- stored and managed using dedicated Database Management Systems (DBMS)
- Server examples: MySQL, PostgreSQL, Oracle, MongoDB; often hosted on cloud platforms such as Azure
and AWS
- Other kinds of databases - NoSQL, time-series, vector, etc.
SQL (Relational View) - Answers - data organized in clean, separate tables
- each table has a consistent structure
- relationships are managed through IDs, keys
- easy to read and understand at a glance
- structured schema (tables defined before data is added), ACID compliance (atomicity, consistency,
isolation, durability), great for complex queries and transactions
NoSQL (Document) View - Answers - data is nested and hierarchical
- more flexible but more complex to read
- relationships are embedded within the documents
- can be harder to scan and understand quickly
- flexible schema (dynamic, can evolve as needed), better for high velocity or unstructured data
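To make the two views concrete, here is a minimal Python sketch that holds the same data both ways. The customer and order names and values are invented for illustration; they are not from any real system.

```python
import json

# Relational view: two flat tables linked by an ID (hypothetical data).
customers = [{"customer_id": 1, "name": "Acme Co"}]
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 250.0},
    {"order_id": 101, "customer_id": 1, "amount": 75.5},
]

# Document view: the orders are nested inside the customer document.
customer_doc = {
    "customer_id": 1,
    "name": "Acme Co",
    "orders": [
        {"order_id": 100, "amount": 250.0},
        {"order_id": 101, "amount": 75.5},
    ],
}

# Relational: the relationship is resolved by matching keys across tables.
relational_total = sum(o["amount"] for o in orders if o["customer_id"] == 1)

# Document: the relationship is already embedded, so just walk the nesting.
document_total = sum(o["amount"] for o in customer_doc["orders"])

print(relational_total, document_total)
print(json.dumps(customer_doc, indent=2))
```

Both totals agree; the difference is only in where the relationship lives (in keys, or in the nesting).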
Basic Components of relational databases - Answers - Tables - data organized into sets of columns
(fields) and rows (records)
- Fields - these are the columns that contain descriptive information about the observations in the
table (including primary and foreign keys)
- Records - these are the rows in a table; each row, or record, corresponds to a unique instance of
what is being described in the table
Hierarchy of Database - Answers Database, tables, records, fields
- database contains tables, which contain records, which contain fields
Primary key - Answers Unique identifier for each record in a table (every table has a PK)
Foreign Key - Answers a field that refers to the primary key of another table; exists to create
relationships or links between two tables
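A small sketch of how keys behave in practice, using Python's built-in sqlite3 module. The vendors and invoices tables are hypothetical, and SQLite is only a stand-in for whichever DBMS is actually used; note that SQLite enforces foreign keys only after the PRAGMA shown below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled

# Hypothetical tables: each invoice must point at an existing vendor.
conn.execute("CREATE TABLE vendors (vendor_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE invoices (
           invoice_id INTEGER PRIMARY KEY,
           vendor_id  INTEGER NOT NULL REFERENCES vendors(vendor_id),
           amount     REAL
       )"""
)

conn.execute("INSERT INTO vendors VALUES (1, 'Acme Co')")
conn.execute("INSERT INTO invoices VALUES (500, 1, 120.0)")  # valid foreign key

# A duplicate primary key is rejected.
try:
    conn.execute("INSERT INTO vendors VALUES (1, 'Duplicate Co')")
    pk_rejected = False
except sqlite3.IntegrityError:
    pk_rejected = True

# An invoice pointing at a vendor that does not exist is rejected.
try:
    conn.execute("INSERT INTO invoices VALUES (501, 99, 80.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(pk_rejected, fk_rejected)
```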
Ways to combine tables? - Answers - InnerJoin - data in both tables
- LeftJoin - everything kept in left table then add what's from table on right
- RightJoin - opposite of LeftJoin
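- OuterJoin / FullJoin (full outer join) - everything from both tables together in one; rows without
a match are kept, with blanks (NULLs) for the missing side
- Anti-join - only what doesn't match between the tables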
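The join types can be compared side by side on a tiny example. This is a sketch using Python's sqlite3 with hypothetical customers and orders tables. To stay portable across SQLite versions, the right join is written as a left join with the tables swapped and the full join is assembled in Python; newer SQLite releases and most other DBMSs also accept RIGHT JOIN and FULL OUTER JOIN directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
# Order 12 refers to customer 4, who is not in the customers table.
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2), (12, 4)])

# Inner join: only rows with a match in both tables.
inner_rows = conn.execute(
    "SELECT c.name, o.order_id FROM customers c "
    "INNER JOIN orders o ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()

# Left join: every customer, plus order data where it exists.
left_rows = conn.execute(
    "SELECT c.name, o.order_id FROM customers c "
    "LEFT JOIN orders o ON c.customer_id = o.customer_id ORDER BY c.customer_id"
).fetchall()

# Right join, written as a left join with the tables swapped: every order.
right_rows = conn.execute(
    "SELECT c.name, o.order_id FROM orders o "
    "LEFT JOIN customers c ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()

# Full outer join: everything from both sides (union of the left and right results).
full_rows = set(left_rows) | set(right_rows)

# Rows that found no partner on the other side (an anti-join).
unmatched = full_rows - set(inner_rows)

print(inner_rows, left_rows, right_rows, sorted(unmatched, key=str), sep="\n")
```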
Relational Database Data Dictionary - Answers a data dictionary with added detail for each field:
table name, attribute name (field name), PK or FK, default value, field size
Relational Database Diagram - Answers a visual representation of the relational database data
dictionary
Entity-Relationship (ER) Diagrams - Answers a graphical representation of a system, illustrating
relationships among people, objects, places and events in the system
Benefits of relational databases - Answers - Data integrity - yields useful, relevant accounting info
that is a faithful representation of the underlying transaction/event (free from error, complete,
neutral); prevents outdated info from being used and keeps structure in the data, which facilitates
checking and avoids redundancy
- Internal control benefits - improved security around data entry and table access (limit access, aid in
creating and enforcing data entry internal controls); enhance enforcement of preventative internal
controls
Redundancy - Answers - The nature of the database table structure ensures that there is a unique
listing of each observation, stored in only one place
- Version control reduces the possibility of having more than one version of the truth
- Reduced redundancy cuts down on errors
Commands for getting data (ETL: Extract, Transform, Load) - Answers - Extract - retrieves data from
various sources, such as databases, APIs or spreadsheets
- Transform - cleans, formats, and structures that data to ensure accuracy and consistency
- Load - stores the processed data in a database or data warehouse for further analysis and reporting
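The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the CSV text, the payments table, and the cleaning rules are all assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Extract: read rows from a source (an in-memory CSV stands in for a file or API).
raw_csv = "vendor,amount\n acme co ,100.50\nBeta LLC,not_a_number\nGamma Inc,20\n"
raw_rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: trim and standardize names, convert amounts, drop rows that fail.
clean_rows = []
for row in raw_rows:
    try:
        amount = float(row["amount"])
    except ValueError:
        continue  # skip rows whose amount cannot be parsed
    clean_rows.append((row["vendor"].strip().title(), amount))

# Load: store the cleaned rows in a database table for later analysis.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (vendor TEXT, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", clean_rows)
conn.commit()

loaded = conn.execute("SELECT vendor, amount FROM payments ORDER BY vendor").fetchall()
print(loaded)
```

In a real setting the rejected row would normally be logged for follow-up rather than silently skipped.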
SQL (Structured Query Language) - Answers a programming language designed for managing and
manipulating relational databases
- allows you to store, retrieve, modify, and delete data in database systems
Using SQL - Answers SELECT fields (specifies the columns or expressions to retrieve from a database
table)
FROM tables (indicates the table from which the data is being retrieved)
WHERE filter (keeps only the rows that meet a condition)
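A minimal runnable sketch of the SELECT / FROM / WHERE pattern, using Python's sqlite3 module. The sales table, its columns, and the 1000 cutoff are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "East", 500.0), (2, "West", 1500.0), (3, "East", 2500.0)],
)

query = """
    SELECT sale_id, amount      -- fields to return
    FROM sales                  -- table to read from
    WHERE amount > 1000         -- filter: keep only matching rows
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only the two sales above 1000 come back, and only the two requested columns are returned for each.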
Descriptive analytics - Answers - addresses the questions of "What happened?" or "what is
happening?"
- analytics performed which characterize, summarize, and organize features and properties of the
data to facilitate understanding
Tools for descriptive analytics - Answers - Counts - frequency
- Totals, sums, averages, subtotals - aggregates data to provide overall trends and summaries
- Min, max, median, std. deviation - measures dispersion and central tendencies to understand
variability and trends
- Graphs, histograms - visual representations of data distributions, trends, and comparisons
- % Change - vertical/horizontal analysis
- Ratios - like return on assets, return on sales (Profit margin), asset turnover ratios, debt-to-equity
ratios - calculate important financial ratios for comparison
Accounting data sources for descriptive analytics - Answers B/S, I/S, Statement of CF, Statement of
SE, Footnotes, 10-K filing
Descriptive tools categorized - Answers - Measures of the central tendency of the data: help describe
what's "typical" in a data set (mean, median, mode)
- Measures of variability: min, max, standard deviation, quartiles
- Other techniques - counts, totals, graphs, percent change, ratios
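These measures are all available in Python's standard statistics module. The sketch below uses a made-up list of invoice amounts; the quartile cut points use the module's default method, which can differ slightly from what a spreadsheet reports.

```python
import statistics

# Hypothetical invoice amounts (illustrative only).
amounts = [12, 15, 15, 18, 20, 22, 45]

# Central tendency: what is "typical"?
mean = statistics.mean(amounts)
median = statistics.median(amounts)
mode = statistics.mode(amounts)

# Variability: how spread out is the data?
lowest, highest = min(amounts), max(amounts)
stdev = statistics.stdev(amounts)               # sample standard deviation
quartiles = statistics.quantiles(amounts, n=4)  # three cut points: Q1, Q2, Q3

print(f"mean={mean} median={median} mode={mode}")
print(f"min={lowest} max={highest} stdev={stdev:.2f} quartiles={quartiles}")
```

Here the mean sits above the median because the single large value (45) pulls it upward, which is a common reason to report both.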
Horizontal vs. Vertical Analysis - Answers - Horizontal - compares the change in each financial
statement line item over time; percent change
- Vertical - (or common-size analysis) expresses financial information in relation to a relevant figure or
base (facilitates comparisons between companies of different sizes, shows relative importance of
each item regardless of absolute values)
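Both calculations are simple enough to sketch directly. The figures below are hypothetical, and using Sales as the base for the vertical analysis is an assumption (it is the usual base for an income statement; total assets is typical for a balance sheet).

```python
# Hypothetical income statement figures for two years (illustrative only).
income_statement = {
    "Sales": {"2023": 1000.0, "2024": 1200.0},
    "COGS": {"2023": 600.0, "2024": 690.0},
    "Net income": {"2023": 100.0, "2024": 150.0},
}


def horizontal_pct_change(prior, current):
    """Horizontal analysis: percent change of one line item over time."""
    return (current - prior) / prior * 100


def vertical_pct_of_base(item, base):
    """Vertical (common-size) analysis: line item as a percent of a base figure."""
    return item / base * 100


base_2024 = income_statement["Sales"]["2024"]  # assumed base: Sales
for line, values in income_statement.items():
    change = horizontal_pct_change(values["2023"], values["2024"])
    share = vertical_pct_of_base(values["2024"], base_2024)
    print(f"{line:<11} change: {change:6.1f}%   share of sales: {share:6.1f}%")
```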
Ratio Analysis - Answers - ROA = Return on Assets - measures how efficiently a company uses all its
assets (net profit/total assets)
- ROE = Return on Equity - tells us how well a company uses shareholders' money (Net profit/equity)
DuPont Analysis - Answers Disaggregation of ROA or ROE into components that tells us more about
the company - HOW the company achieves its results
ROA = Profit margin * asset turnover
ROE = Profit margin * asset turnover * financial leverage
- PM - how much profit from each $ of sales (net prof/sales)
- AT - how efficiently assets are used to generate sales (sales/total assets)
- FL - shows how much the company uses debt vs. equity financing (total assets/equity)
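A quick numeric check of how the components fit together, using hypothetical figures. With PM = net profit/sales, AT = sales/total assets, and FL = total assets/equity, the sales terms cancel so that PM * AT equals net profit/total assets (ROA), and multiplying by FL cancels total assets to give net profit/equity (ROE).

```python
import math

# Hypothetical company figures (illustrative only).
net_profit = 150.0
sales = 1200.0
total_assets = 2000.0
equity = 800.0

profit_margin = net_profit / sales           # profit from each $ of sales
asset_turnover = sales / total_assets        # sales generated per $ of assets
financial_leverage = total_assets / equity   # assets supported per $ of equity

roa = profit_margin * asset_turnover         # two-factor breakdown
roe = roa * financial_leverage               # three-factor breakdown

# The breakdowns reproduce the direct ratios.
print(f"ROA: {roa:.4f} (direct: {net_profit / total_assets:.4f})")
print(f"ROE: {roe:.4f} (direct: {net_profit / equity:.4f})")
```

Two companies can reach the same ROE by different routes (high margin, high turnover, or high leverage), which is what the breakdown is meant to expose.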
Diagnostic analytics - Answers - "Why did it happen?"
- performed to investigate underlying causes, which cannot be found by simply looking at the
descriptive data but can be uncovered by various types of analyses
- first step is to look for unusual, unexpected results or transactions
Finding previously unknown linkages, patterns, or relationships between variables - Answers -
Performing drill-down analytics - look for patterns in the underlying data set by summarizing data at
different levels and uncovering additional details to understand why something happened
- Determine relations/patterns/linkages between variables - find the extent to which there are
patterns in the data, or the data moves together
Diagnostic analytics tools for outliers/anomalies - Answers - Variance analysis - typically in mgmt acct;
looks at differences from expectations
- sequence checks and sequence analysis - why are some check numbers missing documentation?
does it signify errors or fraud? (we expect that they have a sequence)
- Duplicate transactions - why duplicates of some transactions? fraud or just errors?
- Benford's Law - in real-life data, there is an expected distribution of the first or leading digit (used to
identify fraud or irregular transactions)
- Fuzzy matching - used to identify similar but not exactly identical entries in a dataset (some vendor
addresses are similar to employee addresses)
- Bank Reconciliation - examines differences between transactions recorded by the bank and the G/L
(general ledger)
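Several of the tools above can be sketched with Python's standard library. All data here is invented, and the sample is far too small for a meaningful Benford test; the point is only to show the mechanics. Using difflib for fuzzy matching is one possible choice among many.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Sequence check: which check numbers are missing from the expected run?
check_numbers = [1001, 1002, 1004, 1005, 1008]
expected = set(range(min(check_numbers), max(check_numbers) + 1))
missing = sorted(expected - set(check_numbers))

# Duplicate transactions: same vendor, date, and amount appearing more than once.
payments = [
    ("Acme Co", "2024-03-01", 500.00),
    ("Beta LLC", "2024-03-02", 75.25),
    ("Acme Co", "2024-03-01", 500.00),
]
duplicates = [p for p, count in Counter(payments).items() if count > 1]

# Benford's Law: the expected share of leading digit d is log10(1 + 1/d).
benford_expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
amounts = [123.4, 1890.0, 27.5, 310.0, 1444.0, 95.0, 1020.0, 18.0]
leading = Counter(int(str(abs(a)).lstrip("0.")[0]) for a in amounts)
observed = {d: leading.get(d, 0) / len(amounts) for d in range(1, 10)}


# Fuzzy matching: score how similar two text entries are (0 = different, 1 = identical).
def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


vendor_address = "123 Main Street, Apt 4"
employee_address = "123 Main St., Apt. 4"
score = similarity(vendor_address, employee_address)

print("missing checks:", missing)
print("duplicates:", duplicates)
print("leading digit 1: expected", round(benford_expected[1], 3), "observed", observed[1])
print("address similarity:", round(score, 2))
```

Each result is a flag for follow-up, not proof of fraud: a missing check may simply have been voided, and a duplicate may be a legitimate repeat payment.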
Outliers and Anomalies - Answers - Outlier - observation or data point that lies outside its expected
distribution (several stdevs away from other points)
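One common way to apply the "several stdevs away" idea is a z-score: the distance of each point from the mean, measured in standard deviations. The sketch below uses made-up amounts, and the cutoff of 3 is an assumed rule of thumb rather than a fixed standard.

```python
import statistics

# Hypothetical transaction amounts: twenty ordinary values and one extreme one.
amounts = [98, 99, 100, 101, 102] * 4 + [250]

mean = statistics.mean(amounts)
stdev = statistics.stdev(amounts)  # sample standard deviation
threshold = 3                      # assumed cutoff, in standard deviations


def z_score(x):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / stdev


outliers = [x for x in amounts if abs(z_score(x)) > threshold]
print(outliers)
```

In very small data sets an extreme value inflates the standard deviation itself, so a z-score cutoff can fail to flag it; that is one reason to also look at the data graphically.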