SUMMARY: DATA ANALYTICS FOR ACCOUNTING
CHAPTER 1
1.1 GENERAL INTRODUCTION
What is data analytics?
=Data Analytics is the process of transforming and evaluating data with the
purpose of drawing conclusions to address business questions.
Effective data analytics?
= provides a way to search through large volumes of structured data (data in a predefined
format) and unstructured data (e.g., PDF or free text) to identify unknown patterns or relationships.
Big Data
= datasets which are too large and complex to be analysed traditionally.
The 4 V’s of big data
Volume = the size of the dataset.
Velocity = the speed of data processing.
Variety = the different types of data.
Veracity = the quality of the data.
Goal/purpose: transform (big) data into valuable knowledge to make more informed
business decisions or solve specific problems.
1.2 THE EFFECT OF DATA ANALYTICS
Effect on businesses
Importance:
• The global volume of data created is in the hundreds of zettabytes per year.
• 85% of CEOs put a high value on Data Analytics.
• 86% of CEOs place data mining and analysis as the second-most important
strategic technology.
• Business analytics tops CEOs’ list of priorities.
• Data Analytics could generate up to $2 trillion in value per year.
Data Analytics is expected to have dramatic effects on auditing and financial
reporting, as well as on tax and managerial accounting.
So it is not only about having the data, but also about using it in the best way possible.
Effect on auditing
General: enhances audit quality, expands services, and adds value for clients.
The audit process is changing from a traditional process towards a more automated
one.
Allows audit professionals to focus more on the logic and rationale behind data queries and
less on gathering the actual data.
Expanded capabilities: testing for fraudulent transactions and automating
compliance-monitoring activities.
Analyze the complete dataset rather than a sample of the financial data.
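To make the idea of testing the complete dataset concrete, here is a minimal Python sketch of a duplicate-payment test that scans every record instead of a sample. The payment records, field layout, and the function name `find_duplicate_payments` are hypothetical and only serve as an illustration; a real audit test would likely use more matching criteria.

```python
from collections import defaultdict

# Hypothetical payment population: (payment_id, vendor_id, invoice_no, amount).
payments = [
    ("P-1", "V-10", "INV-778", 4200.00),
    ("P-2", "V-11", "INV-102", 150.00),
    ("P-3", "V-10", "INV-778", 4200.00),  # same vendor, invoice and amount as P-1
    ("P-4", "V-12", "INV-555", 980.00),
    ("P-5", "V-11", "INV-103", 150.00),
]


def find_duplicate_payments(records):
    """Group every payment by (vendor, invoice, amount) and keep groups of 2+."""
    groups = defaultdict(list)
    for payment_id, vendor, invoice, amount in records:
        groups[(vendor, invoice, amount)].append(payment_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


duplicates = find_duplicate_payments(payments)
for (vendor, invoice, amount), ids in duplicates.items():
    print(f"Possible duplicate: vendor {vendor}, invoice {invoice}, amount {amount:.2f}, payments {ids}")
```

Because every record is checked, the result does not depend on which items happen to be drawn into a sample.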
Effect on management accounting
Data analytics and management accounting have quite similar task descriptions.
Enhancements: cost analysis, better decision making, better forecasting, budgeting,
production and sales.
Effect on financial reporting
Better estimates of collectability, write-downs, etc.
Better understanding of the business environment through social media and other
external data sources.
Analysts identify risks and opportunities through analysis of Internet searches.
THE IMPACT MODEL
6 steps (IMPACT):
1) Identify the questions.
2) Master the data.
3) Perform the test plan.
4) Address and refine results.
5) Communicate insights.
6) Track outcomes.
The model is an iterative process.
Step 1: Identify the Questions
= Understand the business problems that need to be addressed.
Attributes to consider:
• What data do we need to answer the question?
• Who is the audience that will use the results?
• Is the scope of the question too narrow or too broad?
• How will the results be used?
Example questions:
• Are employees circumventing internal controls over payments?
• Are there any suspicious travel and entertainment expenses?
• Are our customers paying us in a timely manner?
• How can we predict the allowance for loan losses for our bank loans?
• How can we find transactions that are risky in terms of accounting issues?
• Who authorizes checks above $100,000?
• How can errors be identified?
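A well-scoped question can usually be translated directly into a data query. As an illustration, the question about who authorizes checks above $100,000 could be sketched in Python as follows. The check register, the names in it, and the function `authorizers_above` are made up for this example.

```python
from collections import Counter

# Hypothetical check register: (check_id, authorizer, amount in USD).
checks = [
    ("C-001", "Alice", 125_000.00),
    ("C-002", "Bob", 40_000.00),
    ("C-003", "Alice", 310_000.00),
    ("C-004", "Carol", 100_000.00),  # exactly at the threshold, so not "above"
    ("C-005", "Dana", 150_500.00),
]

THRESHOLD = 100_000.00


def authorizers_above(records, threshold):
    """Count, per authorizer, the checks strictly above the threshold."""
    return Counter(auth for _, auth, amount in records if amount > threshold)


large_check_counts = authorizers_above(checks, THRESHOLD)
for name, count in large_check_counts.most_common():
    print(f"{name}: {count} check(s) above ${THRESHOLD:,.0f}")
```

Note that the sketch has to decide whether "above" includes the threshold itself; this is the kind of scope detail that Step 1 is meant to settle.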
Step 2: Master the Data
Consider the following 8 elements:
• Know what data are available and how they relate to the problem.
• Data available in internal systems.
• Data available in external networks and data warehouses.
• Data dictionaries.
• ETL: extraction, transformation, and loading.
• Data validation and completeness.
• Data normalization.
• Data preparation and scrubbing (= cleaning).
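The scrubbing, validation, and completeness elements above can be sketched in a few lines of Python. The invoice rows, field names, and rejection reasons below are hypothetical; the point is only that every extracted row ends up either loaded or explicitly rejected, so the counts reconcile.

```python
# Hypothetical extracted invoice rows, as they might arrive from a source system.
raw_rows = [
    {"invoice_id": "INV-1", "customer": " Acme ", "amount": "1,200.50"},
    {"invoice_id": "INV-2", "customer": "Birch", "amount": "300"},
    {"invoice_id": "INV-2", "customer": "Birch", "amount": "300"},   # duplicate
    {"invoice_id": "INV-3", "customer": "", "amount": "75.25"},      # missing customer
    {"invoice_id": "INV-4", "customer": "Cedar", "amount": "n/a"},   # non-numeric amount
]


def clean_and_validate(rows):
    """Return (clean_rows, rejected_rows) after scrubbing and validation."""
    clean, rejected, seen_ids = [], [], set()
    for row in rows:
        invoice_id = row["invoice_id"]
        customer = row["customer"].strip()  # scrub stray whitespace
        try:
            amount = float(row["amount"].replace(",", ""))  # drop thousands separators
        except ValueError:
            rejected.append((invoice_id, "amount is not numeric"))
            continue
        if not customer:
            rejected.append((invoice_id, "customer is missing"))
        elif invoice_id in seen_ids:
            rejected.append((invoice_id, "duplicate invoice_id"))
        else:
            seen_ids.add(invoice_id)
            clean.append({"invoice_id": invoice_id, "customer": customer, "amount": amount})
    return clean, rejected


clean_rows, rejected_rows = clean_and_validate(raw_rows)

# Completeness check: loaded rows plus rejected rows must equal extracted rows.
rows_reconcile = len(clean_rows) + len(rejected_rows) == len(raw_rows)
control_total = sum(row["amount"] for row in clean_rows)
print(f"loaded={len(clean_rows)} rejected={len(rejected_rows)} reconciled={rows_reconcile} total={control_total:.2f}")
```

In practice the control total would be compared against a figure from the source system; that comparison is left out here because no such figure exists in this toy example.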
Step 3: Perform the Test Plan
=Identify a relationship between the response (or dependent) variable and
those items that affect the response (also called the predictor, explanatory,
or independent variables).
Generally, we build a model, i.e. a simplified representation of reality, to
address this purpose.
For example: Predict the performance on the next accounting exam:
• The response/dependent variable: score on the exam
• The independent variables: study time, IQ, score on last exam, etc. (regression)
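The exam example can be sketched as a simple linear regression with a single independent variable (study time). The data below are invented for illustration, and the function `fit_simple_regression` is a hand-written ordinary least squares fit, not a method prescribed by the course. A fuller model would add the other predictors (IQ, score on the last exam).

```python
# Hypothetical data: hours studied and exam score for six students.
study_hours = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
exam_scores = [52.0, 58.0, 65.0, 71.0, 78.0, 84.0]


def fit_simple_regression(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_regression(study_hours, exam_scores)

# Predict the response variable (exam score) for a student who studies 7 hours.
predicted = intercept + slope * 7.0
print(f"score = {intercept:.2f} + {slope:.2f} * hours; 7 hours -> {predicted:.1f}")
```

The slope is read as the expected change in exam score per additional hour of study, under the assumption that the relationship is linear.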
Provost and Fawcett, in their book Data Science for Business, identify 8 key
approaches to Data Analytics, depending on the question:
1) Classification - Assign each unit in a population to a specific (pre-defined)
category or class.
2) Regression - Predict a continuous dependent variable’s value based on
independent variable inputs using a statistical model.
3) Similarity Matching - Identify similar individuals or items based on known
data (e.g., address data: "123 Main St." vs. "123 Main Street").
4) Clustering - Divide individuals or items into meaningful or useful groups
(without pre-defined categories).
5) Co-occurrence Grouping - Discover associations or relationships between
individuals or items based on shared transactions.
6) Link Prediction - Predict connections or relationships between two data
items.
7) Profiling - Characterize the typical behaviour of an individual, group, or
population by generating summary statistics about the data (e.g., mean, median,
minimum, maximum, SE).
8) Data Reduction - Reduce the amount of information being analysed to
focus on the most critical and relevant elements (e.g., the highest risk, cost, or impact).
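Two of the approaches above, similarity matching and profiling, are easy to sketch with the Python standard library. The abbreviation map, the expense claims, and the function names are hypothetical. "SE" is interpreted here as the standard error of the mean, which is an assumption about what the notes intend.

```python
import statistics
from difflib import SequenceMatcher

# --- Similarity matching: compare addresses after simple normalization. ---
# Hypothetical abbreviation map; a real one would be much longer.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def normalize_address(address):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    words = address.lower().replace(".", "").replace(",", "").split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def address_similarity(a, b):
    """Similarity ratio between 0 and 1 for two normalized addresses."""
    return SequenceMatcher(None, normalize_address(a), normalize_address(b)).ratio()


same = address_similarity("123 Main St.", "123 Main Street")
different = address_similarity("123 Main St.", "98 Oak Avenue")
print(f"same address: {same:.2f}, different address: {different:.2f}")

# --- Profiling: summary statistics for hypothetical expense claims. ---
expense_claims = [120.0, 95.0, 110.0, 105.0, 980.0, 100.0]

profile = {
    "mean": statistics.mean(expense_claims),
    "median": statistics.median(expense_claims),
    "min": min(expense_claims),
    "max": max(expense_claims),
    # Standard error of the mean (assumed meaning of "SE").
    "standard_error": statistics.stdev(expense_claims) / len(expense_claims) ** 0.5,
}
for name, value in profile.items():
    print(f"{name}: {value:.2f}")
```

In the profiling part, the large gap between the mean and the median hints that one claim is far from typical behaviour, which is the sort of signal profiling is meant to surface.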
Step 4: Address and Refine Results
Identify issues with the analyses and refine the model.
Were the wrong variables used, or the wrong questions asked?
• Ask further questions.
• Explore the data.
• Rerun analyses.
Step 5: Communicate Insights
Communicate effectively using clear language and visualizations:
Dashboards.