CDW110
(Dimensional Modeling Terms) Bridge - ANS-A bridge is a table that stores information
where many-to-many relationships are common. For example, consider diagnoses and
patient records. Each type of record might be related to many of the other type: a patient can
have multiple diagnoses, and a diagnosis can appear on many patients. In a bridge table,
each key references a unique combination of records that exists in your organization's
extracted data. This method of storage helps improve performance because if multiple
patient records have the same associated diagnoses, the combination of diagnoses appears
only once instead of many times.
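The deduplication described above can be sketched in a few lines. This is only an illustration, using Python with an in-memory SQLite database as a stand-in for Caboodle; the table and column names (DiagnosisBridge, DiagnosisComboKey, PatientRecord) are hypothetical and are not taken from the Caboodle schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical bridge: one row per (combination, diagnosis) pair.
    CREATE TABLE DiagnosisBridge (DiagnosisComboKey INTEGER, DiagnosisName TEXT);
    -- Hypothetical record table: each row points at one combination.
    CREATE TABLE PatientRecord (PatientName TEXT, DiagnosisComboKey INTEGER);

    -- Combination 1 = {Asthma, Diabetes}; combination 2 = {Asthma}.
    INSERT INTO DiagnosisBridge VALUES (1, 'Asthma'), (1, 'Diabetes'), (2, 'Asthma');
    -- Patients A and B share combination 1, so it is stored only once.
    INSERT INTO PatientRecord VALUES ('Patient A', 1), ('Patient B', 1), ('Patient C', 2);
""")

# Resolve each patient's diagnoses through the bridge.
rows = conn.execute("""
    SELECT pr.PatientName, db.DiagnosisName
    FROM PatientRecord AS pr
    JOIN DiagnosisBridge AS db ON db.DiagnosisComboKey = pr.DiagnosisComboKey
    ORDER BY pr.PatientName, db.DiagnosisName
""").fetchall()

bridge_row_count = conn.execute("SELECT COUNT(*) FROM DiagnosisBridge").fetchone()[0]
print(rows)
print(bridge_row_count)
```

Three bridge rows are enough to describe five patient-diagnosis pairs here, because the shared combination is stored once.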
(Dimensional Modeling Terms) Data Mart - ANS-A data mart is a collection of Caboodle
data on a topic. Some data marts serve as comprehensive data sources for a particular
reporting need, making it easier for business intelligence developers to write reports. For
example, the HospitalReadmissionAnalyticsDataMart contains all the data relevant for
predicting the likelihood of a future hospital readmission. Data marts are very similar to fact
tables because they both store data about significant, measurable events and refer to data in
dimension tables. However, data marts can do direct key lookup to retrieve data, which
means they can link to other tables using a single Key row instead of two rows for the ID and
ID type.
(Dimensional Modeling Terms) Data model component (DMC) - ANS-A DMC is a
Caboodle-specific concept that refers to a table in the Caboodle reporting database and its
supporting infrastructure. DMCs can be facts, dimensions, bridges, data marts, or
informational tables.
(Dimensional Modeling Terms) Dimension - ANS-A dimension is a table in Caboodle that
contains attributes describing one or more facts. Dimensions are joined to facts to provide
descriptive information about them. For example, PatientDim contains patient information
that applies to many DMCs.
(Dimensional Modeling Terms) Fact - ANS-Fact tables hold all the measures in Caboodle.
These are the primary tables that contain the many rows of data created in the source
system, such as individual encounters, orders, and transactions. These facts are typically
joined to multiple dimensions to add extra information that helps define the facts. These
clusters of facts with multiple dimensions make up the star schema on which a dimensional
data warehouse is built. In general, fact tables in Caboodle contain numeric data and link to
dimensions and profile dimensions for other information.
(ETL Terms) Execution - ANS-An execution is the process that extracts data from a source
system using packages, transforms the data in the staging database, and loads it to
Caboodle for reporting. You create and run executions in the Caboodle Console.
(ETL Terms) Extract - ANS-Extracts to Caboodle from Clarity can be either backfill or
incremental. Backfill extracts load or reload every row in a table from Clarity, whereas
incremental extracts load only changed rows. Existing data is available while extracts are in
progress.
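The difference between the two extract types can be sketched as follows. This is a toy Python model and not how Caboodle implements extracts: the dictionaries stand in for a Clarity table and its Caboodle copy, and the `changed_ids` argument is an assumption about how changed rows might be identified.

```python
def backfill(source):
    """Reload every row from the source, replacing the target entirely."""
    return dict(source)


def incremental(target, source, changed_ids):
    """Load only the rows flagged as changed, leaving the rest untouched."""
    updated = dict(target)
    for row_id in changed_ids:
        updated[row_id] = source[row_id]
    return updated


# Hypothetical source table keyed by row ID.
source = {1: "Alice", 2: "Bob", 3: "Carol"}

# The first load copies everything.
target = backfill(source)

# One row changes upstream; only that row is reloaded.
source[2] = "Robert"
target = incremental(target, source, changed_ids=[2])
print(target)
```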
(ETL Terms) Package - ANS-A package is a definition of an extract of data from one specific
source to a specific import table. For example, a fact might have packages for Epic inpatient
data, Epic outpatient data, and several non-Epic data sources. Packages are defined in
SSIS .dtsx files.
(General Modeling Terms) Business Key - ANS-A business key is an identifier for a record
in Caboodle. The business key is a combination of columns that identifies a record, and is based
on source system identifiers. For example, a business key could consist of a patient's ID
number and an ID type column identifying the number as an MRN. In other words, the
business key is tied to the actual extracted data, unlike a surrogate key. For reporting
efficiency, durable and surrogate keys should be used in reports instead of business keys.
(General Modeling Terms) Caboodle contains rows with -1, -2, and -3 as the primary keys, to
indicate data that is unspecified, not applicable, or deleted. - ANS-When a foreign key value
is expected but its value is null in the source system, the foreign key is said to be
unspecified. In this case, the corresponding foreign key in Caboodle is set to -1. For
example, suppose a patient hasn't been assigned a primary care physician. When the
patient's record is loaded into Caboodle, the PrimaryCareProviderKey column is set to -1.
That column would then join to the -1 row in the ProviderDim table.
Foreign keys also refer to the -1 row for inferred data. For example, suppose your
organization hires a new physician, and that physician is assigned as a patient's primary
care provider. If the patient's record is loaded into Caboodle before the new provider's
record, it contains a foreign key reference to a row that doesn't yet exist, but is expected to
be loaded eventually. In this case, Caboodle makes a new inferred row in ProviderDim for
the expected provider. When the new provider's information is eventually loaded into
Caboodle, its values overwrite the default values that were assigned to the inferred row.
When a foreign key doesn't apply for a certain row, the foreign key is said to be not
applicable. In this case, the corresponding foreign key in Caboodle is set to -2. For example,
BillingAccountFact contains information related to both hospital billing accounts and
professional billing accounts. Only hospital billing accounts have admitting providers for the
hospital encounter, so rows for professional billing accounts have AdmittingProviderKey set
to -2.
When a row in Caboodle is marked as deleted, the surrogate key isn't hard-deleted from
your database. Instead, all strings are replaced with *Deleted, foreign keys are set to -3, and
other columns have null values.
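The unspecified case can be sketched like this. It is a minimal illustration in Python with an in-memory SQLite database standing in for Caboodle: the `to_foreign_key` helper, the sample rows, and the '*Unspecified' label on the -1 row are assumptions made for the example, and only the -1 key value and the PrimaryCareProviderKey and ProviderDim names come from the card above.

```python
import sqlite3

UNSPECIFIED_KEY = -1


def to_foreign_key(source_value):
    """Map a null source value to the unspecified key instead of storing NULL."""
    return UNSPECIFIED_KEY if source_value is None else source_value


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProviderDim (ProviderKey INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PatientDim (
        PatientKey INTEGER PRIMARY KEY,
        Name TEXT,
        PrimaryCareProviderKey INTEGER NOT NULL
    );
    -- The special -1 row exists in the dimension so that joins always match.
    INSERT INTO ProviderDim VALUES (-1, '*Unspecified'), (10, 'Dr. Example');
""")

# Patient B has no primary care provider in the (hypothetical) source data.
source_patients = [(1, "Patient A", 10), (2, "Patient B", None)]
conn.executemany(
    "INSERT INTO PatientDim VALUES (?, ?, ?)",
    [(key, name, to_foreign_key(pcp)) for key, name, pcp in source_patients],
)

# An inner join keeps Patient B because the -1 key matches the -1 row.
joined = conn.execute("""
    SELECT pat.Name, prov.Name
    FROM PatientDim AS pat
    JOIN ProviderDim AS prov ON prov.ProviderKey = pat.PrimaryCareProviderKey
    ORDER BY pat.PatientKey
""").fetchall()
print(joined)
```

Because the key is never null, an inner join does not silently drop the patient without a provider.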
(General Modeling Terms) Data Lineage - ANS-The data lineage is a precise technical
explanation of where data in a package is extracted from in a source database. This
explanation can be as simple as a source table and column in Clarity, or much more logically
complex, as in the following example (see doc @
https://galaxy.epic.com/?#Browse/page=1!68!50!3517622&from=Galaxy-Redirect).
(General Modeling Terms) Durable Keys - ANS-A durable key is an identifier for a record in
Caboodle, shared across rows for the same record. Unlike a business key, a durable key
consists of a single column and isn't derived or extracted from the source data. For example,
because patient data is tracked over time, a single patient record might have multiple rows
associated with it in PatientDim. Each row has the same durable key to make it easy to find
all rows with a particular patient. Durable keys are used only in tables that store snapshot
(type 2) data.
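A small sketch of how a durable key groups rows, again using Python with an in-memory SQLite database as a stand-in. The DurableKey and Address column names and all sample values are assumptions for illustration; only PatientDim and the idea of a shared durable key come from the card above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientDim (
        PatientKey INTEGER PRIMARY KEY,  -- surrogate key, unique per row
        DurableKey INTEGER,              -- shared by all rows for one patient
        Address TEXT
    );
    -- Two rows for the same patient, plus one row for a different patient.
    INSERT INTO PatientDim VALUES
        (101, 101, '1 Old St'),
        (102, 101, '2 New Ave'),
        (103, 103, '9 Elm Rd');
""")

# All rows for one patient, found through the single durable key column.
patient_rows = conn.execute("""
    SELECT pat.PatientKey, pat.Address
    FROM PatientDim AS pat
    WHERE pat.DurableKey = 101
    ORDER BY pat.PatientKey
""").fetchall()
print(patient_rows)
```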
(General Modeling Terms) Foreign Keys - ANS-A foreign key is the primary key for a row in
a different table in Caboodle, used to link two tables together. For example,
ProcedureOrderFact contains a PatientKey column, which corresponds to the surrogate key
in PatientDim. Business intelligence developers use foreign keys to write reports using data
from multiple tables.
For ease and efficiency of report writing, foreign key columns in Caboodle have referential
integrity. In other words, foreign key columns never contain null or unmatched values.
(General Modeling Terms) Snapshot data - ANS-A snapshot column tracks changes to data
in the column by creating a new instance of the row each time data in the column changes.
A snapshot DMC contains one or more snapshot columns. For example, if you use snapshot
data for columns related to a patient's address, a new row for the patient is created each
time the patient's address changes.
If a column or DMC isn't marked as snapshot, it doesn't track changes over time and stores
only the most recent information for a record. If the data changes, it overwrites the previous
value. For example, if you don't track changes to a column for a patient's Social Security
number, any changes to a patient's SSN overwrite the previous value in existing rows for the
patient.
Snapshot data represents information at a particular point in time. The dates associated with
this snapshot aren't clinically relevant because the dates correspond with when the data was
loaded into Caboodle, not when an event actually occurred or was changed in an upstream
system like Chronicles. When you write reports, never treat snapshot data as historical
clinical data.
Prior to August 2020, snapshot data was called type 2 data, and other data was called type 1
data.
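The two behaviors (new row for a snapshot column, overwrite for any other column) can be sketched in plain Python. This is a toy model rather than Caboodle's load logic; the `new_row` and `apply_change` helpers, the Key and DurableKey field names, and the sample values are all hypothetical.

```python
from itertools import count

_next_key = count(1)  # hands out a fresh surrogate key for every new row


def new_row(durable_key, **attrs):
    """Build a row with its own surrogate key and the record's durable key."""
    return {"Key": next(_next_key), "DurableKey": durable_key, **attrs}


def apply_change(rows, durable_key, column, value, snapshot_columns):
    """Apply one column change to the rows belonging to one record."""
    matching = [r for r in rows if r["DurableKey"] == durable_key]
    if column in snapshot_columns:
        # Snapshot column: keep the old row and append a new instance.
        latest = matching[-1]
        attrs = {k: v for k, v in latest.items() if k not in ("Key", "DurableKey")}
        attrs[column] = value
        rows.append(new_row(durable_key, **attrs))
    else:
        # Non-snapshot column: overwrite the value in every existing row.
        for r in matching:
            r[column] = value


rows = [new_row(1, Address="1 Old St", SSN="000-00-0001")]
apply_change(rows, 1, "Address", "2 New Ave", snapshot_columns={"Address"})
apply_change(rows, 1, "SSN", "000-00-0002", snapshot_columns={"Address"})
for r in rows:
    print(r)
```

After both changes the patient has two rows with different addresses, and both rows carry the latest SSN.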
(General Modeling Terms) Surrogate key - ANS-A surrogate key is a unique identifier for a
row in Caboodle, and is also a table's primary key. Unlike a business key, a surrogate key
isn't derived or extracted from the source data. Instead, it's applied in Caboodle. Unlike a
durable key, each row has a different surrogate key.
(General Reporting Tips) Add a filter to most queries to exclude Caboodle's special rows for
unspecified, not applicable, and deleted records, which have surrogate keys of -1, -2, and -3
- ANS-Include only rows where the key is greater than 0.
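As a minimal sketch of that filter, the query below keeps only rows whose key is greater than 0. It runs against an in-memory SQLite database from Python; the sample provider rows and the labels on the special rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProviderDim (ProviderKey INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO ProviderDim VALUES
        (-1, '*Unspecified'),
        (-2, '*Not Applicable'),
        (-3, '*Deleted'),
        (10, 'Dr. Example'),
        (11, 'Dr. Sample');
""")

# Keep only real records by excluding the special rows with keys -1, -2, and -3.
real_providers = conn.execute("""
    SELECT prov.ProviderKey, prov.Name
    FROM ProviderDim AS prov
    WHERE prov.ProviderKey > 0
    ORDER BY prov.ProviderKey
""").fetchall()
print(real_providers)
```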
(General Reporting Tips) Caboodle has a numbers table, NumbersDim, that you can use as
needed in your reports - ANS-NumbersDim contains the integers from 1 to 1,000,000, which
you can reference to help manipulate strings and complete other processes. If you need
more than 1,000,000 rows to accomplish a task, you can refer to NumbersDim multiple times
in your query.
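The sketch below shows both uses on a small scale: splitting a string into characters with a numbers table, and referring to the table twice to get more rows than it contains. It uses Python with an in-memory SQLite database; the stand-in NumbersDim holds only 1 to 10, and its column name, Number, is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NumbersDim (Number INTEGER PRIMARY KEY)")
# Small stand-in: the integers 1 to 10 instead of 1 to 1,000,000.
conn.executemany("INSERT INTO NumbersDim VALUES (?)", [(n,) for n in range(1, 11)])

# String manipulation: one output row per character position.
word = "CABOODLE"
chars = [
    row[0]
    for row in conn.execute(
        """
        SELECT substr(?, n.Number, 1)
        FROM NumbersDim AS n
        WHERE n.Number <= length(?)
        ORDER BY n.Number
        """,
        (word, word),
    )
]

# Referring to the table twice multiplies the available rows (10 x 10 here).
pair_count = conn.execute(
    "SELECT COUNT(*) FROM NumbersDim AS a CROSS JOIN NumbersDim AS b"
).fetchone()[0]

print(chars)
print(pair_count)
```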
(General Reporting Tips) If a query refers to more than one table, all columns should be
prefixed by a descriptor (table name or alias) - ANS-Using descriptors ensures you have
unambiguous column references, preventing issues that can occur when two tables contain
columns with the same name.
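The problem with unprefixed columns can be reproduced in a small sketch. It uses Python with an in-memory SQLite database, so the exact error behavior is SQLite's and may differ from the database that hosts Caboodle; the Name columns and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientDim (PatientKey INTEGER, Name TEXT, PrimaryCareProviderKey INTEGER);
    CREATE TABLE ProviderDim (ProviderKey INTEGER, Name TEXT);
    INSERT INTO PatientDim VALUES (1, 'Patient A', 10);
    INSERT INTO ProviderDim VALUES (10, 'Dr. Example');
""")

# Without a prefix, "Name" could mean either table's column.
ambiguous_error = None
try:
    conn.execute("""
        SELECT Name
        FROM PatientDim
        JOIN ProviderDim ON PrimaryCareProviderKey = ProviderKey
    """)
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# With an alias on every column, each reference is unambiguous.
qualified = conn.execute("""
    SELECT pat.Name, prov.Name
    FROM PatientDim AS pat
    JOIN ProviderDim AS prov ON prov.ProviderKey = pat.PrimaryCareProviderKey
""").fetchall()

print(ambiguous_error)
print(qualified)
```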
(General Reporting Tips) Starting in August 2019, add the following SQL command to the
top of every report: SET TRANSACTION ISOLATION LEVEL SNAPSHOT; - ANS-Running
your reports with snapshot isolation level ensures the reports and Caboodle executions don't
interfere with each other.
(General Reporting Tips) Use the views on the FullAccess schema. - ANS-This schema
simplifies your reports by aggregating the information you find in facts, profile dimensions,
and bridge tables. It also includes only the data that passes Caboodle's data quality checks,
which are available starting in August 2018. If you use the dbo schema, reports can include
data that's considered invalid.
Chapter 1. (After-Class Exercise) Caboodle is designed to... (Choose all that apply): -
ANS-A. Store Epic data
B. Store non-Epic data
C. Make reporting easy