QUESTIONS AND CORRECT ANSWERS
What are the limitations of the chosen approach? - Correct answers✔✔Built-in bias, e.g. voice
recognition not handling many accents, medical datasets skewed towards men
What should be remembered regarding ethics in DS outputs? - Correct answers✔✔Conclusions and
automated systems can be used for good or ill; professionals have responsibility for the uses
made of their work
What is an example of unethical behavior in DS? - Correct answers✔✔Volkswagen software
detecting emissions tests and switching to lower-polluting operation only during the test
What is the importance of professional ethics? - Correct answers✔✔Defines expected behavior,
required for membership in professional associations
Who are the stakeholders in data science? - Correct answers✔✔Clients/employers, subjects
described by data, workers whose jobs might be displaced, end-users whose environment might
be shaped
What are the principles of computing professional ethics? - Correct answers✔✔Contribute to
society, avoid harm, be fair, respect privacy, respect others' work, work within competence,
prioritize public good
Why does data quality matter? - Correct answers✔✔Data quality is essential for useful results.
What does 'garbage in, garbage out' mean? - Correct answers✔✔'Garbage in, garbage out' means
that if the input data is of poor quality, the output results will also be of poor quality.
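A minimal sketch of 'garbage in, garbage out', using made-up numbers: the readings and the -999.0 placeholder for a missing value are assumptions for illustration only. One bad input is enough to make the average meaningless.

```python
from statistics import mean

# Hypothetical temperature readings; -999.0 is an assumed placeholder for "missing".
readings = [21.5, 22.0, -999.0, 21.0, 22.5]

# Garbage in: the placeholder is treated as if it were a real measurement.
naive_mean = mean(readings)

# Same calculation after dropping the placeholder values.
valid = [r for r in readings if r != -999.0]
clean_mean = mean(valid)

print(f"mean with placeholder included: {naive_mean:.2f}")
print(f"mean with placeholder removed:  {clean_mean:.2f}")
```

The analysis step is identical in both cases; only the quality of the input differs.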
Why is it important to have good data sources? - Correct answers✔✔Good data sources ensure
that the whole provenance chain is trusted and that the data is representative of the actual
domain.
What is the process of cleaning data? - Correct answers✔✔Cleaning data involves improving
data quality as much as possible for specific uses.
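As a rough illustration of cleaning for a specific use, the sketch below tidies a column of ages. The function name `clean_ages`, the sample values, and the 0 to 120 plausibility range are all assumptions chosen for this example; a different use of the same data might call for different rules.

```python
def clean_ages(raw_values):
    """Return plausible ages as ints, skipping blanks and invalid entries."""
    cleaned = []
    for value in raw_values:
        text = str(value).strip()      # remove stray whitespace
        if not text.isdecimal():
            continue                   # blank, "n/a", negative, or non-numeric
        age = int(text)
        if 0 <= age <= 120:            # assumed plausibility range for this use
            cleaned.append(age)
    return cleaned


# Hypothetical raw column with typical problems mixed in.
raw = [" 34", "n/a", "", "250", "41 ", "-5", 29]
print(clean_ages(raw))  # [34, 41, 29]
```

Silently dropping rows is itself a choice that affects results, so in practice it helps to record how many entries were removed and why.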
Why is it easier to automate analysis in Python than in Excel? - Correct answers✔✔Python
allows for easier automation and testing compared to Excel, which often contains unnoticed
errors in formulas and data.
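To make the automation and testing point concrete, here is a small sketch: a calculation that might otherwise sit in a spreadsheet formula is written as a function, with checks that can be rerun whenever the code changes. The names `percent_change` and `test_percent_change` are hypothetical, not taken from any course material.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) / old * 100


def test_percent_change():
    # Known cases worked out by hand.
    assert percent_change(50, 75) == 50.0
    assert percent_change(200, 100) == -50.0
    # The undefined case should fail loudly rather than return a wrong number.
    try:
        percent_change(0, 10)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for old == 0")


test_percent_change()
print("all checks passed")
```

A spreadsheet formula copied down a column has no equivalent of this rerunnable check, which is one reason errors there can go unnoticed.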
Why is it important to know the limitations of the techniques used in data analysis? - Correct
answers✔✔Knowing the limitations helps in checking the applicability of the techniques and
ensures accurate results.
What is the significance of scatterplots in data analysis? - Correct answers✔✔Scatterplots with
all the data are helpful in spotting overall trends and patterns between attributes.
Why should one be skeptical and ready to step away from initial findings in data analysis? -
Correct answers✔✔Being skeptical helps in avoiding biased or incorrect conclusions and
promotes objective analysis.
Why is it important to be open about limitations when communicating results? - Correct
answers✔✔Being transparent about limitations prevents overclaiming and ensures accurate
interpretation of the results.
What is asset management in data science? - Correct answers✔✔Asset management involves
effectively managing project assets (data and code) to ensure quality outcomes, confidence in
results, and privacy and confidentiality.