E370 Exam 1 2023 with verified questions and answers
Frequency distribution: shows the number of data observations that fall into specific intervals
Statistics: the mathematical science that deals with the collection, analysis, and presentation of data, which can then be used as a basis for inference and induction
Data set: all the data collected in a particular study
Data: values assigned to observations or measurements
Nominal data: arbitrary labels for data. No ranking allowed. EX: Zip codes (19808, 76137)
Ordinal data: ranking allowed. No measurable meaning to the number differences. EX: Education level (master's degree, doctorate degree)
Interval data: meaningful differences. No true zero point. EX: Calendar year (2009, 2010)
Ratio data: meaningful differences. True zero point. EX: Income ($48,000, $0)
Time series: data values that correspond to specific measurements taken over a range of time periods
Cross section: data values collected from a number of subjects during a single time period
Descriptive statistics: collecting, summarizing, and displaying data (reported based on observations)
Inferential statistics: making claims or conclusions about the data based on a sample (makes a statement about the population)
Discrete data: values based on observations that can be counted and are typically represented by whole numbers (how many); something that has been counted
Continuous data: values that can take on any real number, including numbers that contain decimal points (how much); usually measured rather than counted
Discrete or Continuous Data? Number of children per family: Discrete
Discrete or Continuous Data? Number of cars listed per insurance policy: Discrete
Discrete or Continuous Data? Time required to read chapter 2: Continuous
Discrete or Continuous Data? Vacation days per month: Discrete
Discrete or Continuous Data?
Thickness of paint applied to a car body: Continuous
Relative frequency distribution: displays the proportion of observations in each class relative to the total number of observations
How do you find the relative frequency? By dividing each frequency by the total number of observations
Cumulative relative frequency distribution: totals the proportion of observations that are less than or equal to the class at which you are looking (the cumulative relative frequency for the highest class is equal to 1.00)
Histogram: a graph showing the number of observations in each class of a frequency distribution (basically a bar graph)
How can you determine the number of classes in a frequency distribution? Choose the smallest k such that 2^k ≥ n, where k = number of classes and n = number of data points
Class boundaries: represent the minimum and maximum values for each class
Qualitative data: values that are categorical
•Can be nominal or ordinal measurement level
•Describe a characteristic, such as gender or level of education
Pie chart: a tool for comparing proportions for qualitative data
Central tendency: a single value used to describe the center point of a data set (mean, median, mode)
Mean: the average of all the data values and the most common measure of central tendency
How do you find the mean?
By adding all the values in a data set and then dividing the result by the number of observations
Advantages of using the mean to summarize data:
•Simple to calculate
•Summarizes the data with a single value
Disadvantages of using the mean to summarize data:
•With only a summary value, you lose information about the original data
•The value of the mean is sensitive to outliers (values that are much higher or lower than most of the data)
Median: the value in the data set for which half the observations are higher and half are lower
Mode: the value that appears most often in a data set (the value that occurs with the greatest frequency)
Symmetric distribution shape: Mean = Median
Left-skewed distribution shape: Mean < Median
Right-skewed distribution shape: Mean > Median
Measures of variability: show how much spread is present in the data
Range: highest value - lowest value
Sample variance: denoted by s^2; the sum of the squared differences between each data value and the sample mean, divided by n - 1
Sample standard deviation: the square root of the sample variance
Coefficient of variation (CV): measures the SD in terms of its percentage of the mean and indicates how large the SD is in relation to the mean
•A high CV indicates high variability relative to the mean
•A low CV indicates low variability relative to the mean
Coefficient of variation equation: CV = (s / x̄)(100), where s = the sample standard deviation and x̄ = the sample mean
z-score: identifies the number of standard deviations a particular value is from the mean of its distribution
Empirical Rule: Approximately ___% of the values fall within ±1 standard deviation of the mean: 68%
Empirical Rule: Approximately ___% of the values fall within ±2 standard deviations of the mean: 95%
Empirical Rule: Approximately ___% of the values fall within ±3 standard deviations of the mean: 99.7%
Z-score equation: x = x̄ + zs (equivalently, z = (x - x̄) / s), where z = the number of standard deviations above (+) or below (-) the mean and s = the given standard deviation
Percentiles: measure the approximate percentage of values in the data set that are below the value of interest
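The measures of central tendency and variability defined above can be tried out on a small data set. The following is a minimal sketch using Python's standard statistics module; the data values and the helper name z_score are invented for illustration and do not come from the exam.

```python
import statistics

# Hypothetical sample data, invented for illustration (not from the exam).
data = [4, 8, 6, 5, 3, 8, 9, 5, 8, 4]

mean = statistics.mean(data)          # sum of the values / number of observations
median = statistics.median(data)      # middle of the ranked data (average of the two middle values here)
mode = statistics.mode(data)          # value that occurs with the greatest frequency
data_range = max(data) - min(data)    # highest value - lowest value
variance = statistics.variance(data)  # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(data)      # sample standard deviation s = sqrt(s^2)
cv = std_dev / mean * 100             # coefficient of variation, as a percentage


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


z = z_score(9, mean, std_dev)

print(f"mean={mean}, median={median}, mode={mode}, range={data_range}")
print(f"variance={variance:.3f}, std dev={std_dev:.3f}, CV={cv:.1f}%")
print(f"z-score of 9: {z:.3f}")
```

Rearranging the z-score gives back the card's equation x = x̄ + zs, so mean + z * std_dev recovers the original value 9.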
Quartiles: split the ranked data into 4 equal groups
•The first quartile (Q1) is the value that constitutes the 25th percentile
•The second quartile (Q2) is the value that constitutes the 50th percentile
•Second quartile (the 50th percentile) = Median
•The third quartile (Q3) is the value that constitutes the 75th percentile
On a scatter plot, what variable is on the x-axis? Independent variable
On a scatter plot, what variable is on the y-axis? Dependent variable
Sample covariance: measures the direction of the linear relationship between two variables
Sample correlation coefficient: measures both the strength and direction of the linear relationship between two variables
What is the range of values of sample correlation coefficients? From -1.0 (a perfect negative linear relationship) to +1.0 (a perfect positive linear relationship)
In a questionnaire, respondents are asked to mark their gender as male or female. Gender is an example of the... nominal scale
Data obtained from a nominal scale... can be either numeric or nonnumeric
All the data collected in a particular study are referred to as the... data set
Quantitative data... must be numeric
________ data use descriptive terms to measure or classify something of interest: qualitative
_________ is the process of drawing inferences about the population based on the information taken from the sample: inferential statistics
In a sample of 800 students in a university, 240, or 30%, are Business majors. The 30% is an example of... descriptive statistics
A frequency distribution is a tabular summary of data showing the... number of items in several classes
What shows the proportion of data items with values less than or equal to the upper limit of each class? A cumulative relative frequency distribution
The sum of the relative frequencies for all classes will always equal... one
Since the population is always larger than the sample, the value of the sample mean...
could be larger than, equal to, or smaller than the true value of the population mean
If a data set has an even number of observations, the median... is the average value of the two middle items when all items are arranged in ascending order
During a cold winter, the temperature stayed below zero for ten days (ranging from -20 to -5). The variance of the temperatures of the ten-day period... must be positive
The standard deviation of a sample of 100 elements taken from a very large population is determined to be 60. The variance of the population... can be any value greater than or equal to zero
What measure is found by first determining the index point, i = 0.5(n), where n is the number of data points? Median
Which measure would you use to describe central tendency for categorical data? Mode
When outliers are present in the data set, which measure is the best to describe central tendency in the data? Median
Comparing the consistency (variability) between two data sets when their means are very different is best done with the... coefficient of variation
Which of the following tools provides a format to display observations that have more than one value associated with them? Contingency tables
The sample ____________ measures the direction of the linear relationship between two variables but does not measure the strength of the relationship: covariance
The sample correlation coefficient can be equal to any value... between -1.0 and +1.0
Positive values of covariance indicate... a positive relationship between the independent and the dependent variables
A numerical measure of linear association between two variables is the... correlation coefficient
Suppose that you conduct a study in which you observe parents and their children interacting at home. You find that the more supportive parents are, the less aggressive their children are. What conclusion can you make? Level of support and aggression are negatively correlated
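The difference between covariance (direction only) and the correlation coefficient (direction and strength) is easier to see with numbers. The sketch below assumes the usual textbook formulas, which the cards themselves do not spell out: the sample covariance divides the sum of cross-products of deviations by n - 1, and the correlation coefficient divides the covariance by the product of the two sample standard deviations. The data and the function names sample_covariance and sample_correlation are hypothetical, loosely modeled on the parental support example.

```python
import math

# Hypothetical paired observations, invented for illustration (not from the exam).
support = [1, 2, 3, 4, 5]      # x: independent variable (parental support score)
aggression = [9, 7, 6, 4, 2]   # y: dependent variable (child aggression score)


def sample_covariance(x, y):
    """Sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1 (assumed textbook formula)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its sample variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


cov = sample_covariance(support, aggression)
r = sample_correlation(support, aggression)

print(f"covariance = {cov:.2f}")   # the sign gives the direction only
print(f"correlation = {r:.4f}")    # always between -1.0 and +1.0
```

Here the covariance is negative, which only says the relationship runs downward. The correlation coefficient is close to -1.0, which adds that the linear relationship is strong.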
Written for
- Institution: E370
- Course: E370
Document information
- Uploaded on: April 6, 2023
- Number of pages: 6
- Written in: 2022/2023
- Type: Exam (worked answers)
- Contains: Questions and answers