A. Constructing and interpreting graphical displays of distributions of univariate data (dotplot, stemplot, histogram, cumulative frequency plot)
   1. Center = location. Spread = variability.
   2. Clusters are isolated groups of data points. Gaps are regions of the distribution that contain no data values.
   3. Outliers are extreme values: data points that lie significantly outside the other values in a data set. Unusual features are gaps and clusters.
   4. Shape = the overall pattern of the distribution of the data.
B. Summarizing distributions of univariate data
   1. Mean = add up the data values and divide by the number of data values. Median = list the data values in order and locate the middle value (with an even number of values, average the two middle values).
   2. Range = maximum – minimum. Interquartile range (IQR) = Q3 – Q1. Standard deviation is, roughly, the typical distance that values fall from the mean.
   3. Q1 (lower quartile) is the 25th percentile of the ordered data, or the median of the lower half of the ordered data. Median (Q2) is the 50th percentile of the ordered data. Q3 (upper quartile) is the 75th percentile of the ordered data, or the median of the upper half of the ordered data. A z-score is a standardized measure of how many standard deviations a data value falls from the center (mean).
   4.
   5. Add or subtract: if you add a constant to (or subtract a constant from) each value in the data set, then
      - the new location summary statistics (mean, median, min, max, Q1, and Q3) shift by that same constant relative to the old location summary statistics.
      - the new variation (spread) summary statistics (standard deviation, range, interquartile range) do NOT change from the old variation summary statistics.
      Measurements of variation are not affected by addition (or subtraction) of a constant.
      Multiply or divide: if you multiply or divide each value in the data set by a constant, then
      - the new location summary statistics (mean, median, min, max, Q1, and Q3) change by the same multiplication (or division) relative to the old location summary statistics.
      - the new variation (spread) summary statistics (standard deviation, range, interquartile range) also change by the same multiplication (or division) relative to the old variation summary statistics. (For a negative constant, spread changes by the constant's absolute value.)
C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots)
   1. Center is the mean (average). Spread is how far the data points that make up that mean lie from it.
   2. Gaps are regions of the distribution that contain no data values. Clusters are isolated groups of data points.
   3. Outliers are extreme values: data points that lie significantly outside the other values in a data set. Unusual features are gaps and clusters.
   4. Shape = the overall pattern of the distribution of the data.
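The summary statistics in section B, and the effect of adding or multiplying by a constant, can be checked numerically. The sketch below is only an illustration under stated assumptions: the data values are made up, the helper names `summarize` and `z_score` are hypothetical, quartiles use the "median of the lower/upper half" rule (leaving out the middle value when the count is odd), and the standard deviation is the sample version (dividing by n - 1).

```python
import math
from statistics import mean, median, stdev

def summarize(data):
    """Return location and spread summaries for a list of numbers."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # lower half of the ordered data
    upper = xs[half + n % 2:]     # upper half (middle value skipped if n is odd)
    q1, q3 = median(lower), median(upper)
    return {
        "mean": mean(xs), "median": median(xs),
        "min": xs[0], "max": xs[-1], "q1": q1, "q3": q3,
        "range": xs[-1] - xs[0], "iqr": q3 - q1,
        "sd": stdev(xs),          # sample standard deviation (assumption)
    }

def z_score(x, data):
    """Number of standard deviations that x lies from the mean of data."""
    return (x - mean(data)) / stdev(data)

scores = [2, 4, 4, 5, 7, 9, 11]                  # made-up example data
base = summarize(scores)
shifted = summarize([x + 10 for x in scores])    # add a constant
scaled = summarize([x * 3 for x in scores])      # multiply by a constant

for name, stats in [("original", base), ("+10", shifted), ("x3", scaled)]:
    print(name, stats)

# Adding 10 moves every location statistic by 10 and leaves spread alone;
# multiplying by 3 multiplies both location and spread statistics by 3.
print(math.isclose(shifted["sd"], base["sd"]))
print(math.isclose(scaled["sd"], 3 * base["sd"]))
```

Comparing the three printed summaries side by side shows the pattern in item B.5: location statistics follow the constant in both cases, while the spread statistics respond only to multiplication or division.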