# Descriptive Statistics Calculator

## Calculator Use

This calculator generates descriptive statistics for a data set. Enter data values separated by commas or spaces. You can also copy and paste data from spreadsheets or text documents. See allowable data formats in the table below.

## What are Descriptive Statistics?

Descriptive statistics summarize certain aspects of a data set or a population using numeric calculations. Examples of descriptive statistics include:

- mean, average
- midrange
- standard deviation
- quartiles

## Descriptive Statistics Formulas and Calculations

This calculator uses the formulas and methods below to find the statistical values listed.

### Minimum

Ordering a data set \( x_1 \le x_2 \le x_3 \le \dots \le x_n \) from lowest to highest value, the minimum is the smallest value \( x_1 \).

### Maximum

Ordering a data set \( x_1 \le x_2 \le x_3 \le \dots \le x_n \) from lowest to highest value, the maximum is the largest value \( x_n \).

### Range

The range of a data set is the difference between the minimum and maximum.

\[ \text{Range} = x_n - x_1 \]

### Sum

The sum is the total of all data values \( x_1 + x_2 + x_3 + \dots + x_n \).
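The minimum, maximum, range, and sum defined so far can be sketched in a few lines of Python. This is an illustrative sketch rather than the calculator's own code; the function name `basic_summary` is hypothetical, and the sample values are the ones used in this page's input examples.

```python
def basic_summary(data):
    """Return the minimum, maximum, range, and sum of a non-empty data set."""
    if not data:
        raise ValueError("data set must contain at least one value")
    ordered = sorted(data)      # x_1 <= x_2 <= ... <= x_n
    minimum = ordered[0]        # x_1
    maximum = ordered[-1]       # x_n
    return {
        "min": minimum,
        "max": maximum,
        "range": maximum - minimum,
        "sum": sum(ordered),
    }


summary = basic_summary([42, 54, 65, 47, 59, 40, 53])
print(summary)  # {'min': 40, 'max': 65, 'range': 25, 'sum': 360}
```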

### Size, Count

Size or count is the number of data points in a data set.

\[ \text{Size} = n = \text{count}(x_i)_{i=1}^{n} \]

### Mean

The mean of a data set is the sum of all of the data divided by the size. The mean is also known as the average.

For a Population

\[ \mu = \dfrac{\sum_{i=1}^{n}x_i}{n}\]

For a Sample

\[ \overline{x} = \dfrac{\sum_{i=1}^{n}x_i}{n}\]

### Median

Ordering a data set \( x_1 \le x_2 \le x_3 \le \dots \le x_n \) from lowest to highest value, the median is the numeric value separating the upper half of the ordered sample data from the lower half. If \( n \) is odd, the median is the center value. If \( n \) is even, the median is the average of the 2 center values.

If \( n \) is odd, the median is the value at position \( p \) where

\[ p = \dfrac{n + 1}{2}, \quad \text{Median} = x_p \]

If \( n \) is even, the median is the average of the values at positions \( p \) and \( p + 1 \) where

\[ p = \dfrac{n}{2}, \quad \text{Median} = \dfrac{x_p + x_{p+1}}{2} \]
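The odd and even cases can be sketched as follows. This is a minimal illustration, not the calculator's implementation; it assumes 1-based positions (n + 1)/2 for odd n and n/2, n/2 + 1 for even n, which map to the 0-based indices shown in the comments.

```python
def median(data):
    """Median of a non-empty data set."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the values at 1-based positions n / 2 and n / 2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([42, 54, 65, 47, 59, 40, 53]))  # 53
print(median([1, 2, 3, 4]))                  # 2.5
```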

### Mode

The mode is the value or values that occur most frequently in the data set. A data set can have more than one mode, and it can also have no mode.
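A short sketch of finding the mode by counting occurrences is shown below. It assumes one particular convention for "no mode": a data set in which no value repeats has no mode. Other conventions exist, and the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(data):
    """Return all modes in ascending order, or [] when there is no mode."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumption: if no value repeats, the data set has no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 3]))                # [2, 3]
print(modes([42, 54, 65, 47, 59, 40, 53]))   # []
```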

### Standard Deviation

Standard deviation is a measure of dispersion of data values from the mean. It is the square root of the sum of squared differences from the mean divided by the size of the data set \( n \) for a population, or by \( n - 1 \) for a sample.

For a Population

\[ \sigma = \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \mu)^{2}}{n}} \]

For a Sample

\[ s = \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \overline{x})^{2}}{n - 1}} \]

### Variance

Variance measures dispersion of data from the mean. It is the sum of squared differences from the mean divided by the size of the data set \( n \) for a population, or by \( n - 1 \) for a sample.
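As a rough sketch, variance and standard deviation can be computed together, since the standard deviation is the square root of the variance. The code assumes a divisor of n for a population and n - 1 for a sample, matching the formulas in this section; the `sample` flag and the function names are illustrative choices.

```python
import math


def variance(data, sample=True):
    """Sample variance (divisor n - 1) or population variance (divisor n)."""
    n = len(data)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    mean = sum(data) / n
    squared_diffs = sum((x - mean) ** 2 for x in data)
    return squared_diffs / (n - 1) if sample else squared_diffs / n


def std_dev(data, sample=True):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(data, sample))


values = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(values, sample=False))  # 4.0
print(std_dev(values, sample=False))   # 2.0
```

Python's standard library offers `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev` for the same calculations.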

For a Population

\[ \sigma^{2} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{2}}{n} \]

For a Sample

\[ s^{2} = \dfrac{\sum_{i=1}^{n}(x_i - \overline{x})^{2}}{n - 1} \]

### Midrange

The midrange of a data set is the average of the minimum and maximum values.

\[ \text{MR} = \dfrac{x_{min} + x_{max}}{2} \]

### Quartiles

Quartiles separate a data set into four sections. The median is the second quartile \( Q_2 \). It divides the ordered data set into higher and lower halves. The first quartile, \( Q_1 \), is the median of the lower half not including \( Q_2 \). The third quartile, \( Q_3 \), is the median of the higher half not including \( Q_2 \). This is one of several methods for calculating quartiles.^{[1]}
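The median-of-halves method described above can be sketched as follows. This is only an illustration under the stated rule (when n is odd, the median itself is left out of both halves); other quartile methods give different results for the same data, and the helper names are hypothetical.

```python
def _median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so Q2 is in neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return _median(lower), _median(ordered), _median(upper)


print(quartiles([42, 54, 65, 47, 59, 40, 53]))  # (42, 53, 59)
```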

### Interquartile Range

The range from \( Q_1 \) to \( Q_3 \) is the interquartile range (IQR).

\[ IQR = Q_3 - Q_1 \]

### Outliers

Potential outliers are values that lie above the Upper Fence or below the Lower Fence of the sample set.
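A minimal sketch of the fence check is given below. It takes the quartiles as inputs and places the fences 1.5 × IQR beyond them; the function name and the `k` parameter are hypothetical, and the example quartiles were worked out by hand with the median-of-halves method.

```python
def find_outliers(data, q1, q3, k=1.5):
    """Return values below the lower fence or above the upper fence."""
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# For this data set Q1 = 44.5 and Q3 = 62, so IQR = 17.5 and the
# fences are 18.25 and 88.25.
sample = [40, 42, 47, 53, 54, 59, 65, 120]
print(find_outliers(sample, q1=44.5, q3=62))  # [120]
```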

\[ \text{Upper Fence} = Q_3 + 1.5 \times IQR \]

\[ \text{Lower Fence} = Q_1 - 1.5 \times IQR \]

### Sum of Squares

The sum of squares is the sum of the squared differences between data values and the mean.

For a Population

\[ SS = \sum_{i=1}^{n}(x_i - \mu)^{2} \]

For a Sample

\[ SS = \sum_{i=1}^{n}(x_i - \overline{x})^{2} \]

### Mean Absolute Deviation

Mean absolute deviation^{[2]} is the sum of the absolute value of the differences between data values and the mean, divided by the sample size.
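The sum of squares and the mean absolute deviation differ only in how each deviation from the mean is treated (squared versus absolute value) and in whether the total is divided by n. A small sketch, with hypothetical function names:

```python
def sum_of_squares(data):
    """Sum of squared differences between each value and the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def mean_absolute_deviation(data):
    """Sum of absolute differences from the mean, divided by n."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


values = [2, 4, 4, 4, 5, 5, 7, 9]       # mean is 5
print(sum_of_squares(values))           # 32.0
print(mean_absolute_deviation(values))  # 1.5
```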

For a Population

\[ MAD = \dfrac{\sum_{i=1}^{n}|x_i - \mu|}{n} \]

For a Sample

\[ MAD = \dfrac{\sum_{i=1}^{n}|x_i - \overline{x}|}{n} \]

### Root Mean Square

The root mean square describes the magnitude of a set of numbers. The formula for root mean square is the square root of the sum of the squared data values divided by \( n \).

\[ RMS = \sqrt{\dfrac{\sum_{i=1}^{n}x_i^{2}}{n}} \]
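As a quick illustration, the root mean square can be computed directly from that description. The function name is hypothetical.

```python
import math


def root_mean_square(data):
    """Square root of the mean of the squared data values."""
    n = len(data)
    if n == 0:
        raise ValueError("data set must contain at least one value")
    return math.sqrt(sum(x * x for x in data) / n)


print(root_mean_square([-2, 2]))  # 2.0
```

Because the values are squared before averaging, negative and positive values of the same size contribute equally.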

### Standard Error of the Mean

Standard error of the mean is calculated as the standard deviation divided by the square root of the count *n*.

For a Population

\[ {SE}_{\mu} = \dfrac{\sigma}{\sqrt{n}} \]

For a Sample

\[ {SE}_{\overline{x}} = \dfrac{s}{\sqrt{n}} \]

### Skewness

Skewness^{[3]} describes how far to the left or right a data set distribution is distorted from a symmetrical bell curve. A distribution with a long left tail is left-skewed, or negatively-skewed. A distribution with a long right tail is right-skewed, or positively-skewed.

For a Population

\[ \gamma_{1} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{3}}{n\sigma^{3}} \]

For a Sample

\[ \gamma_{1} = \dfrac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{3} \]

### Kurtosis

Kurtosis^{[3]} describes the extremeness of the tails of a population distribution and is an indicator of data outliers. High kurtosis means that a data set has tail data that is more extreme than a normal distribution. Low kurtosis means the tail data is less extreme than a normal distribution.

For a Population

\[ \beta_{2} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{4}}{n\sigma^{4}} \]

For a Sample

\[ \beta_{2} = \dfrac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{4} \]

### Kurtosis Excess

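Excess kurtosis measures kurtosis relative to a normal distribution. For a population it is the kurtosis minus 3, so a normal distribution has an excess kurtosis of 0. Positive excess kurtosis means the tails are heavier than those of a normal distribution, so extreme values (potential outliers) occur more often. Negative excess kurtosis means the tails are lighter.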
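The skewness, kurtosis, and excess kurtosis measures can be sketched in one function. This is an illustrative sketch that follows the population and sample formulas given on this page; the function name and the `sample` flag are hypothetical, and the sample form assumes at least four data points so that no denominator is zero.

```python
import math


def shape_statistics(data, sample=False):
    """Return (skewness, kurtosis, excess kurtosis) for a data set."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    if ss == 0:
        raise ValueError("all values are equal; shape statistics are undefined")
    if not sample:
        sigma = math.sqrt(ss / n)
        skew = sum((x - mean) ** 3 for x in data) / (n * sigma ** 3)
        kurt = sum((x - mean) ** 4 for x in data) / (n * sigma ** 4)
        return skew, kurt, kurt - 3
    if n < 4:
        raise ValueError("sample formulas need at least four data points")
    s = math.sqrt(ss / (n - 1))
    z3 = sum(((x - mean) / s) ** 3 for x in data)
    z4 = sum(((x - mean) / s) ** 4 for x in data)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
    excess = kurt - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return skew, kurt, excess


pop_skew, pop_kurt, pop_excess = shape_statistics([2, 4, 4, 4, 5, 5, 7, 9])
print(pop_skew, pop_kurt, pop_excess)
```

For the population example, the deviations from the mean of 5 give a skewness of 42/64 and a kurtosis of 356/128, so the data are mildly right-skewed with slightly lighter tails than a normal distribution.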

For a Population

\[ \alpha_{4} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{4}}{n\sigma^{4}} - 3 \]

For a Sample (MS Excel and Google Sheets report this value simply as kurtosis)

\[ \alpha_{4} = \dfrac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{4} - \dfrac{3(n-1)^{2}}{(n-2)(n-3)} \]

### Coefficient of Variation

The coefficient of variation describes dispersion of data around the mean. It is calculated as the ratio of the standard deviation to the mean.

For a Population

\[ CV = \dfrac{\sigma}{\mu} \]

For a Sample

\[ CV = \dfrac{s}{\overline{x}} \]

### Relative Standard Deviation

Relative standard deviation expresses the standard deviation as a percentage of the mean, describing how widely the data are spread relative to the mean. It is calculated as the standard deviation times 100 divided by the mean.
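Since the relative standard deviation is the coefficient of variation scaled to a percentage, the two can be sketched together. The function names and the `sample` flag are hypothetical; the code assumes a nonzero mean, because both ratios are undefined otherwise.

```python
import math


def coefficient_of_variation(data, sample=True):
    """Standard deviation divided by the mean."""
    n = len(data)
    mean = sum(data) / n
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    ss = sum((x - mean) ** 2 for x in data)
    divisor = n - 1 if sample else n   # n - 1 for a sample, n for a population
    return math.sqrt(ss / divisor) / mean


def relative_standard_deviation(data, sample=True):
    """Coefficient of variation expressed as a percentage."""
    return 100 * coefficient_of_variation(data, sample)


values = [2, 4, 4, 4, 5, 5, 7, 9]   # population: sigma = 2, mu = 5
cv = coefficient_of_variation(values, sample=False)
rsd = relative_standard_deviation(values, sample=False)
print(cv, rsd)
```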

For a Population

\[ RSD = \left[ \dfrac{100 \times \sigma}{\mu} \right] \% \]

For a Sample

\[ RSD = \left[ \dfrac{100 \times s}{\overline{x}} \right] \% \]

### Frequency

Frequency is the number of occurrences for each data value in the data set. Frequency is used to find the mode of a data set.
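A frequency table can be sketched with `collections.Counter` from the Python standard library. The function name `frequency_table` is hypothetical, and sorting by value is only a presentation choice.

```python
from collections import Counter


def frequency_table(data):
    """Return (value, count) pairs sorted by value."""
    return sorted(Counter(data).items())


table = frequency_table([3, 1, 3, 2, 3, 1])
print(table)  # [(1, 2), (2, 1), (3, 3)]

# The value with the highest count is the mode.
most_frequent = max(table, key=lambda pair: pair[1])[0]
print(most_frequent)  # 3
```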

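The calculator accepts data in the following formats.

| Format | Example input |
| --- | --- |
| New lines (one value per line) | `54`, `65`, `47`, `59`, `40`, `53`, each on its own line |
| Commas | `54,` `65,` `47,` `59,` `40,` `53,` each on its own line, or `42, 54, 65, 47, 59, 40, 53` |
| Spaces | `65 47`, `59 40`, `53` on separate lines, or `42 54 65 47 59 40 53` |
| Mixed delimiters | `54 65,,, 47,,59,` followed on the next line by `40 53` |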
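Input that mixes commas, spaces, and new lines could be parsed along the following lines. This is a sketch of one possible approach, not the calculator's actual parser: it assumes any run of commas and whitespace counts as a single separator and that every remaining token is a valid number. The function name `parse_values` is hypothetical.

```python
import re


def parse_values(text):
    """Split text on runs of commas and whitespace and convert to floats."""
    tokens = re.split(r"[,\s]+", text.strip())
    # Leading or trailing commas leave empty tokens, which are skipped.
    return [float(token) for token in tokens if token]


print(parse_values("54 65,,, 47,,59,\n40 53"))
# [54.0, 65.0, 47.0, 59.0, 40.0, 53.0]
```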

### References

[1] Wikipedia contributors. "Quartile." Wikipedia, The Free Encyclopedia. Last visited 28 May, 2020.

[2] Weisstein, Eric W. "Mean Deviation." From MathWorld--A Wolfram Web Resource. Last visited 28 May, 2020.

[3] Information Technology Lab, National Institute of Standards and Technology. Section 1.3.5.11 Measures of Skewness and Kurtosis. From the Engineering Statistics Handbook. Last visited 28 May, 2020.