When fitting a linear model we make two assumptions about the distribution of residuals:

What are empirical cumulative distribution functions and what can we do with them? To answer the first question, let’s first step back and make sure we understand "distributions", or more specifically, "probability distributions".

A Basic Probability Distribution

Imagine a simple event, say flipping a coin 3 times. Here are all the possible outcomes, where H = head and T = tails:

You ran a linear regression analysis and the stats software spit out a bunch of numbers. The results were significant (or not). You might think that you’re done with analysis. No, not yet. After running a regression analysis, you should check if the model works well for the data.

The QQ plot, or quantile-quantile plot, is a graphical tool to help us assess if a set of data plausibly came from some theoretical distribution such as a normal or exponential. For example, if we run a statistical analysis that assumes our residuals are normally distributed, we can use a normal QQ plot to check that assumption. It's just a visual check, not an air-tight proof, so it is somewhat subjective.