Regression analysis is a fundamental tool in machine learning (ML): it helps establish relationships among variables by estimating how one variable affects another.
The coefficient of determination, R2 (pronounced “R squared”), is a measure of how well the regression line suggested by a numerical model approximates the actual data (often referred to as “goodness of fit”).
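As a quick refresher, R2 compares a model's squared residuals against the total variance of the observations. Here is a minimal sketch of that calculation, using made-up numbers and scikit-learn's r2_score (the playground notebook linked below may compute it differently):

```python
import numpy as np
from sklearn.metrics import r2_score

# Observed values and a model's predictions (toy numbers, for illustration only)
y_obs = np.array([2.0, 3.5, 4.1, 5.0, 6.8])
y_pred = np.array([2.2, 3.1, 4.4, 4.9, 6.5])

# R2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_obs - y_pred) ** 2)
ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(r2_manual)                # manual calculation
print(r2_score(y_obs, y_pred))  # same value from scikit-learn
```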
Quick aside: Here are a couple of datasets to ponder while reading through this blog post: Anscombe’s Quartet and Datasaurus Dozen.
R2 is often one of the initial metrics introduced in predictive regression analysis, and while it is commonly reported, I've found it to be less suitable for some ML applications in Earth Systems Science (ESS), for the following reasons:
R2 is best suited for Gaussian distributions
While you can calculate R2 for nonlinear models, it is less appropriate for variables with non-Gaussian distributions: because R2 is built from sums of squared deviations, skewed or heavy-tailed variables (which are common in ESS) can yield scores that mostly reflect a handful of extreme values.
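A small synthetic illustration of that point (assuming numpy and scikit-learn; the skewed, precipitation-like target is invented for the example): with a heavy-tailed target, the squared-deviation sums inside R2 are dominated by a few extreme values, so the score mostly tells you how those few points were fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Hypothetical skewed target: lognormal-ish rather than Gaussian
x = rng.uniform(0, 1, size=500).reshape(-1, 1)
y = np.exp(2.0 * x.ravel() + rng.normal(scale=0.8, size=500))  # heavy right tail

model = LinearRegression().fit(x, y)
y_hat = model.predict(x)
r2_all = r2_score(y, y_hat)

# Drop the few largest observations and recompute: the score often shifts
# substantially, because the squared-error sums are dominated by the tail.
keep = y < np.quantile(y, 0.98)
r2_trimmed = r2_score(y[keep], y_hat[keep])

print(round(r2_all, 3), round(r2_trimmed, 3))
```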
R2 without slope does not tell the entire story
The R2 value provides information about the proportion of variance explained, but it does not provide insights into the direction or strength of the relationships between variables.
It is also crucial to consider the slope of the regression line: a high R2 paired with a small or statistically insignificant slope may indicate a weak relationship or a lack of practical significance.
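A quick synthetic example of that last point (numpy and scikit-learn again; the 1e-4 slope is invented purely for illustration): the predictor explains nearly all of the variance, so R2 is close to 1, yet the effect size is so small it may not matter in practice.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)

# The predictor explains almost all of the variance, but the effect size
# is tiny (slope ~1e-4), so the relationship may have no practical
# significance even though R2 is near 1.
x = rng.uniform(0, 100, size=200).reshape(-1, 1)
y = 1e-4 * x.ravel() + rng.normal(scale=1e-4, size=200)

model = LinearRegression().fit(x, y)
print("slope:", model.coef_[0])                  # ~0.0001
print("R2:   ", r2_score(y, model.predict(x)))   # high, despite the tiny slope
```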
R2 is sensitive to outliers
Extreme values can disproportionately influence the R2 value: a single outlier can pull the regression line toward itself and, consequently, change the proportion of variance the model appears to explain.
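Here is a sketch of how strongly a single point can move the score (synthetic data, numpy and scikit-learn assumed): fifty essentially uncorrelated points give an R2 near zero, but appending one extreme (x, y) pair lets that point dominate both the fit and the variance sums.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(7)

# Essentially uncorrelated data: R2 should be near zero.
x = rng.normal(size=50)
y = rng.normal(size=50)

def fit_r2(x, y):
    model = LinearRegression().fit(x.reshape(-1, 1), y)
    return r2_score(y, model.predict(x.reshape(-1, 1)))

print("without outlier:", round(fit_r2(x, y), 3))

# Add a single extreme point: the regression line chases it, and R2 can
# jump to a large value even though the bulk of the data is pure noise.
x_out = np.append(x, 20.0)
y_out = np.append(y, 20.0)
print("with outlier:   ", round(fit_r2(x_out, y_out), 3))
```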
While R2 can be useful for normally distributed prediction problems in ESS, especially for data exploration or quick feature selection workflows, I recommend using additional prediction metrics (particularly Mean Absolute Error) for day-to-day ML work to ensure a more robust and accurate assessment of ESS ML model performance. Plotting your data is always a necessary step no matter what metric you use!
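As a small illustration of why MAE is a useful companion metric (synthetic, temperature-like numbers; not taken from the playground notebook): MAE is reported in the target's own units and responds roughly linearly to a single bad prediction, whereas the squared-error terms inside R2 amplify it.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(3)

# A reasonable model: small, roughly uniform errors on a temperature-like target.
y_obs = rng.uniform(10, 30, size=100)
y_pred = y_obs + rng.normal(scale=1.0, size=100)

print("clean:    R2 =", round(r2_score(y_obs, y_pred), 3),
      " MAE =", round(mean_absolute_error(y_obs, y_pred), 3))

# The same predictions with one badly missed case: the squared-error terms
# in R2 magnify it, while MAE (in the target's units) shifts only modestly.
y_bad = y_pred.copy()
y_bad[0] = y_obs[0] + 30.0
print("one miss: R2 =", round(r2_score(y_obs, y_bad), 3),
      " MAE =", round(mean_absolute_error(y_obs, y_bad), 3))
```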
By way of illustration, I've put together a short Jupyter notebook working through some basic examples of where R2 might fall short: R2 Playground.
Further Reading
If you're interested in learning more about the possible pitfalls of R2, try these:
- Coefficient of Determination Wikipedia page
- Is R-squared Useless?
- R-squared Is Not Valid for Nonlinear Regression
- Root-mean-square error (RMSE) or mean absolute error (MAE): when to use them or not
- Lecture 10: F-Tests, R2, and Other Distractions
- An evaluation of R2 as an inadequate measure for nonlinear models in pharmacological and biochemical research: a Monte Carlo approach
- Avoid R-squared to judge regression model performance
Thomas Martin is an AI/ML Software Engineer at the Unidata Program Center. Have questions? Contact support-ml@unidata.ucar.edu or book an office hours meeting with Thomas on his Calendar.