Toolcroft

Linear Regression Calculator - Slope, Intercept & R²

Compute least-squares linear regression for a set of (x, y) data points. Get the slope, intercept, R² coefficient of determination, Pearson correlation, and a scatter plot with the best-fit line.

Slope (m)

1.990000

Intercept (b)

0.050000

R²

0.997305

Pearson r

0.998652

Least-Squares Method

The calculator uses the ordinary least-squares (OLS) method to find the line ŷ = mx + b that minimises the sum of squared residuals. It returns the slope m, y-intercept b, R² goodness-of-fit, and the Pearson correlation coefficient r.
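The closed-form least-squares formulas behind such a fit can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `linear_regression` and its return order are assumptions made for this example.

```python
from math import sqrt


def linear_regression(xs, ys):
    """Simple OLS fit. Returns (slope, intercept, r_squared, pearson_r).

    Hypothetical helper for illustration; not the calculator's own code.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    if syy == 0:
        raise ValueError("all y values are identical; r is undefined")

    slope = sxy / sxx                    # m = Sxy / Sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    r = sxy / sqrt(sxx * syy)            # Pearson correlation
    return slope, intercept, r * r, r    # for simple OLS, R² equals r²


# Points lying exactly on y = 2x + 1.
m, b, r2, r = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b, r2, r)
```

With a single predictor, R² is the square of Pearson's r, which is why the function derives one from the other instead of computing both separately.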

Interpreting R²

R² range     Fit quality   Interpretation
> 0.9        Strong        The line explains more than 90% of the variance in the data.
0.7 – 0.9    Good          Solid predictive power for many applied contexts.
0.5 – 0.7    Moderate      The model captures a meaningful trend but other variables matter.
< 0.5        Weak          The linear model explains little of the observed variation.

Important: R² measures how well the line fits the data, not whether X causes Y. A high R² does not imply causation; both variables might be driven by a third, confounding factor.

Residuals

A residual is the difference between an observed value and the model's prediction: e = y − ŷ. OLS minimises the sum of squared residuals. Examining a residual plot (residuals vs. fitted values) reveals model problems:

  • Non-random patterns: suggest non-linearity - a higher-order or different model may fit better.
  • Fan shape (heteroscedasticity): variance increases with fitted values - a log transformation of y often helps.
  • Outliers: individual points with large residuals may unduly influence the slope estimate.
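To make the residual definition concrete, the sketch below computes fitted values, residuals, and the sum of squared residuals for two candidate lines. The four data points are made up for illustration, and the helper names `residuals` and `sum_squared_residuals` are hypothetical. The line y = 1.94x + 0.15 is the least-squares line for these particular points, so its sum of squared residuals should be the smaller of the two.

```python
def residuals(xs, ys, slope, intercept):
    """Return a list of (fitted, residual) pairs, with residual e = y - y_hat."""
    pairs = []
    for x, y in zip(xs, ys):
        fitted = slope * x + intercept
        pairs.append((fitted, y - fitted))
    return pairs


def sum_squared_residuals(xs, ys, slope, intercept):
    """The quantity that OLS minimises over all choices of slope and intercept."""
    return sum(e ** 2 for _, e in residuals(xs, ys, slope, intercept))


# Illustrative data only (not taken from the calculator).
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

ssr_ols = sum_squared_residuals(xs, ys, 1.94, 0.15)   # least-squares line
ssr_other = sum_squared_residuals(xs, ys, 2.0, 0.0)   # a nearby alternative

# Listing fitted values next to residuals is the tabular form of a residual plot.
for fitted, e in residuals(xs, ys, 1.94, 0.15):
    print(f"fitted={fitted:.2f}  residual={e:+.2f}")
print(f"SSR (OLS line) = {ssr_ols:.3f}, SSR (y = 2x) = {ssr_other:.3f}")
```

Plotting the `residual` values against the `fitted` values gives the residual plot described above; with only four points no pattern can be judged, so the data here only demonstrates the mechanics.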

OLS assumptions

  • Linearity: the true relationship between X and Y is linear.
  • Independence: observations are independent of each other (violated by time-series data without correction).
  • Homoscedasticity: the variance of residuals is constant across all values of X.
  • Normality of residuals: residuals are approximately normally distributed (required for valid hypothesis tests and confidence intervals, not for the regression itself).

Worked example

Hours studied (x)   Exam score (y)
1                   50
2                   58
3                   65
4                   73
5                   80

For this dataset: slope m = 7.5, intercept b ≈ 42.7, giving the line ŷ = 7.5x + 42.7. R² ≈ 0.999, a near-perfect linear fit. The slope says each additional hour of study is associated with about 7.5 more points on the exam.
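The figures for this dataset can be re-derived from the table with plain arithmetic. The sketch below is one way to do it, using only the five (hours, score) pairs listed above; the variable names are chosen for this example. It prints the slope, intercept, and R² to four decimals so they can be checked against the rounded values in the text.

```python
hours = [1, 2, 3, 4, 5]
scores = [50, 58, 65, 73, 80]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R² = 1 - SS_res / SS_tot, where SS_tot is syy.
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(hours, scores))
r_squared = 1 - ss_res / syy

print(f"slope={slope:.4f}  intercept={intercept:.4f}  R^2={r_squared:.4f}")
```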