StatsCalculators.com

Simple Linear Regression

Created: November 15, 2024
Last Updated: March 30, 2025

This Simple Linear Regression Calculator helps you analyze the relationship between two variables. It provides comprehensive analysis including model summary statistics, coefficient estimates, confidence intervals, and diagnostic tests. The calculator also generates a regression plot with the fitted line and confidence bands. To learn about the data format required and test this calculator, click here to populate the sample data.


Learn More

Simple Linear Regression

Definition

Simple Linear Regression models the relationship between a predictor variable (X) and a response variable (Y) using a linear equation. It finds the line that minimizes the sum of squared residuals.

Key Formulas

Regression Line:

\hat{Y} = b_0 + b_1 X

Slope:

b_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}

Intercept:

b_0 = \bar{y} - b_1\bar{x}

R-squared:

R^2 = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}
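
To make the formulas concrete, here is a minimal Python sketch (assuming NumPy is installed) that computes the slope, intercept, and R-squared directly from these definitions, using the same five data points as the worked example below:

import numpy as np

# Sample data (the same five points used in the worked example below)
x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.8, 6.2, 7.8, 9.3])

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of cross-deviations divided by sum of squared x-deviations
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# Intercept: forces the fitted line through (x_bar, y_bar)
b0 = y_bar - b1 * x_bar

# R-squared: 1 minus residual sum of squares over total sum of squares
y_hat = b0 + b1 * x
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y_bar) ** 2)

print(b0, b1, r_squared)  # 0.32, 1.84, ~0.993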

Key Assumptions

Linearity: Relationship between X and Y is linear
Independence: Observations are independent
Homoscedasticity: Constant variance of residuals
Normality: Residuals are normally distributed
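
These assumptions are usually checked with residual diagnostics. The sketch below shows one possible set of checks in Python, assuming a statsmodels fit like the one in the Code Examples section and that SciPy is available; shapiro and durbin_watson are standard functions in SciPy and statsmodels respectively:

import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.stattools import durbin_watson

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.8, 6.2, 7.8, 9.3])
model = sm.OLS(y, sm.add_constant(x)).fit()

residuals = model.resid
fitted = model.fittedvalues

# Linearity & homoscedasticity: plot residuals against fitted values and look
# for curvature (non-linearity) or a funnel shape (non-constant variance)

# Normality: Shapiro-Wilk test on the residuals
# (a large p-value gives no evidence against normality)
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Independence: Durbin-Watson statistic (values near 2 suggest uncorrelated residuals)
dw = durbin_watson(residuals)

print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Durbin-Watson = {dw:.2f}")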

Practical Example

Step 1: Data
X      Y       (X - X̄)   (Y - Ȳ)   (X - X̄)²   (X - X̄)(Y - Ȳ)
1      2.1     -2          -3.74      4            7.48
2      3.8     -1          -2.04      1            2.04
3      6.2      0           0.36      0            0.00
4      7.8      1           1.96      1            1.96
5      9.3      2           3.46      4            6.92
Σ=15   Σ=29.2   Σ=0         Σ=0       Σ=10         Σ=18.4

Means: \bar{X} = 3, \bar{Y} = 5.84

Step 2: Calculate Slope (b_1)
b_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2} = \frac{18.4}{10} = 1.84
Step 3: Calculate Intercept (b_0)
b_0 = \bar{y} - b_1\bar{x} = 5.84 - 1.84(3) = 0.32
Step 4: Regression Equation
\hat{Y} = 0.32 + 1.84X
Step 5: Calculate R^2

R^2 = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2} = 1 - \frac{0.236}{34.092} \approx 0.993 (about 99.3% of the variation in Y is explained by X)
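
As a quick cross-check of the hand calculation, a short sketch using NumPy's built-in least-squares routines (one of several ways to verify the result) reproduces the same coefficients:

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.8, 6.2, 7.8, 9.3])

# Degree-1 polynomial fit returns [slope, intercept]
b1, b0 = np.polyfit(x, y, 1)

# For simple linear regression, R² equals the squared correlation between x and y
r_squared = np.corrcoef(x, y)[0, 1] ** 2

print(b1, b0, r_squared)  # 1.84, 0.32, ~0.993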

Code Examples

R
library(tidyverse)

# Sample data
data <- tibble(x = c(1, 2, 3, 4, 5),
               y = c(2.1, 3.8, 6.2, 7.8, 9.3))

# Fit the simple linear regression y = b0 + b1 * x
model <- lm(y ~ x, data = data)

# Coefficients, standard errors, t-tests, and R-squared
summary(model)

# Scatter plot with the fitted least-squares line (se = TRUE would add a confidence band)
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  theme_minimal()

# Standard diagnostic plots: residuals vs fitted, Q-Q, scale-location, leverage
par(mfrow = c(2, 2))
plot(model)
Python
import numpy as np
import statsmodels.api as sm

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.8, 6.2, 7.8, 9.3])

# Add an intercept column and fit ordinary least squares
X = sm.add_constant(x)
model = sm.OLS(y, X).fit()

# Coefficients, standard errors, t-tests, R-squared, etc.
print(model.summary())
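
The coefficient confidence intervals and confidence bands reported by the calculator can also be reproduced with statsmodels; here is a minimal sketch continuing from the fitted model above (the new x values are arbitrary illustrations):

# 95% confidence intervals for the intercept and slope
print(model.conf_int(alpha=0.05))

# Confidence band for the mean response at new x values
x_new = sm.add_constant(np.array([1.5, 2.5, 3.5]))
pred = model.get_prediction(x_new)
print(pred.summary_frame(alpha=0.05)[["mean", "mean_ci_lower", "mean_ci_upper"]])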
