This Paired T-Test Calculator helps you compare two related groups or repeated measurements to determine whether the mean difference between them is statistically significant. For example, you could compare weights before and after a weight loss program, or test scores before and after an intervention. The calculator reports descriptive statistics, runs the hypothesis test, and automatically checks the normality assumption. It also generates publication-ready APA format reports.
Learn More
Paired T-Test
Definition
The paired t-test is a statistical test that compares two related (dependent) samples to determine whether there is a significant difference between their means. It is particularly useful when measurements are taken from the same subject before and after a treatment, or when subjects are matched in pairs.
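Because the test works on within-pair differences, it is equivalent to a one-sample t-test of those differences against zero. The sketch below illustrates this with SciPy; the measurement values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical before/after measurements on the same five subjects
before = np.array([12.1, 14.3, 11.8, 15.2, 13.4])
after = np.array([11.4, 13.9, 11.1, 14.0, 12.9])

# Paired t-test on the two related samples
paired = stats.ttest_rel(after, before)

# One-sample t-test of the within-pair differences against zero
one_sample = stats.ttest_1samp(after - before, popmean=0.0)

print(f"paired:     t = {paired.statistic:.4f}, p = {paired.pvalue:.4f}")
print(f"one-sample: t = {one_sample.statistic:.4f}, p = {one_sample.pvalue:.4f}")
```

Both calls should report the same statistic and p-value, which is why only the differences need to satisfy the test's assumptions.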
Formula
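Test Statistic: $t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$
Degrees of freedom: $df = n - 1$
Confidence Intervals:
Two-sided confidence interval: $\bar{d} \pm t_{\alpha/2,\, n-1} \cdot \dfrac{s_d}{\sqrt{n}}$
One-sided confidence intervals: $\left(\bar{d} - t_{\alpha,\, n-1} \cdot \dfrac{s_d}{\sqrt{n}},\ \infty\right)$ (lower bound) or $\left(-\infty,\ \bar{d} + t_{\alpha,\, n-1} \cdot \dfrac{s_d}{\sqrt{n}}\right)$ (upper bound)
Where:
- $\bar{d}$ = mean difference between paired observations
- $s_d$ = standard deviation of the differences
- $n$ = number of pairs
- $t_{\alpha,\, n-1}$ = upper-$\alpha$ critical value of the t distribution with $n - 1$ degrees of freedom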
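As a rough illustration, the test statistic, degrees of freedom, and both kinds of confidence interval can be computed directly from the mean and standard deviation of the differences. The function name `paired_t_summary` and the sample values are hypothetical; only `scipy.stats.t.ppf` is used from the library, to look up critical values.

```python
import math
from scipy import stats

def paired_t_summary(before, after, alpha=0.05):
    """Hand computation of t, df, and confidence limits (illustrative helper)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = sum(diffs) / n                                    # mean difference
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    se = s_d / math.sqrt(n)                                   # standard error
    df = n - 1
    t_two = stats.t.ppf(1 - alpha / 2, df)                    # two-sided critical value
    t_one = stats.t.ppf(1 - alpha, df)                        # one-sided critical value
    return {
        "t": d_bar / se,
        "df": df,
        "ci_two_sided": (d_bar - t_two * se, d_bar + t_two * se),
        "one_sided_lower_bound": d_bar - t_one * se,
        "one_sided_upper_bound": d_bar + t_one * se,
    }

# Hypothetical data, for illustration only
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 15.0, 10.0, 14.0]
summary = paired_t_summary(before, after)
print(summary)
```

The one-sided bounds use a smaller critical value than the two-sided interval, so each sits closer to the mean difference than the corresponding two-sided limit.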
Key Assumptions
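One assumption that can be checked numerically is that the paired differences are approximately normally distributed. The sketch below applies the Shapiro-Wilk test from SciPy to simulated differences; the simulated data and the 0.05 cutoff are assumptions for illustration, and this is not necessarily the check the calculator itself runs.

```python
import numpy as np
from scipy import stats

# Simulated paired differences (illustrative only)
rng = np.random.default_rng(0)
differences = rng.normal(loc=-5.0, scale=5.0, size=30)

# Shapiro-Wilk test: the null hypothesis is that the data are normal
w_stat, p_normal = stats.shapiro(differences)

alpha = 0.05  # assumed cutoff
if p_normal > alpha:
    print(f"W = {w_stat:.3f}, p = {p_normal:.3f}: no evidence against normality")
else:
    print(f"W = {w_stat:.3f}, p = {p_normal:.3f}: normality is doubtful")
```

A non-significant result does not prove normality, especially with few pairs, so a histogram or Q-Q plot of the differences is a useful companion check.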
Practical Example
Testing the effectiveness of a weight loss program by measuring participants' weights before and after the program:
Given Data:
- Before weights (kg): 70, 75, 80, 85, 90
- After weights (kg): 68, 72, 77, 82, 87
- Differences (After - Before): -2, -3, -3, -3, -3
- $\alpha = 0.05$ (two-tailed test)
Hypotheses:
Null Hypothesis ($H_0$): $\mu_d = 0$ (no difference between before and after)
Alternative Hypothesis ($H_1$): $\mu_d \neq 0$ (there is a difference)
Step-by-Step Calculation:
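- Calculate mean difference: $\bar{d} = \dfrac{(-2) + (-3) + (-3) + (-3) + (-3)}{5} = -2.8$ kg
- Calculate standard deviation of differences: $s_d = \sqrt{\dfrac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{0.8}{4}} \approx 0.447$ kg
- Degrees of freedom: $df = n - 1 = 4$
- Calculate t-statistic: $t = \dfrac{-2.8}{0.447 / \sqrt{5}} = \dfrac{-2.8}{0.2} = -14.00$
- Critical value: $t_{0.025,\, 4} = 2.776$
- Confidence interval: $-2.8 \pm 2.776 \times 0.2 = (-3.36,\ -2.24)$
Conclusion:
Since $|t| = 14.00 > 2.776$, we reject the null hypothesis. There is sufficient evidence to conclude that the weight loss program resulted in a significant change in participants' weights ($p < 0.001$). We are 95% confident that the true mean difference lies between -3.36 and -2.24 kg.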
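The worked example can be checked numerically. This sketch recomputes each step from the Given Data with NumPy and SciPy; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Weights from the practical example (kg)
before = np.array([70, 75, 80, 85, 90], dtype=float)
after = np.array([68, 72, 77, 82, 87], dtype=float)
diffs = after - before

n = diffs.size
mean_diff = diffs.mean()
sd_diff = diffs.std(ddof=1)          # sample standard deviation
se = sd_diff / np.sqrt(n)            # standard error of the mean difference
t_stat = mean_diff / se
df = n - 1

t_crit = stats.t.ppf(0.975, df)      # two-tailed critical value, alpha = 0.05
ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"mean difference = {mean_diff:.2f}, sd = {sd_diff:.3f}")
print(f"t({df}) = {t_stat:.2f}, critical value = {t_crit:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {p_value:.5f}")
```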
Effect Size
Cohen's d for paired samples: $d = \dfrac{\bar{d}}{s_d}$
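Interpretation guidelines (Cohen's conventional benchmarks):
- Small effect: $|d| \approx 0.2$
- Medium effect: $|d| \approx 0.5$
- Large effect: $|d| \approx 0.8$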
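A minimal sketch of the effect size calculation, using the mean of the differences divided by their standard deviation. The helper names are hypothetical, and the labels follow Cohen's conventional 0.2 / 0.5 / 0.8 benchmarks; treating values below 0.2 as "negligible" is an assumption of this sketch.

```python
import statistics

def cohens_d_paired(differences):
    """Mean of the paired differences divided by their sample standard deviation."""
    return statistics.mean(differences) / statistics.stdev(differences)

def describe_effect(d):
    """Rough label based on Cohen's conventional benchmarks (assumed cutoffs)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# Differences (After - Before) from the weight loss example
differences = [-2, -3, -3, -3, -3]
d = cohens_d_paired(differences)
print(f"d = {d:.2f} ({describe_effect(d)})")
```

The very large magnitude here reflects how consistent the five differences are, not only how big the average change is.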
Power Analysis
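Required sample size (n) for desired power (1-β), using the normal approximation for a two-sided test: $n = \left(\dfrac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma_d}{\delta}\right)^2$
Where:
- $\alpha$ = significance level
- $\beta$ = probability of Type II error
- $\sigma_d$ = standard deviation of differences
- $\delta$ = minimum detectable difference
- $z_q$ = the $q$-th quantile of the standard normal distribution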
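The sketch below assumes the common normal-approximation formula for a two-sided test, $n = \left((z_{1-\alpha/2} + z_{1-\beta})\,\sigma_d / \delta\right)^2$, rounded up to a whole number of pairs. The function name `required_pairs` and the example inputs are hypothetical.

```python
import math
from scipy import stats

def required_pairs(sigma_d, delta, alpha=0.05, power=0.80):
    """Approximate number of pairs for a two-sided paired t-test.

    sigma_d: standard deviation of the differences
    delta:   minimum detectable mean difference
    """
    z_alpha = stats.norm.ppf(1 - alpha / 2)   # quantile for the significance level
    z_beta = stats.norm.ppf(power)            # quantile for power = 1 - beta
    return math.ceil(((z_alpha + z_beta) * sigma_d / delta) ** 2)

# Illustrative inputs: SD of differences 5 units, detect a 2-unit change
n_pairs = required_pairs(sigma_d=5.0, delta=2.0)
print(f"Pairs needed: {n_pairs}")
```

Because it uses normal rather than t quantiles, this approximation tends to come out slightly low for small samples, so it is best treated as a starting point.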
Decision Rules
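Reject $H_0$ if:
- Two-sided test: $|t| > t_{\alpha/2,\, n-1}$
- Left-tailed test: $t < -t_{\alpha,\, n-1}$
- Right-tailed test: $t > t_{\alpha,\, n-1}$
- Or if $p\text{-value} < \alpha$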
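These rules compare the t-statistic with a critical value from the t distribution with n - 1 degrees of freedom. A minimal sketch follows; the function name `reject_null` and the `alternative` labels are illustrative choices.

```python
from scipy import stats

def reject_null(t_stat, n, alpha=0.05, alternative="two-sided"):
    """Return True if the critical-value rule rejects the null hypothesis."""
    df = n - 1
    if alternative == "two-sided":
        return bool(abs(t_stat) > stats.t.ppf(1 - alpha / 2, df))
    if alternative == "less":        # left-tailed test
        return bool(t_stat < -stats.t.ppf(1 - alpha, df))
    if alternative == "greater":     # right-tailed test
        return bool(t_stat > stats.t.ppf(1 - alpha, df))
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

# t = -14 with n = 5 pairs, as in the weight loss example
for alt in ("two-sided", "less", "greater"):
    print(alt, reject_null(-14.0, n=5, alternative=alt))
```

The p-value rule gives the same decision as the critical-value rule for the same alternative, so either can be reported.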
Reporting Results
Standard format for scientific reporting:
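A sketch of one common APA-style layout, t(df) = value, p = value, d = value, with the leading zero dropped from the p-value. The function `apa_report` is hypothetical and may not match the calculator's exact template.

```python
def apa_report(t_stat, df, p_value, d):
    """Format paired t-test results in a common APA-style string."""
    if p_value < 0.001:
        p_text = "p < .001"
    else:
        # APA style omits the leading zero for values that cannot exceed 1
        p_text = "p = " + f"{p_value:.3f}".lstrip("0")
    return f"t({df}) = {t_stat:.2f}, {p_text}, d = {d:.2f}"

# Values from the weight loss example
report = apa_report(t_stat=-14.0, df=4, p_value=0.00015, d=-6.26)
print(report)
```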
Code Examples
# R example
library(tidyverse)

# Simulate paired data (fixed seed for reproducibility)
set.seed(42)
n <- 30
baseline <- rnorm(n, mean = 100, sd = 15)
followup <- baseline + rnorm(n, mean = -5, sd = 5) # Average decrease of 5 units

# Create data frame
data <- tibble(
  subject = 1:n,
  baseline = baseline,
  followup = followup,
  difference = followup - baseline
)

# Basic summary
summary_stats <- data %>%
  summarise(
    mean_diff = mean(difference),
    sd_diff = sd(difference),
    n = n()
  )
print(summary_stats)

# Paired t-test
t_test_result <- t.test(data$followup, data$baseline, paired = TRUE)
print(t_test_result)

# Effect size (Cohen's d for paired samples)
cohens_d <- mean(data$difference) / sd(data$difference)
print(cohens_d)

# Visualization: points below the dashed identity line decreased at follow-up
ggplot(data) +
  geom_point(aes(x = baseline, y = followup)) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  theme_minimal() +
  labs(title = "Baseline vs Follow-up Measurements",
       subtitle = paste("Mean difference:", round(mean(data$difference), 2)))

# Python example
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.stats.power import TTestPower
# Generate example data
np.random.seed(42)
n = 30
baseline = np.random.normal(100, 15, n)
followup = baseline + np.random.normal(-5, 5, n)
differences = followup - baseline
# Basic statistics
mean_diff = np.mean(differences)
sd_diff = np.std(differences, ddof=1)
se_diff = sd_diff / np.sqrt(n)
# Paired t-test
t_stat, p_value = stats.ttest_rel(followup, baseline)
# Effect size
cohens_d = mean_diff / sd_diff
# Power analysis (achieved power at the observed effect size)
analysis = TTestPower()
power = analysis.power(effect_size=cohens_d,
                       nobs=n,
                       alpha=0.05)
# Visualization
plt.figure(figsize=(12, 5))
# Scatterplot
plt.subplot(1, 2, 1)
plt.scatter(baseline, followup)
min_val = min(baseline.min(), followup.min())
max_val = max(baseline.max(), followup.max())
plt.plot([min_val, max_val], [min_val, max_val], '--', color='red')
plt.xlabel('Baseline')
plt.ylabel('Follow-up')
plt.title('Baseline vs Follow-up')
# Differences histogram
plt.subplot(1, 2, 2)
sns.histplot(differences, kde=True)
plt.axvline(mean_diff, color='red', linestyle='--')
plt.xlabel('Differences (Follow-up - Baseline)')
plt.title('Distribution of Differences')
plt.tight_layout()
plt.show()
print(f"Mean difference: {mean_diff:.2f}")
print(f"Standard deviation of differences: {sd_diff:.2f}")
print(f"t-statistic: {t_stat:.2f}")
print(f"p-value: {p_value:.4f}")
print(f"Cohen's d: {cohens_d:.2f}")
print(f"Statistical Power: {power:.4f}")
Alternative Tests
Consider these alternatives when assumptions are violated:
- Wilcoxon Signed-Rank Test: When normality of differences is violated or data is ordinal
- Independent t-test: When samples are independent rather than paired
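As a sketch of the first alternative, SciPy's `scipy.stats.wilcoxon` runs the Wilcoxon signed-rank test on paired differences. The difference values below are hypothetical and were chosen to avoid zeros and tied magnitudes.

```python
import numpy as np
from scipy import stats

# Hypothetical paired differences (after - before), no zeros or tied magnitudes
differences = np.array([-1.2, -0.4, -2.5, 0.3, -1.8, -0.9, -3.1, -0.6])

# Two-sided Wilcoxon signed-rank test of symmetry about zero
result = stats.wilcoxon(differences)

print(f"W = {result.statistic}, p = {result.pvalue:.4f}")
```

The test uses the ranks of the absolute differences rather than their values, so it does not require the differences to be normally distributed.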