StatsCalculators.com

Correlation Coefficient

This Correlation Coefficient Calculator helps you measure the strength and direction of the linear relationship between two variables. It calculates the Pearson correlation coefficient, which ranges from -1 to +1: -1 indicates a perfect negative correlation, 0 indicates no linear correlation, and +1 indicates a perfect positive correlation. The calculator also provides a visual representation (a scatter plot with regression line) of the relationship between the variables.


Understanding Correlation Coefficient

Definition

The correlation coefficient measures the strength and direction of the linear relationship between two variables. It ranges from -1 to +1, where -1 indicates a perfect negative correlation, +1 indicates a perfect positive correlation, and 0 indicates no linear correlation.

Formula

Pearson Sample Correlation Coefficient:

r = \frac{\text{Cov}(X,Y)}{s_X s_Y}

Where:

  • \text{Cov}(X,Y) = sample covariance of X and Y
  • s_X = sample standard deviation of X
  • s_Y = sample standard deviation of Y
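
As a rough illustration of the formula above, the following sketch computes r from its parts and compares it with R's built-in cor(). The helper name pearson_r and the data vectors are made up for this example, and the n - 1 denominator assumes the sample (not population) versions of the covariance and standard deviations.

```r
# Hypothetical helper: Pearson's r computed step by step from the formula.
pearson_r <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  n  <- length(x)
  dx <- x - mean(x)                    # deviations of X from its mean
  dy <- y - mean(y)                    # deviations of Y from its mean
  cov_xy <- sum(dx * dy) / (n - 1)     # sample covariance
  s_x <- sqrt(sum(dx^2) / (n - 1))     # sample standard deviation of X
  s_y <- sqrt(sum(dy^2) / (n - 1))     # sample standard deviation of Y
  cov_xy / (s_x * s_y)
}

# Made-up illustrative data (assumption, not taken from the calculator)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

r_manual  <- pearson_r(x, y)
r_builtin <- cor(x, y)                 # Pearson is the default method

print(c(manual = r_manual, builtin = r_builtin))
```

The two values should agree up to floating-point rounding, since the (n - 1) factors cancel in the ratio.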

Interpretation Guidelines

+1: Perfect positive correlation
-1: Perfect negative correlation
0: No linear correlation

Important Considerations

  • Correlation does not imply causation
  • Only measures linear relationships
  • Sensitive to outliers and extreme values
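
The last two points can be sketched with a small R example. The data below are invented purely for illustration: the first pair has an exact but nonlinear (quadratic) relationship, and the second shows how a single extreme value can change r.

```r
# 1) Only linear relationships are measured:
#    y depends exactly on x, but the relationship is quadratic, not linear.
x <- -5:5
y <- x^2
r_nonlinear <- cor(x, y)          # effectively zero despite the exact dependence

# 2) Sensitivity to outliers:
#    start from a perfectly linear series, then replace one point with an extreme value.
x2        <- 1:10
y_clean   <- 1:10
y_outlier <- y_clean
y_outlier[10] <- -20              # a single made-up outlier

r_clean   <- cor(x2, y_clean)     # perfect positive correlation
r_outlier <- cor(x2, y_outlier)   # the one outlier turns r negative

print(c(nonlinear = r_nonlinear, clean = r_clean, outlier = r_outlier))
```

Plotting the data before relying on r is a simple way to catch both situations.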

Practical Example

Let's calculate the correlation coefficient between hours studied and exam scores for 5 students:

Student ID   Hours Studied (X)   Exam Score (Y)
1            2                   75
2            3                   80
3            4                   85
4            5                   90
5            6                   95

Correlation Coefficient Calculation

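Step 1: Calculate the sample standard deviations. The means are \bar{x} = 4 and \bar{y} = 85, and with n = 5 observations the denominator is n - 1 = 4.

For X (Hours Studied):

s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{10}{4}} = \sqrt{2.5} \approx 1.58

For Y (Exam Scores):

s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{250}{4}} = \sqrt{62.5} \approx 7.91

Step 2: Calculate the sample covariance:

\text{Cov}(X,Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{50}{4} = 12.5

Step 3: Use the covariance and standard deviations to calculate the correlation coefficient:

r = \frac{\text{Cov}(X,Y)}{s_x s_y} = \frac{12.5}{1.58 \times 7.91} \approx \frac{12.5}{12.5} = 1.0

With unrounded standard deviations the product s_x s_y is exactly \sqrt{2.5 \times 62.5} = 12.5, so r is exactly 1.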

Final Result: The correlation coefficient is 1.0, indicating a perfect positive linear relationship between hours studied and exam scores. This means:

  • The relationship is perfectly linear
  • Each additional hour of study corresponds to the same increase in exam score (5 points in this data)
  • All points fall exactly on a straight line
  • There is no scatter or deviation from the linear pattern
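
The worked example can be checked in R with a few lines. This is only a verification sketch using the five students from the table; the variable names are chosen here for readability.

```r
# Data from the example table: hours studied (X) and exam scores (Y)
hours <- c(2, 3, 4, 5, 6)
score <- c(75, 80, 85, 90, 95)

s_x    <- sd(hours)          # sample standard deviation of X, about 1.58
s_y    <- sd(score)          # sample standard deviation of Y, about 7.91
cov_xy <- cov(hours, score)  # sample covariance, 12.5

# Correlation from its components, and directly via cor()
r <- cov_xy / (s_x * s_y)

print(round(c(s_x = s_x, s_y = s_y, cov_xy = cov_xy, r = r), 2))
print(cor(hours, score))
```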

Visual Examples of Correlation

The following examples illustrate different types of correlations between variables. Each chart shows how the strength and direction of relationships can vary.

Perfect Positive Correlation

r = 1.0

Relationship: Perfect direct linear relationship

As X increases, Y increases proportionally with no variation.

Strong Positive Correlation

0.7 ≤ r < 1.0

Relationship: Strong direct linear relationship

As X increases, Y tends to increase with some variation.

Moderate Positive Correlation

0.3 ≤ r < 0.7

Relationship: Moderate direct linear relationship

As X increases, Y tends to increase with more variation.

No Correlation

r ≈ 0

Relationship: No linear relationship

No consistent pattern between X and Y values.

Moderate Negative Correlation

-0.7 < r ≤ -0.3

Relationship: Moderate inverse linear relationship

As X increases, Y tends to decrease with more variation.

Strong Negative Correlation

-1.0 < r ≤ -0.7

Relationship: Strong inverse linear relationship

As X increases, Y tends to decrease with some variation.

Key Takeaways

  • Perfect correlation (r = ±1) indicates an exact linear relationship
  • The sign indicates direction: positive (upward trend) or negative (downward trend)
  • Values closer to 0 indicate weaker linear relationships between variables
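
One way to turn the bands above into code is a small labeling function. The sketch below is an assumption-laden illustration: describe_correlation is a hypothetical helper, the values ±0.3 and ±0.7 are treated as belonging to the stronger band, and the "weak" label for 0 < |r| < 0.3 is added here because the examples above do not name that range. These cutoffs are rules of thumb, not fixed standards.

```r
# Hypothetical helper: map a correlation coefficient to a descriptive label.
describe_correlation <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, !is.na(r), abs(r) <= 1 + 1e-8)
  if (r == 0) {
    return("no linear correlation")
  }
  direction <- if (r > 0) "positive" else "negative"
  a <- abs(r)
  strength <- if (a >= 1 - 1e-8) {
    "perfect"
  } else if (a >= 0.7) {
    "strong"
  } else if (a >= 0.3) {
    "moderate"
  } else {
    "weak"   # assumption: label for the range not named in the examples
  }
  paste(strength, direction, "correlation")
}

# Example values (made up for illustration)
labels <- sapply(c(1, 0.85, 0.5, 0.1, -0.5, -0.85), describe_correlation)
print(labels)
```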

How to Calculate Pearson Correlation Coefficient in R

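Use the cor() function to compute the Pearson correlation between two numeric vectors (given a data frame of numeric columns, it returns a correlation matrix instead):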

R
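library(tidyverse)  # attaches ggplot2, which is used for the plot below

# Load the example "tips" dataset (requires an internet connection)
tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Pearson correlation between total bill and tip (Pearson is the default method)
cor(tips$total_bill, tips$tip, method = "pearson")

# Scatter plot with a fitted least-squares regression line
ggplot(tips, aes(x = total_bill, y = tip)) +
  geom_point(color = "steelblue") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(
    title = "Scatter Plot of Total Bill vs. Tip",
    x = "Total Bill",
    y = "Tip Amount"
  ) +
  theme_minimal()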
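
For more than two variables, cor() can also return a full correlation matrix. The sketch below uses R's built-in mtcars dataset rather than the tips data, purely so that it runs without a network connection; the choice of columns is arbitrary.

```r
# Select a few numeric columns from the built-in mtcars dataset
vars <- mtcars[, c("mpg", "wt", "hp")]

# Pairwise Pearson correlations between all selected columns
cor_matrix <- cor(vars, method = "pearson")

# The diagonal is 1 (each variable with itself) and the matrix is symmetric
print(round(cor_matrix, 2))
```

Each off-diagonal entry is the same value that cor() would return for that pair of columns on its own, for example cor(mtcars$mpg, mtcars$wt).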
