
Logistic Regression

Created: April 17, 2025

This Logistic Regression Calculator helps you analyze binary outcome data and make classifications or predictions. It fits data to the model P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p)}}, providing comprehensive analysis including model coefficients, odds ratios, and performance metrics. Logistic regression is widely used in many fields, including medicine (disease diagnosis), marketing (customer conversion), and finance (credit scoring). You can analyze both simple and multiple logistic regression models with one or more predictor variables. To learn about the required data format and test this calculator, click here to populate the sample data.

Calculator

1. Load Your Data

Note: Column names will be converted to snake_case (e.g., "Product ID" → "product_id") for processing.

2. Select Columns & Options

The dependent variable should contain only 0 and 1 values

Learn More

Logistic Regression

Definition

Logistic Regression is a statistical method used to model the probability of a binary outcome based on one or more predictor variables. Unlike linear regression, logistic regression models the log-odds of an event as a linear combination of predictors, which keeps the predicted probabilities between 0 and 1.

Key Formulas for One Predictor

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x

Odds Ratio:

\text{OR} = e^{\beta_1}

Decision Boundary (for classification):

x_{\text{cutoff}} = \frac{-\beta_0 - \log\left(\frac{1-c}{c}\right)}{\beta_1}

where c is the probability cutoff (typically 0.5)
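
To make these formulas concrete, here is a minimal Python sketch (NumPy only; the function names are illustrative and not part of this calculator) that evaluates each quantity for given coefficients. The coefficients match the worked example later on this page.

```python
import numpy as np

def predict_prob(b0, b1, x):
    """P(y=1) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

def odds_ratio(b1):
    """OR = e^b1: multiplicative change in the odds per one-unit increase in x."""
    return np.exp(b1)

def decision_boundary(b0, b1, c=0.5):
    """x_cutoff = (-b0 - log((1-c)/c)) / b1; with c = 0.5 the log term is 0."""
    return (-b0 - np.log((1 - c) / c)) / b1

b0, b1 = -10.68, 0.15                 # illustrative coefficients
print(predict_prob(b0, b1, 70))       # ~0.455
print(odds_ratio(b1))                 # ~1.16
print(decision_boundary(b0, b1))      # 71.2
```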

Key Formulas for Multiple Predictors

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p

Odds Ratio:

\text{OR}_i = e^{\beta_i}

For the i-th predictor, representing the multiplicative change in the odds when x_i increases by one unit, holding the other predictors constant

Decision Boundary (for classification):

\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \log\left(\frac{1-c}{c}\right) = 0

where c is the probability cutoff (typically 0.5)

\beta_0 + \sum_{i=1}^{p} \beta_i x_i = 0

(simplified for c = 0.5)
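
As a sketch of how such a model can be fit in practice, the example below uses statsmodels on simulated data. This is one common choice, not necessarily how this calculator fits the model internally; all variable names and data here are made up for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Simulate data with two predictors (purely illustrative)
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 2))
true_beta = np.array([0.5, 1.2, -0.8])            # intercept, beta_1, beta_2
p = 1 / (1 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = rng.binomial(1, p)                            # binary outcome

X_design = sm.add_constant(X)                     # prepend the intercept column
result = sm.Logit(y, X_design).fit(disp=False)    # maximum-likelihood fit

print(result.params)           # estimated beta_0, beta_1, beta_2
print(np.exp(result.params))   # odds ratios OR_i = e^{beta_i}
print(result.summary())        # std. errors, z-values, p-values
```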

Key Assumptions

Binary outcome: The dependent variable is binary (0/1, success/failure, yes/no)
Independence: Observations are independent
No multicollinearity: Predictor variables are not highly correlated (for multiple logistic regression); a VIF check is sketched after this list
Linearity in the logit: The log-odds has a linear relationship with the predictor variables
Large sample size: Sufficient data to provide reliable estimates
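
A common way to screen for the multicollinearity assumption is the variance inflation factor (VIF). The sketch below computes VIFs with statsmodels on made-up data; values above roughly 5 to 10 are often taken as a warning sign. This is a diagnostic of convenience, not necessarily what this calculator runs.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))      # illustrative predictor matrix
X_design = sm.add_constant(X)      # VIF is computed on the full design matrix

# VIF for each predictor (index 0 is the intercept, so we skip it)
for i in range(1, X_design.shape[1]):
    print(f"x{i}: VIF = {variance_inflation_factor(X_design, i):.2f}")
```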

Practical Example of Logistic Regression with One Predictor

Step 1: Data

Consider a dataset of student exam scores and admission outcomes (1 = admitted, 0 = rejected):

Exam Score (X)    Admitted (Y)
35                0
42                0
57                0
78                1
93                1

Step 2: Fit Logistic Regression Model

After fitting a logistic regression model, we get:

\log\left(\frac{P(\text{admitted})}{1 - P(\text{admitted})}\right) = -10.68 + 0.15 \times \text{exam\_score}

Step 3: Interpret the Coefficients

The coefficient β₁ = 0.15 means that for each one-point increase in exam score, the log-odds of admission increase by 0.15.

Converting to odds ratio: OR = e^0.15 = 1.16

This means that for each one-point increase in exam score, the odds of admission increase by 16%.

Step 4: Calculate Probability for a New Student

For a student with an exam score of 70:

P(\text{admitted}) = \frac{1}{1 + e^{-(-10.68 + 0.15 \times 70)}} = \frac{1}{1 + e^{-(-10.68 + 10.5)}} = \frac{1}{1 + e^{0.18}} \approx 0.45

This student has a 45% probability of being admitted.
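
A quick check of this arithmetic (and of the odds ratio from Step 3) in Python:

```python
import math

b0, b1 = -10.68, 0.15                       # fitted coefficients from Step 2
print(math.exp(b1))                         # odds ratio: ~1.16
print(1 / (1 + math.exp(-(b0 + b1 * 70))))  # P(admitted | score = 70): ~0.455
```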

Step 5: Find the Decision Boundary

At what exam score is the probability of admission exactly 0.5?

\text{exam\_score} = \frac{-(-10.68)}{0.15} = 71.2

Students scoring above 71.2 are more likely to be admitted than rejected.
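
The cutoff can be verified numerically; at the boundary the predicted probability should be exactly 0.5:

```python
import math

b0, b1 = -10.68, 0.15
cutoff = -b0 / b1                               # log((1-c)/c) = 0 when c = 0.5
print(cutoff)                                   # 71.2
print(1 / (1 + math.exp(-(b0 + b1 * cutoff))))  # 0.5, confirming the boundary
```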

Performance Metrics

Confusion Matrix

A table comparing actual vs. predicted classifications:

                      Actual Positive        Actual Negative
Predicted Positive    True Positive (TP)     False Positive (FP)
Predicted Negative    False Negative (FN)    True Negative (TN)

Accuracy

Proportion of correct predictions: (TP + TN) / (TP + FP + FN + TN)

Sensitivity (Recall)

Proportion of actual positives correctly identified: TP / (TP + FN)

Specificity

Proportion of actual negatives correctly identified: TN / (TN + FP)

AUC (Area Under ROC Curve)

Measures the model's ability to distinguish between the two classes. An AUC of 0.5 indicates no discrimination (no better than random guessing), while an AUC of 1 indicates perfect discrimination.
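
As a sketch, these metrics can be computed with scikit-learn (a convenient choice; the calculator's own implementation is not specified here), using made-up labels and predicted probabilities:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])                   # actual classes (illustrative)
y_prob = np.array([0.2, 0.4, 0.6, 0.7, 0.9, 0.3, 0.1, 0.8])   # model probabilities
y_pred = (y_prob >= 0.5).astype(int)                          # classify at cutoff c = 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Accuracy:   ", (tp + tn) / (tp + fp + fn + tn))
print("Sensitivity:", tp / (tp + fn))   # recall for the positive class
print("Specificity:", tn / (tn + fp))
print("AUC:        ", roc_auc_score(y_true, y_prob))  # uses probabilities, not labels
```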
