
Logistic Regression

Created: April 17, 2025

This Logistic Regression Calculator helps you analyze binary outcome data and make classifications or predictions. It fits data to the model P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p)}}, providing comprehensive analysis including model coefficients, odds ratios, and performance metrics. Logistic regression is widely used in many fields, including medicine (disease diagnosis), marketing (customer conversion), and finance (credit scoring). You can analyze both simple and multiple logistic regression models with one or more predictor variables. To learn about the required data format and test this calculator, click here to populate the sample data.

Calculator

1. Load Your Data

Note: Column names will be converted to snake_case (e.g., "Product ID" → "product_id") for processing.

2. Select Columns & Options

The dependent variable should contain only 0 and 1 values

Learn More

Logistic Regression

Definition

Logistic Regression is a statistical method used to model the probability of a binary outcome based on one or more predictor variables. Unlike linear regression, logistic regression models the log-odds of an event as a linear combination of predictors, which keeps the predicted probabilities between 0 and 1.

Key Formulas for One Predictor

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x

Odds Ratio:

\text{OR} = e^{\beta_1}

Decision Boundary (for classification):

x_{\text{cutoff}} = \frac{-\beta_0 - \log\left(\frac{1-c}{c}\right)}{\beta_1}

where c is the probability cutoff (typically 0.5)
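
To make these formulas concrete, here is a minimal Python sketch (NumPy only; the function names are illustrative and not part of this calculator) that evaluates each quantity for given coefficients. The coefficients match the worked example later on this page.

```python
import numpy as np

def predict_prob(b0, b1, x):
    """P(y=1) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

def odds_ratio(b1):
    """OR = e^b1: multiplicative change in the odds per one-unit increase in x."""
    return np.exp(b1)

def decision_boundary(b0, b1, c=0.5):
    """x_cutoff = (-b0 - log((1-c)/c)) / b1; with c = 0.5 the log term is 0."""
    return (-b0 - np.log((1 - c) / c)) / b1

b0, b1 = -10.68, 0.15                 # illustrative coefficients
print(predict_prob(b0, b1, 70))       # ~0.455
print(odds_ratio(b1))                 # ~1.16
print(decision_boundary(b0, b1))      # 71.2
```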

Key Formulas for Multiple Predictors

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p

Odds Ratio:

\text{OR}_i = e^{\beta_i}

For the i-th predictor, representing the multiplicative change in the odds when x_i increases by one unit, holding the other predictors constant

Decision Boundary (for classification):

\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \log\left(\frac{1-c}{c}\right) = 0

where c is the probability cutoff (typically 0.5)

\beta_0 + \sum_{i=1}^{p} \beta_i x_i = 0

(simplified for c = 0.5)
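
As a sketch of how such a model can be fit in practice, the example below uses statsmodels on simulated data. This is one common choice, not necessarily how this calculator fits the model internally; all variable names and data here are made up for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Simulate data with two predictors (purely illustrative)
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 2))
true_beta = np.array([0.5, 1.2, -0.8])            # intercept, beta_1, beta_2
p = 1 / (1 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = rng.binomial(1, p)                            # binary outcome

X_design = sm.add_constant(X)                     # prepend the intercept column
result = sm.Logit(y, X_design).fit(disp=False)    # maximum-likelihood fit

print(result.params)           # estimated beta_0, beta_1, beta_2
print(np.exp(result.params))   # odds ratios OR_i = e^{beta_i}
print(result.summary())        # std. errors, z-values, p-values
```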

Key Assumptions

Binary outcome: The dependent variable is binary (0/1, success/failure, yes/no)
Independence: Observations are independent
No multicollinearity: Predictor variables are not highly correlated (for multiple logistic regression); a VIF check is sketched after this list
Linearity in the logit: The log-odds has a linear relationship with the predictor variables
Large sample size: Sufficient data to provide reliable estimates
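
A common way to screen for the multicollinearity assumption is the variance inflation factor (VIF). The sketch below computes VIFs with statsmodels on made-up data; values above roughly 5 to 10 are often taken as a warning sign. This is a diagnostic of convenience, not necessarily what this calculator runs.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))      # illustrative predictor matrix
X_design = sm.add_constant(X)      # VIF is computed on the full design matrix

# VIF for each predictor (index 0 is the intercept, so we skip it)
for i in range(1, X_design.shape[1]):
    print(f"x{i}: VIF = {variance_inflation_factor(X_design, i):.2f}")
```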

Practical Example of Logistic Regression with One Predictor

Step 1: Data

Consider a dataset of student exam scores and admission outcomes (1 = admitted, 0 = rejected):

Exam Score (X)    Admitted (Y)
35                0
42                0
57                0
78                1
93                1

Step 2: Fit Logistic Regression Model

After fitting a logistic regression model, we get:

\log\left(\frac{P(\text{admitted})}{1 - P(\text{admitted})}\right) = -10.68 + 0.15 \times \text{exam\_score}

Step 3: Interpret the Coefficients

The coefficient β₁ = 0.15 means that for each one-point increase in exam score, the log-odds of admission increase by 0.15.

Converting to odds ratio: OR = e^0.15 = 1.16

This means that for each one-point increase in exam score, the odds of admission increase by 16%.

Step 4: Calculate Probability for a New Student

For a student with an exam score of 70:

P(\text{admitted}) = \frac{1}{1 + e^{-(-10.68 + 0.15 \times 70)}} = \frac{1}{1 + e^{-(-10.68 + 10.5)}} = \frac{1}{1 + e^{0.18}} \approx 0.45

This student has a 45% probability of being admitted.
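
A quick check of this arithmetic (and of the odds ratio from Step 3) in Python:

```python
import math

b0, b1 = -10.68, 0.15                       # fitted coefficients from Step 2
print(math.exp(b1))                         # odds ratio: ~1.16
print(1 / (1 + math.exp(-(b0 + b1 * 70))))  # P(admitted | score = 70): ~0.455
```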

Step 5: Find the Decision Boundary

At what exam score is the probability of admission exactly 0.5?

\text{exam\_score} = \frac{-(-10.68)}{0.15} = 71.2

Students scoring above 71.2 are more likely to be admitted than rejected.
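
The cutoff can be verified numerically; at the boundary the predicted probability should be exactly 0.5:

```python
import math

b0, b1 = -10.68, 0.15
cutoff = -b0 / b1                               # log((1-c)/c) = 0 when c = 0.5
print(cutoff)                                   # 71.2
print(1 / (1 + math.exp(-(b0 + b1 * cutoff))))  # 0.5, confirming the boundary
```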

Performance Metrics

Confusion Matrix

A table comparing actual vs. predicted classifications:

                      Actual Positive        Actual Negative
Predicted Positive    True Positive (TP)     False Positive (FP)
Predicted Negative    False Negative (FN)    True Negative (TN)

Accuracy

Proportion of correct predictions: (TP + TN) / (TP + FP + FN + TN)

Sensitivity (Recall)

Proportion of actual positives correctly identified: TP / (TP + FN)

Specificity

Proportion of actual negatives correctly identified: TN / (TN + FP)

AUC (Area Under ROC Curve)

Measures the model's ability to distinguish between the two classes. An AUC of 0.5 indicates no discrimination (no better than random guessing), while an AUC of 1 indicates perfect discrimination.
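
As a sketch, these metrics can be computed with scikit-learn (a convenient choice; the calculator's own implementation is not specified here), using made-up labels and predicted probabilities:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])                   # actual classes (illustrative)
y_prob = np.array([0.2, 0.4, 0.6, 0.7, 0.9, 0.3, 0.1, 0.8])   # model probabilities
y_pred = (y_prob >= 0.5).astype(int)                          # classify at cutoff c = 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Accuracy:   ", (tp + tn) / (tp + fp + fn + tn))
print("Sensitivity:", tp / (tp + fn))   # recall for the positive class
print("Specificity:", tn / (tn + fp))
print("AUC:        ", roc_auc_score(y_true, y_prob))  # uses probabilities, not labels
```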
