A density plot visualizes the distribution of a continuous variable. It uses kernel density estimation (KDE) to draw a smooth curve that estimates the variable's probability density function. Density plots are useful for understanding the shape of your data, detecting patterns, and comparing distributions. To try the plot maker, load the sample dataset named "tips" and select the "total_bill" column.
Learn More
What is a Density Plot?
A density plot uses kernel density estimation (KDE) to visualize the distribution of a continuous variable. KDE centers a kernel (a small probability density function) on each data point and averages the kernels, so the resulting curve is smooth and the area under it equals 1. The curve is a continuous estimate of the probability distribution that generated your data.
Unlike histograms, which group values into discrete bins, density plots show a smooth estimate of the distribution, which can make skewness and multimodality easier to see.
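To make the "kernel at each data point" idea concrete, here is a minimal sketch in base R that builds a Gaussian KDE by hand and compares it with R's built-in density() function. The helper name kde_manual and the choice of the built-in faithful dataset are illustrative assumptions, not part of the plot maker.

```r
# Manual Gaussian KDE: at each grid location, average the normal densities
# centred on every observation (hypothetical helper, for illustration only).
kde_manual <- function(x, grid, bw) {
  sapply(grid, function(g) mean(dnorm(g, mean = x, sd = bw)))
}

x    <- faithful$eruptions   # built-in dataset, used here only as an example
bw   <- bw.nrd0(x)           # default bandwidth rule used by density()
grid <- seq(min(x) - 3 * bw, max(x) + 3 * bw, length.out = 512)

dens_manual  <- kde_manual(x, grid, bw)
dens_builtin <- density(x, bw = bw, n = 512)  # fast binned approximation

# Area under the manual curve (trapezoidal rule); should be close to 1
area <- sum(diff(grid) * (head(dens_manual, -1) + tail(dens_manual, -1)) / 2)

# Largest pointwise gap between the manual and built-in estimates
max_diff <- max(abs(dens_manual - dens_builtin$y))

plot(dens_builtin, main = "Built-in vs. manual KDE", lwd = 3, col = "grey70")
lines(grid, dens_manual, lty = 2, col = "red")
```

The two curves should overlap almost exactly. density() works on binned data for speed, so small numerical differences are expected.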
Understanding Kernel Types
Common Kernel Functions
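- Gaussian: Bell-shaped and smooth; its tails never reach exactly zero, so every data point contributes at every location
- Epanechnikov: Parabolic shape; optimal in the sense of minimizing mean integrated squared error (asymptotically)
- Triangular: Weight decreases linearly with distance from the data point
- Uniform: Rectangular shape; every point inside the window gets equal weight, which gives a blockier curve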
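The kernels listed above can be compared on the same data with a short base R sketch. It assumes the built-in faithful dataset as example data. In density(), the uniform kernel is called "rectangular", and each kernel is scaled so that the bandwidth is its standard deviation.

```r
x <- faithful$eruptions   # example data only; any numeric vector works

kernels <- c("gaussian", "epanechnikov", "triangular", "rectangular")

# Fit one density estimate per kernel, all with the same default bandwidth
fits <- lapply(kernels, function(k) density(x, kernel = k))
names(fits) <- kernels

# Overlay the four estimates
plot(fits$gaussian, main = "Same data, different kernels", lwd = 2)
for (i in 2:4) lines(fits[[i]], col = i, lty = i)
legend("topleft", legend = kernels, col = 1:4, lty = 1:4, bty = "n")

# Approximate area under each curve (Riemann sum on the evenly spaced grid)
areas <- sapply(fits, function(f) sum(f$y) * (f$x[2] - f$x[1]))
```

In practice the kernel choice usually changes the curve much less than the bandwidth does. The rectangular kernel tends to produce the most visibly jagged estimate.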
Choosing a Bandwidth
- Higher bandwidth = smoother curve (may miss features)
- Lower bandwidth = more details (may show noise)
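- Automatic methods (such as Silverman's rule of thumb) pick a bandwidth from the sample size and spread of the data; they assume a roughly normal shape, so they can oversmooth multimodal data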
- Experiment with different values for your specific dataset
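As a rough illustration of these trade-offs, the sketch below computes Silverman's rule of thumb by hand, checks it against bw.nrd0() (the default rule in base R's density()), and then scales that bandwidth down and up with the adjust argument. The faithful dataset and the multipliers 0.25 and 4 are arbitrary choices for the example.

```r
x <- faithful$eruptions   # example data only

# Silverman's rule of thumb, written out and via the built-in helper
bw_manual <- 0.9 * min(sd(x), IQR(x) / 1.34) * length(x)^(-1 / 5)
bw_auto   <- bw.nrd0(x)

# adjust multiplies the automatic bandwidth: < 1 is wigglier, > 1 is smoother
fits <- list(
  narrow  = density(x, adjust = 0.25),
  default = density(x, adjust = 1),
  wide    = density(x, adjust = 4)
)

plot(fits$narrow, main = "Effect of bandwidth", col = "red")
lines(fits$default, lwd = 2)
lines(fits$wide, col = "blue", lty = 2)
legend("topleft", legend = names(fits),
       col = c("red", "black", "blue"), lty = c(1, 1, 2), bty = "n")
```

On this dataset the wide setting should flatten the curve noticeably, while the narrow setting shows small bumps that are likely noise.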
Creating Density Plots in R
R provides excellent tools for creating density plots. Here's an example using ggplot2:
library(ggplot2)

# load tips dataset
tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# density plot with rug plot and normal curve
ggplot(tips, aes(x = total_bill)) +
  geom_density(fill = "skyblue", alpha = 0.5) +
  geom_rug(alpha = 0.5) +
  # reference normal curve with the sample mean and standard deviation
  stat_function(
    fun = dnorm,
    args = list(mean = mean(tips$total_bill), sd = sd(tips$total_bill)),
    linetype = "dashed", color = "red"
  ) +
  labs(title = "Density Plot of Total Bill Amount",
       x = "Total Bill ($)",
       y = "Density") +
  theme_minimal()
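This code creates a density plot for the 'total_bill' variable from a restaurant tips dataset. The rug marks along the x-axis show the individual observations. The red dashed line represents a normal distribution with the same mean and standard deviation, which gives a visual reference for how far the data depart from normality.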
How to Interpret Density Plots
Shape Characteristics
- Symmetry: A symmetric, bell-shaped curve is consistent with a normal distribution, though it does not prove normality
- Skewness: Longer tails on one side indicate skewed data
- Peaks: Multiple peaks suggest multimodal data
- Spread: Width indicates variability in the data
Common Patterns
- Normal: Symmetric, bell-shaped curve
- Right-skewed: Longer tail on right side (positive skew)
- Left-skewed: Longer tail on left side (negative skew)
- Bimodal: Two distinct peaks (suggests two subgroups)
- Uniform: Relatively flat distribution
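A visual impression of skew can be cross-checked with a number. The sketch below uses simulated data (an assumption made purely for illustration) and a small hand-written skewness helper, since base R does not ship one. Positive values indicate a longer right tail, negative values a longer left tail.

```r
set.seed(42)
right_skewed <- rexp(1000)    # simulated data with a long right tail
symmetric    <- rnorm(1000)   # simulated roughly normal data

# Sample skewness (hypothetical helper): third central moment over sd cubed
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_right <- skewness(right_skewed)
skew_sym   <- skewness(symmetric)
skew_left  <- skewness(-right_skewed)   # mirroring the data flips the sign

# Compare the two shapes on one plot
plot(density(symmetric), main = "Symmetric vs. right-skewed", lwd = 2,
     xlim = c(-4, 8), ylim = c(0, 1))
lines(density(right_skewed), col = "red", lty = 2, lwd = 2)
legend("topright", legend = c("symmetric", "right-skewed"),
       col = c("black", "red"), lty = c(1, 2), bty = "n")
```

Note that the smoothed curve for the right-skewed sample extends slightly below zero even though the simulated values are all positive. This is a side effect of the kernel smoothing, not a feature of the data.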
When to Use Density Plots
Density plots are particularly useful in these situations:
- Examining the distribution shape of continuous variables
- Comparing distributions across different groups
- Identifying multimodality (multiple peaks) in your data
- Assessing normality assumptions visually
- Exploring the probability density of continuous data
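For the group-comparison use case, one possible approach in base R is to fit a separate density per group and overlay the curves. The sketch below assumes the built-in iris dataset as stand-in data and lets each group use its own automatically chosen bandwidth.

```r
# Split the measurements by group (built-in iris data, for illustration only)
groups <- split(iris$Sepal.Length, iris$Species)

# One density estimate per group, each with its own default bandwidth
fits <- lapply(groups, density)

# Common axis limits so that no curve is clipped
xlim <- range(sapply(fits, function(f) range(f$x)))
ylim <- c(0, max(sapply(fits, function(f) max(f$y))))

plot(fits[[1]], xlim = xlim, ylim = ylim, col = 1, lwd = 2,
     main = "Sepal length by species", xlab = "Sepal length (cm)")
for (i in 2:length(fits)) lines(fits[[i]], col = i, lwd = 2)
legend("topright", legend = names(fits), col = seq_along(fits),
       lwd = 2, bty = "n")

# Location of the highest point of each curve
modes <- sapply(fits, function(f) f$x[which.max(f$y)])
```

With ggplot2, mapping the grouping column to the fill or color aesthetic in geom_density() produces a similar overlay.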