What Is R-Squared?
R-squared (R²) is a statistical measure that represents the proportion of variance in a dependent variable that is predictable from one or more independent variables. In simple terms, it tells you how well your model explains the data.
For an ordinary least squares model with an intercept, evaluated on the data it was fitted to, R² values range between 0 and 1:
- 0 means the independent variables explain none of the variation in the dependent variable.
- 1 means they explain all the variation.
In regression analysis, R-squared provides a “goodness of fit” score—showing how close the data points are to the fitted regression line. The closer the points are to the line, the higher the R².
R-Squared Formula
The most common formula for R-squared is: $R^2 = 1 - \frac{RSS}{TSS}$
Where:
- R² = Coefficient of determination
- RSS (Residual Sum of Squares) = Unexplained variation
- TSS (Total Sum of Squares) = Total variation
This formula essentially compares how much variation remains unexplained by the model (RSS) to the total variation (TSS). Subtracting the ratio from 1 gives the proportion of explained variation.
How to Calculate R-Squared
To calculate R² manually:
- Find the mean of the dependent variable (Ȳ).
- Compute TSS – subtract Ȳ from each actual Y value, square the results, and sum them.
- Compute RSS – subtract each predicted Y value (Ŷ) from its actual Y, square the differences, and sum them.
- Apply the formula: $R^2 = 1 - \frac{RSS}{TSS}$
- Interpret the result: A higher R² means the model accounts for a larger share of the variation in the dependent variable.
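The steps above can be sketched in a few lines of Python. This is a minimal illustration using NumPy; the function name `r_squared` and the four data points are made up for the example and are not taken from any real dataset.

```python
import numpy as np

def r_squared(y_actual, y_predicted):
    """Return R^2 = 1 - RSS/TSS for paired actual and predicted values."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    y_mean = y_actual.mean()                       # step 1: mean of Y
    tss = np.sum((y_actual - y_mean) ** 2)         # step 2: total sum of squares
    rss = np.sum((y_actual - y_predicted) ** 2)    # step 3: residual sum of squares
    return 1.0 - rss / tss                         # step 4: apply the formula

# Made-up illustrative values: TSS = 20, RSS = 1
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(y, y_hat)
print(round(r2, 4))  # approximately 0.95
```

Here the predictions miss each actual value by 0.5, so RSS is 1 against a TSS of 20, and the model accounts for about 95% of the variation.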
What Does R-Squared Tell You?
R-squared answers the question: “How well does my model explain the data?”
- R² = 0.80 (80%) means 80% of the variance in the dependent variable can be explained by the model’s predictors.
- R² = 0.30 (30%) means only 30% of the variance is explained — the rest is due to unknown or random factors.
However, R² alone doesn’t indicate whether the model is accurate or unbiased—you must check other diagnostics, such as residual plots, p-values, and adjusted R-squared.
Example of R-Squared in Action
Suppose you build a linear regression model predicting house prices based on square footage.
- If R² = 0.92, then 92% of the variation in house prices can be explained by the size of the house.
- If R² = 0.25, the model explains only 25% of the variation—implying other factors (like location or age) might play a larger role.
How to Interpret R-Squared
| R² Value | Interpretation |
|---|---|
| 0% – 30% | Weak fit – model explains little of the variation |
| 30% – 60% | Moderate fit – model explains some of the variation |
| 60% – 90% | Strong fit – model explains most of the variation |
| 90% – 100% | Very strong fit – model explains nearly all variation (could be overfitted) |
⚠️ Note: A “good” R² value depends on context. In finance, an R² above 0.7 is strong, while in social sciences, even 0.5 may be acceptable.
R-Squared vs Adjusted R-Squared
- R-Squared never decreases when new variables are added to the model, and it usually increases, even if they’re irrelevant.
- Adjusted R-Squared corrects for this by penalizing the number of predictors. It increases only if a new variable improves the fit by more than would be expected by chance.
Thus, adjusted R² is a more reliable measure for multiple regression models.
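To make the difference concrete, the sketch below fits the same synthetic data with and without an irrelevant predictor. It assumes the usual adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The helper names `fit_r2` and `adjusted_r2` and the random data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_r2(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(residuals ** 2) / tss

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 3.0 * x + rng.normal(size=n)     # synthetic response driven by x only
noise = rng.normal(size=n)           # predictor unrelated to y

r2_one = fit_r2(x.reshape(-1, 1), y)
r2_two = fit_r2(np.column_stack([x, noise]), y)

print("one predictor: ", r2_one, adjusted_r2(r2_one, n, 1))
print("two predictors:", r2_two, adjusted_r2(r2_two, n, 2))
```

The plain R² for the two-predictor model cannot be lower than for the one-predictor model, while the adjusted value is free to fall when the extra column adds nothing useful.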
R-Squared vs Beta
While both are used in finance, they measure different things:
- R-Squared shows how closely an asset’s returns track a benchmark.
- Beta shows how strongly the asset moves relative to the benchmark.
Example:
- A fund with R² = 95% and Beta = 1.2 moves almost in line with the market, and its returns tend to change by about 1.2% for every 1% change in the benchmark (roughly 20% larger swings).
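Both figures can be estimated from the same pair of return series. The sketch below assumes the standard definitions: beta is the covariance of fund and benchmark returns divided by the benchmark variance, and R² is the squared correlation between the two. The return series are simulated, with the fund deliberately constructed to have a beta near 1.2; they do not represent any real fund.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated daily returns (hypothetical, for illustration only)
benchmark = rng.normal(0.0, 0.01, size=250)
fund = 1.2 * benchmark + rng.normal(0.0, 0.002, size=250)

# Beta: sensitivity of fund returns to benchmark returns
beta = np.cov(fund, benchmark)[0, 1] / np.var(benchmark, ddof=1)

# R^2: share of the fund's return variance explained by the benchmark
r2 = np.corrcoef(fund, benchmark)[0, 1] ** 2

print("beta:", beta)
print("R^2: ", r2)
```

Beta here equals the slope of a simple regression of fund returns on benchmark returns, while R² describes how tightly the points cluster around that regression line.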
Applications of R-Squared
R-squared is used across various fields:
- Finance: To assess how closely a mutual fund tracks its benchmark index.
- Economics: To measure how well GDP, inflation, or employment models explain outcomes.
- Marketing: To evaluate the success of ad spending or pricing models.
- Science & Engineering: To test model accuracy in predicting experimental outcomes.
- Sports Analytics: To assess how well player statistics predict performance results.
Limitations of R-Squared
While R² is useful, it’s not perfect:
- It doesn’t indicate causation—only correlation.
- A high R² can result from overfitting, where the model fits noise rather than signal.
- A low R² doesn’t always mean a bad model, especially in fields with high natural variability (like psychology or human behavior).
- It cannot detect bias in predictions—you must check residual plots for that.
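The last point can be illustrated with a small sketch: a straight line fitted to data that is actually quadratic. The data is made up (y = x² for x from 0 to 10), and the fit uses NumPy's `polyfit`.

```python
import numpy as np

x = np.arange(11, dtype=float)          # 0, 1, ..., 10
y = x ** 2                              # truly quadratic, made-up data

slope, intercept = np.polyfit(x, y, 1)  # fit a straight line anyway
residuals = y - (slope * x + intercept)

r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))                     # about 0.928
print(residuals.round(2))               # positive at both ends, negative in the middle
```

R² is above 0.9, yet the residuals follow a clear U-shape: the line underpredicts at both ends and overpredicts in the middle. The R² value alone would not reveal that systematic bias, but a residual plot would.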
Improving Your R-Squared Value
If your model’s R² is too low, consider these steps:
- Add relevant predictors — use domain knowledge or feature selection techniques.
- Remove irrelevant or redundant variables to avoid noise. This will not raise R² itself, but it can raise adjusted R² and improve the fit on new data.
- Transform variables — use logarithmic or polynomial terms if relationships are nonlinear.
- Check for multicollinearity using VIF (Variance Inflation Factor).
- Validate your model using cross-validation or test datasets.
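The VIF check ties directly back to R²: the VIF of a predictor is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing that predictor on all the others. The sketch below computes it with NumPy only; the function name `vif` and the synthetic predictors are assumptions made for this example.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X, where VIF_j = 1 / (1 - R_j^2)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        rss = np.sum((target - others @ coef) ** 2)
        tss = np.sum((target - target.mean()) ** 2)
        out.append(tss / rss)            # TSS / RSS equals 1 / (1 - R_j^2)
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)     # nearly a copy of x1
x3 = rng.normal(size=200)                # unrelated to x1 and x2

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

The two nearly identical predictors get very large VIFs, while the independent one stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, though the exact cutoff is a judgment call.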
Can R-Squared Be Negative?
For an ordinary least squares model with an intercept, evaluated on the data it was fitted to, no: R² ranges from 0 to 1.
However, R² can be negative when the model has no constant term or when it is scored on data it was not fitted to. A negative value, which can fall far below zero, means the model fits worse than a horizontal line at the mean of the dependent variable (no predictive power).
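A tiny made-up example shows how the formula R² = 1 − RSS/TSS goes negative when predictions are worse than simply guessing the mean:

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])
bad_predictions = np.array([3.0, 2.0, 1.0])   # deliberately poor, made-up predictions

tss = np.sum((y - y.mean()) ** 2)             # 2.0
rss = np.sum((y - bad_predictions) ** 2)      # 8.0
r2 = 1.0 - rss / tss
print(r2)                                     # -3.0
```

Because the residual sum of squares is four times the total sum of squares, R² comes out at −3, well outside the 0 to 1 range.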
What Is a “Good” R-Squared Value?
It depends on your field:
- Finance: Above 0.7 = high correlation; below 0.4 = low correlation.
- Economics: Around 0.6–0.8 is generally acceptable.
- Social Sciences: Even 0.3–0.5 can be considered good due to human variability.
- Physics or Engineering: Above 0.9 is expected for strong predictive models.
R-Squared in Finance Example
If a mutual fund’s R² = 0.95, it moves almost perfectly in line with its benchmark index—making it ideal for investors seeking index-like returns.
If the R² = 0.50, half of the variance in the fund’s returns is not explained by the benchmark, which may suit active managers aiming for returns that are less tied to the index.
The Bottom Line
R-squared is a key measure in regression analysis, showing how well a model fits observed data.
A high R² suggests your model explains more variability, but it doesn’t guarantee accuracy or reliability.
Always interpret R² alongside adjusted R², residuals, and domain knowledge to ensure your conclusions are valid.
R² helps analysts, investors, and researchers quantify model performance—but wisdom lies in knowing what it doesn’t tell you.




