1. Describing Financial Data: The Core Building Blocks
Before we can build sophisticated models, we must first learn how to describe a single set of financial data, such as the monthly returns of a particular stock. These descriptive statistics are the fundamental building blocks of all financial analysis.
1.1. The Mean: Finding the Center
The mean is simply the average of a set of data points, calculated by summing all the values and dividing by the number of observations. Its primary role is to provide a single number that represents the “center” or “expected” value of the data. For an investor, the mean is a crucial first look at an asset’s performance. For example, if you calculate the mean of a stock’s monthly returns over the past year, that figure gives you a baseline for what you might expect in a typical month going forward.
1.2. Variance and Standard Deviation: Measuring Risk and Volatility
While the mean tells us about the expected return, it tells us nothing about the risk involved. That’s where variance and standard deviation come in.
Variance is a measure of how spread out the data points are around their mean. It is calculated by averaging the squared differences of each data point from the mean. Squaring the differences is a critical step; it ensures that deviations both above and below the mean contribute positively to the measure of spread, and it gives greater weight to larger deviations.
The standard deviation is the positive square root of the variance. The primary advantage of using the standard deviation is that it is expressed in the same units as the mean. If a stock’s average monthly return is 1%, a standard deviation of 2% is immediately understandable, whereas a variance of 0.0004 is not.
For a finance student, this is the key insight: Standard deviation is the most common way to quantify an asset’s risk or volatility. A stock with a high standard deviation has returns that are widely dispersed around the mean, indicating a higher degree of uncertainty and risk. A stock with a low standard deviation has more predictable returns and is therefore considered less risky.
Now that we can describe a single asset, let’s explore how we can measure the relationship between two different assets.