Calculate Capital Asset Pricing Model (CAPM) with Python


Capital Asset Pricing Model (CAPM)

CAPM is one of the most commonly used formulas in finance. It describes the relationship between systematic risk and expected return for assets, particularly stocks. The model is based on the relationship between an asset’s beta, the risk-free rate (typically the Treasury bill rate) and the equity risk premium.

The CAPM formula is as below:

$$r_i = r_f + \beta_i(r_m - r_f)$$

Where:

  • $r_i$ is the expected return of a security
  • $r_f$ is the risk-free rate
  • $\beta_i$ is the beta of the security relative to the market
  • $r_m$ is the return of the market portfolio, which includes all securities in the market. A good representation of the U.S. market portfolio is the S&P 500.

The goal of the CAPM formula is to gauge whether the current price of a stock is consistent with its likely return.
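
As a quick illustration of the formula above, here is a minimal sketch that evaluates it for a few betas. The function name `capm_expected_return` and the input values (a 3% risk-free rate and an 8% market return) are hypothetical and chosen only for the example.

```python
def capm_expected_return(r_f, beta, r_m):
    """Expected return r_i = r_f + beta * (r_m - r_f)."""
    return r_f + beta * (r_m - r_f)

# Hypothetical inputs, in percent per year (not taken from market data)
r_f = 3.0
r_m = 8.0

for beta in (0.5, 1.0, 1.5):
    print(f'beta = {beta}: expected return = {capm_expected_return(r_f, beta, r_m)}%')
```

With a beta of exactly 1 the expected return equals the market return, and a higher beta scales up the premium earned over the risk-free rate.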

Beta measures how much risk an investment adds to a portfolio that looks like the market, or equivalently how volatile a portfolio is relative to the overall market. A beta less than 1 indicates that the portfolio is less volatile than the market, so it reduces the risk of a portfolio. Conversely, a portfolio that is riskier than the market has a beta greater than 1.
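
To make the interpretation of beta concrete, the sketch below uses the standard definition of beta as the covariance of the asset return with the market return divided by the variance of the market return. That definition is not spelled out in this article, and the return series are synthetic, so treat this as an illustration rather than a result. The helper `beta_from_cov` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns in percent (illustrative only, not real market data)
market = rng.normal(0.05, 1.0, 1000)
defensive = 0.6 * market + rng.normal(0, 0.5, 1000)   # built to move less than the market
aggressive = 1.4 * market + rng.normal(0, 0.5, 1000)  # built to move more than the market

def beta_from_cov(asset, market):
    """Beta as Cov(asset, market) / Var(market)."""
    cov = np.cov(asset, market)  # 2x2 covariance matrix
    return cov[0, 1] / cov[1, 1]

for name, series in [('defensive', defensive), ('aggressive', aggressive)]:
    b = beta_from_cov(series, market)
    label = 'less volatile than the market' if b < 1 else 'more volatile than the market'
    print(f'{name}: beta = {b:.2f} ({label})')
```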

Load the list of S&P 500 companies from Wikipedia

The S&P 500 is a stock market index tracking the stock performance of 500 of the largest companies listed on stock exchanges in the United States.

# Get the list of S&P 500 stocks from Wikipedia
import pandas as pd

def load_data(url):
    html = pd.read_html(url, header=0)
    return html

# Load the list of S&P 500 companies
url = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
df = load_data(url)[0]
df.head()

   Symbol  Security     GICS Sector             GICS Sub-Industry               Headquarters Location    Date added  CIK      Founded
0  MMM     3M           Industrials             Industrial Conglomerates        Saint Paul, Minnesota    1957-03-04  66740    1902
1  AOS     A. O. Smith  Industrials             Building Products               Milwaukee, Wisconsin     2017-07-26  91142    1916
2  ABT     Abbott       Health Care             Health Care Equipment           North Chicago, Illinois  1957-03-04  1800     1888
3  ABBV    AbbVie       Health Care             Biotechnology                   North Chicago, Illinois  2012-12-31  1551152  2013 (1888)
4  ACN     Accenture    Information Technology  IT Consulting & Other Services  Dublin, Ireland          2011-07-06  1467373  1989

Retrieve the latest stock price using yfinance

We retrieve the daily stock prices for the past year using yfinance.

import warnings
warnings.filterwarnings('ignore')
import yfinance as yf

data = yf.download(
            # tickers list
            tickers = list(df['Symbol']) + ['^GSPC'],
            # valid periods: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max
            period = '1y',
            # valid intervals: 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo
            interval = '1d',
            # group by ticker
            group_by = 'ticker',
            # adjust all OHLC automatically
            auto_adjust = True,
            # download pre/post regular market hours data
            prepost = True,
            # use threads for mass downloading
            threads = True,
            # proxy URL scheme when downloading
            proxy = None
            )
[*********************100%%**********************]  504 of 504 completed

2 Failed downloads:
['BF.B']: Exception('%ticker%: No price data found, symbol may be delisted (period=1y)')
['BRK.B']: Exception('%ticker%: No data found, symbol may be delisted')
data['^GSPC'].head()

Price              Open         High          Low        Close      Volume
Date
2023-02-27  3992.360107  4018.050049  3973.550049  3982.239990  3836950000
2023-02-28  3977.189941  3997.500000  3968.979980  3970.149902  5043400000
2023-03-01  3963.340088  3971.729980  3939.050049  3951.389893  4249480000
2023-03-02  3938.679932  3990.840088  3928.159912  3981.350098  4244900000
2023-03-03  3998.020020  4048.290039  3995.169922  4045.639893  4084730000

# Get the available tickers
tickers_unavailable = ['BF.B', 'BRK.B']
tickers = [ticker for ticker in list(df['Symbol']) + ['^GSPC'] if ticker not in tickers_unavailable]
print('Number of available tickers: ', len(tickers))
Number of available tickers:  502
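
The two failed downloads are both share-class tickers written with a dot. Yahoo Finance generally writes such tickers with a hyphen (for example `BRK-B` rather than `BRK.B`), so one possible alternative to dropping them is to convert the symbols before downloading. This is a sketch under that assumption. The helper `to_yahoo_symbol` is a hypothetical name and is not used elsewhere in this article, which continues with the 502 available tickers.

```python
def to_yahoo_symbol(symbol):
    """Convert a dotted share-class symbol such as 'BRK.B' to 'BRK-B'."""
    return symbol.replace('.', '-')

# A small sample of symbols as they appear in the Wikipedia table
symbols = ['MMM', 'BRK.B', 'BF.B']
yahoo_symbols = [to_yahoo_symbol(s) for s in symbols]
print(yahoo_symbols)
```
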
# Use Close as the price for each stock and create a new dataframe to store this data.
import numpy as np
# Collect the Close column of every available ticker into one dataframe indexed by Date
stock = pd.concat({ticker: data[ticker]['Close'] for ticker in tickers}, axis=1)
stock.index.name = 'Date'

stock.head()

                   MMM         AOS         ABT        ABBV         ACN        ADBE         AMD         AES         AFL           A  ...         GWW        WYNN         XEL         XYL         YUM        ZBRA         ZBH        ZION         ZTS       ^GSPC
Date
2023-02-27  101.759285   64.206650   97.783730  148.282318  262.200623  322.320007   78.769997   23.996420   66.643791  141.154373  ...  668.472412  104.074387   63.236515  101.250412  124.182861  296.230011  122.350578   47.859604  163.987595 3982.239990
2023-02-28  101.261154   64.511368   99.694916  147.917068  261.511230  323.950012   78.580002   23.803526   66.565651  140.945908  ...  661.946350  107.271660   62.433975  101.349144  124.761726  300.250000  122.916191   48.078060  165.503891 3970.149902
2023-03-01  103.582634   65.631935   98.822639  149.233841  259.581085  323.380005   78.290001   23.755301   66.536346  136.518066  ...  664.501221  111.082634   61.176975   99.808891  123.819832  302.339996  121.139977   47.907097  166.068771 3951.389893
2023-03-02  103.291267   66.487114  100.586800  148.378433  261.225677  333.500000   80.440002   23.764944   65.989365  140.648071  ...  677.860413  112.260574   62.211575  100.954201  126.253059  306.059998  122.042961   45.884064  167.069748 3981.350098
2023-03-03  104.569481   66.968750  102.370560  149.993103  265.105774  344.040009   81.519997   24.208611   66.848907  142.891754  ...  690.823425  114.656044   62.946434  102.603035  127.224380  309.450012  125.248100   46.757858  169.031982 4045.639893

5 rows × 502 columns

Normalize and visualize the stock data

# Normalize stock data based on initial price
def normalize(df):
    # Divide every column by its first-row value so that each series starts at 1
    return df / df.iloc[0]

stock_normalized = normalize(stock)
stock_normalized.head()

                 MMM       AOS       ABT      ABBV       ACN      ADBE       AMD       AES       AFL         A  ...       GWW      WYNN       XEL       XYL       YUM      ZBRA       ZBH      ZION       ZTS     ^GSPC
Date
2023-02-27  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  ...  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
2023-02-28  0.995105  1.004746  1.019545  0.997537  0.997371  1.005057  0.997588  0.991962  0.998827  0.998523  ...  0.990237  1.030721  0.987309  1.000975  1.004661  1.013570  1.004623  1.004565  1.009246  0.996964
2023-03-01  1.017918  1.022198  1.010625  1.006417  0.990009  1.003289  0.993906  0.989952  0.998388  0.967154  ...  0.994059  1.067339  0.967431  0.985763  0.997077  1.020626  0.990105  1.000992  1.012691  0.992253
2023-03-02  1.015055  1.035518  1.028666  1.000648  0.996282  1.034686  1.021201  0.990354  0.990180  0.996413  ...  1.014044  1.078657  0.983792  0.997074  1.016671  1.033184  0.997486  0.958722  1.018795  0.999777
2023-03-03  1.027616  1.043019  1.046908  1.011537  1.011080  1.067386  1.034912  1.008843  1.003078  1.012308  ...  1.033436  1.101674  0.995413  1.013359  1.024492  1.044627  1.023682  0.976980  1.030761  1.015921

5 rows × 502 columns
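
To see what the normalization does on a small scale, here is a minimal sketch with two made-up tickers (`AAA` and `BBB` are hypothetical, as are their prices). Dividing by the first row puts both series on a common starting value of 1, which makes stocks with very different price levels comparable on one chart.

```python
import pandas as pd

# Made-up prices for two hypothetical tickers
prices = pd.DataFrame(
    {'AAA': [50.0, 55.0, 60.0], 'BBB': [200.0, 190.0, 210.0]},
    index=pd.to_datetime(['2023-01-02', '2023-01-03', '2023-01-04']),
)

# Divide each column by its first value
normalized = prices / prices.iloc[0]
print(normalized)
```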

# Plot the normalized price of several selected stocks
import plotly.express as px
fig = px.line(title = 'Normalized Prices')
for ticker in ['AMD','AAPL','GOOG', 'TSLA','IBM']:
    fig.add_scatter(x=stock_normalized.index, y=stock_normalized[ticker], name=ticker)
fig.show()

Figure: Normalized prices of the selected stocks

Calculate the daily return

The daily return for each stock is calculated as follows:

  • Loop over each row of the normalized stock prices
  • Calculate the percentage change from the previous day
  • Set the first row to 0, as there is no previous value

def daily_return(df):
    # Percentage change of each row relative to the previous row, in percent
    df_daily_return = df.pct_change(fill_method=None) * 100
    # The first row has no previous value, so set it to 0
    df_daily_return.iloc[0] = 0
    return df_daily_return

stock_daily_return = daily_return(stock_normalized)
stock_daily_return.head()

                  MMM        AOS        ABT       ABBV        ACN       ADBE        AMD        AES        AFL          A  ...        GWW       WYNN        XEL        XYL        YUM       ZBRA        ZBH       ZION        ZTS      ^GSPC
Date
2023-02-27   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000  ...   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
2023-02-28  -0.489519   0.474589   1.954503  -0.246320  -0.262925   0.505710  -0.241202  -0.803845  -0.117251  -0.147686  ...  -0.976265   3.072104  -1.269108   0.097513   0.466139   1.357050   0.462289   0.456452   0.924641  -0.303600
2023-03-01   2.292567   1.737008  -0.874946   0.890210  -0.738074  -0.175955  -0.369052  -0.202598  -0.044023  -3.141518  ...   0.385963   3.552638  -2.013327  -1.519749  -0.754955   0.696085  -1.445062  -0.355595   0.341309  -0.472526
2023-03-02  -0.281289   1.302992   1.785178  -0.573200   0.633556   3.129444   2.746202   0.040595  -0.822080   3.025244  ...   2.010409   1.060418   1.691158   1.147502   1.965136   1.230403   0.745406  -4.222826   0.602748   0.758219
2023-03-03   1.237485   0.724405   1.773354   1.088211   1.485343   3.160422   1.342608   1.866895   1.302548   1.595246  ...   1.912342   2.133848   1.181226   1.633250   0.769345   1.107631   2.626238   1.904353   1.174500   1.614774

5 rows × 502 columns
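
The daily return calculation can be checked on a tiny made-up series. The sketch below compares the explicit row-by-row formula with pandas' `pct_change`, and also shows that the result is the same whether it is computed from raw or normalized prices, since the constant first-day price cancels out in the ratio. The prices are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up prices for one hypothetical ticker
prices = pd.Series([100.0, 102.0, 99.96], name='AAA')

# Explicit formula: (today - yesterday) / yesterday * 100, with 0 for the first row
manual = [0.0] + [
    (prices.iloc[j] - prices.iloc[j - 1]) / prices.iloc[j - 1] * 100
    for j in range(1, len(prices))
]

# Vectorized version of the same calculation
vectorized = prices.pct_change(fill_method=None).fillna(0) * 100

# Same calculation starting from normalized prices
from_normalized = (prices / prices.iloc[0]).pct_change(fill_method=None).fillna(0) * 100

print(vectorized.round(6).tolist())
```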

# Plot the daily return for the selected stocks
fig = px.line(title = 'Daily Return')
for ticker in ['AMD','AAPL','GOOG', 'TSLA','IBM']:
    fig.add_scatter(x=stock_daily_return.index, y=stock_daily_return[ticker], name=ticker)
fig.show()

Figure: Daily returns of the selected stocks

Calculate Beta for AAPL

Beta is the slope of the regression line of the stock's return against the market return, which can be calculated using np.polyfit.

# S&P 500 has ticker symbol ^GSPC

beta, alpha = np.polyfit(stock_daily_return['^GSPC'], stock_daily_return['AAPL'], 1)
beta
1.0716621967198638
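
The same regression can be repeated for every column of a returns dataframe. The following is a minimal sketch on synthetic data, where the true slopes are known in advance, so it is easy to check that `np.polyfit` returns the slope first and the intercept second. The helper `compute_betas` and the column names `LOW`, `HIGH` and `MKT` are hypothetical. With real data, any missing values would need to be dropped first.

```python
import numpy as np
import pandas as pd

def compute_betas(returns, market_col):
    """Regress each column on the market column and return the slopes."""
    market = returns[market_col]
    betas = {}
    for ticker in returns.columns:
        if ticker == market_col:
            continue
        # polyfit with degree 1 returns [slope, intercept]
        slope, intercept = np.polyfit(market, returns[ticker], 1)
        betas[ticker] = slope
    return pd.Series(betas, name='beta')

# Synthetic daily returns in percent, built with known slopes 0.5 and 1.5
rng = np.random.default_rng(42)
market = rng.normal(0.05, 1.0, 250)
returns = pd.DataFrame({
    'LOW': 0.5 * market + rng.normal(0, 0.3, 250),
    'HIGH': 1.5 * market + rng.normal(0, 0.3, 250),
    'MKT': market,
})

betas = compute_betas(returns, 'MKT')
print(betas.round(2))
```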

Calculate CAPM for AAPL

# Daily return for the market
stock_daily_return['^GSPC'].mean()
0.09994347738081993
# Annualize the market return by multiplying the mean daily return by the number of trading days in a year (252)
r_m = stock_daily_return['^GSPC'].mean() * 252
print('Return of the market: ', r_m)
Return of the market:  25.185756299966624
# The expected return can be calculated using the CAPM formula
# Assume the risk-free rate is 0
r_f = 0
r_e_AAPL = r_f + beta * (r_m - r_f)
print('Expected return: ', r_e_AAPL)
Expected return:  26.99062292247338

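This estimate is clearly not realistic, for two main reasons:

  • The calculation assumes a risk-free rate of 0 and treats the market return of a single year (about 25%) as the expected market return.
  • CAPM assumes that markets are competitive and efficient and that investors are rational and risk-averse, while in practice market conditions can change very fast.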
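
To get a feel for how much the risk-free rate assumption matters, the sketch below reuses the beta and annualized market return computed above and plugs in a few risk-free rates. The rates of 2% and 4% are hypothetical values chosen for illustration, not actual Treasury bill rates.

```python
beta = 1.0716621967198638   # AAPL beta computed above
r_m = 25.185756299966624    # annualized market return in percent, computed above

def capm(r_f, beta, r_m):
    """Expected return r_i = r_f + beta * (r_m - r_f)."""
    return r_f + beta * (r_m - r_f)

for r_f in (0.0, 2.0, 4.0):  # hypothetical risk-free rates in percent
    print(f'r_f = {r_f:.1f}%  ->  expected return = {capm(r_f, beta, r_m):.2f}%')
```

Because the beta here is only slightly above 1, the result changes little with the risk-free rate. The estimate is driven mostly by the one-year market return.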

Reference:

https://www.investopedia.com/terms/c/capm.asp

https://www.mlq.ai/capital-asset-pricing-model-python/


Author: wenvenn
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: wenvenn.