Project II - Portfolio optimization
We consider the problem of choosing a long-term stock portfolio, given a set of stocks and their prices over some period, under a risk aversion parameter γ > 0.
Assume there are m stocks to be considered. The portfolio will be represented by a column vector w ∈ ℝ^m such that ∑_{i=1}^{m} w_i = 1. If w_i > 0, you use a fraction w_i of your total money to buy the i-th stock, while w_i < 0 represents shorting that stock. In both cases we assume the stock is bought/shorted for the entire period.
Let p_{j,i} represent the price of the i-th stock at time step j. If there are n + 1 time steps, then p ∈ ℝ^{(n+1)×m} is a matrix.
We let r ∈ ℝ^{n×m} be the matrix where r_{j,i} represents the fractional reward of stock i at time step j, i.e. r_{j,i} = (p_{j+1,i} − p_{j,i}) / p_{j,i} for 1 ≤ j ≤ n.
By r_j we denote the j-th row of r, viewed as a column vector (r_{j,1}, …, r_{j,m}).
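The definition of r translates into a single vectorized expression. The following is a minimal sketch, assuming the prices are already available as a NumPy array of shape (n + 1, m); the function name fractional_returns and the example prices are made up for illustration. Note that the code uses 0-based indices, while the text counts time steps from 1.

```python
import numpy as np

def fractional_returns(p):
    """Return r with r[j, i] = (p[j+1, i] - p[j, i]) / p[j, i].

    p is an (n + 1) x m price matrix, so the result has shape n x m.
    """
    p = np.asarray(p, dtype=float)
    # p[1:] holds the prices at steps 1..n, p[:-1] those at steps 0..n-1.
    return (p[1:] - p[:-1]) / p[:-1]

# Made-up prices for two stocks over three time steps (illustration only).
prices = np.array([[100.0, 50.0],
                   [110.0, 45.0],
                   [121.0, 45.0]])
r = fractional_returns(prices)
```

Each row of the resulting array corresponds to one vector r_j.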
We make the (unrealistic) assumption that we can model r by a random variable, distributed as a multivariate Gaussian, with estimated means
μ ≃ (1/n) · ∑_{j=1}^{n} r_j
and estimated covariance matrix
Σ ≃ (1/n) · ∑_{j=1}^{n} (r_j − μ)(r_j − μ)^T
Note that μ_i and Σ_{i,i} are the estimated mean and variance for stock i.
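The two estimators can be computed directly from the return matrix. The sketch below assumes r is an n × m NumPy array; the function name estimate_moments and the sample returns are hypothetical. It uses the 1/n normalization from the formulas above, whereas numpy.cov divides by n − 1 unless bias=True is passed.

```python
import numpy as np

def estimate_moments(r):
    """Estimate the mean vector and covariance matrix with 1/n normalization."""
    r = np.asarray(r, dtype=float)
    n = r.shape[0]
    mu = r.sum(axis=0) / n              # average of the rows r_j
    centered = r - mu                   # row j becomes (r_j - mu)
    # Sum over j of the outer products (r_j - mu)(r_j - mu)^T, divided by n.
    sigma = centered.T @ centered / n
    return mu, sigma

# Made-up returns for two stocks over three time steps (illustration only).
r = np.array([[0.10, -0.10],
              [0.10, 0.00],
              [-0.05, 0.04]])
mu, sigma = estimate_moments(r)
```

The diagonal of sigma then holds the per-stock variances Σ_{i,i}.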
The distribution of returns using some w is then
R_w = N(μ_w, σ_w^2)
μ_w = w^T μ
σ_w^2 = w^T Σ w
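To make these two quantities concrete, here is a small sketch that evaluates the portfolio mean and variance for a given w and then the corresponding Gaussian density on a grid. The helper names portfolio_moments and gaussian_pdf are hypothetical, and the numbers are invented rather than estimated from real prices. The density is written out by hand here; scipy.stats.norm.pdf could be used instead, taking the standard deviation (not the variance) as its scale.

```python
import numpy as np

def portfolio_moments(w, mu, sigma):
    """Return (mu_w, var_w) = (w^T mu, w^T Sigma w)."""
    w = np.asarray(w, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(w @ mu), float(w @ sigma @ w)

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x (var is the variance)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-(x - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

# Invented estimates for two stocks (illustration only).
mu = np.array([0.02, 0.04])
sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
w = np.array([0.5, 0.5])                # weights sum to 1

mu_w, var_w = portfolio_moments(w, mu, sigma)
std_w = np.sqrt(var_w)
xs = np.linspace(mu_w - 4 * std_w, mu_w + 4 * std_w, 401)
density = gaussian_pdf(xs, mu_w, var_w)  # values of the pdf of R_w on the grid
```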
Now we want to strike a balance between high return μ_w and low risk σ_w^2. This leads to the following optimization problem, where we want to find the value w* of w that maximizes the following expression:
maximize w^T μ − γ w^T Σ w
subject to ∑_{i=1}^{m} w_i = 1
where γ controls the balance between risk and return. A high value of γ indicates that we are willing to take only low risk, and vice versa.
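One possible way to solve this problem, assuming Σ is positive definite, is through its Lagrange conditions. Introducing a multiplier λ for the constraint and setting the gradient of w^T μ − γ w^T Σ w − λ(∑ w_i − 1) to zero gives the linear system 2γΣw + λ1 = μ together with 1^T w = 1, where 1 is the all-ones vector. The sketch below solves that system with numpy.linalg.solve; the function name solve_portfolio and the input numbers are hypothetical, and a general-purpose constrained optimizer would be an equally valid choice.

```python
import numpy as np

def solve_portfolio(mu, sigma, gamma):
    """Solve max w^T mu - gamma * w^T Sigma w subject to sum(w) = 1.

    Assumes sigma is symmetric positive definite, so the Lagrange
    (KKT) system below has a unique solution.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    m = mu.shape[0]
    ones = np.ones(m)

    # Block system: [2*gamma*Sigma  1] [w  ]   [mu]
    #               [1^T            0] [lam] = [1 ]
    kkt = np.zeros((m + 1, m + 1))
    kkt[:m, :m] = 2.0 * gamma * sigma
    kkt[:m, m] = ones
    kkt[m, :m] = ones
    rhs = np.concatenate([mu, [1.0]])

    solution = np.linalg.solve(kkt, rhs)
    return solution[:m]                 # drop the multiplier, keep w*

# Invented inputs for two uncorrelated stocks (illustration only).
mu = np.array([0.1, 0.0])
sigma = np.eye(2)
w_star = solve_portfolio(mu, sigma, gamma=1.0)
```

No sign constraint is placed on w, which is consistent with the problem statement allowing short positions.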
In this project you should find w* for different values of γ, using real stock values of your choice. The project consists of the following three questions.
- We need a module for collecting stock values. For this you can use the module pandas-datareader (pip install pandas-datareader setuptools). Using this you should write a function get_prices([stock_1, ..., stock_k], step_size, period) that returns a tuple (stocks, p), where p[j, i] represents the opening price of stock i at time step j and stocks[i] is the name of the i-th stock (adjust the arguments to get_prices to the data available at your data source). Make a plot of p, where each stock is labeled with its name, e.g. MSFT or GOGL. You should use at least five stocks.
- Calculate r, μ and Σ using the formulas above and the p calculated in the first question. Plot the probability density function (pdf) of the return of each stock.
  Hint. The method norm.pdf from the module scipy.stats might become convenient.
- Solve the optimization problem defined above for different values of γ, e.g. gammas = (np.arange(10) / 5) + 1, and plot the pdf of each solution to a single plot with appropriate legends. Finally create a scatter plot of how w* changes as γ changes. For each value of γ plot the fraction of each stock in the portfolio.
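As a rough sketch of how the sweep over γ in the last question could be organized, the code below computes w* for every value in gammas and collects the results in one array. It assumes Σ is positive definite and uses the closed form w* = Σ^{-1}(μ − λ1) / (2γ) with λ = (1^T Σ^{-1} μ − 2γ) / (1^T Σ^{-1} 1), which follows from the Lagrange conditions of the problem. The function name optimal_weights and the values of mu and sigma are invented; in the project they would come from the estimates of the second question.

```python
import numpy as np

def optimal_weights(mu, sigma, gamma):
    """Closed-form maximizer of w^T mu - gamma * w^T Sigma w with sum(w) = 1.

    Assumes sigma is symmetric positive definite.
    """
    ones = np.ones(len(mu))
    sinv_mu = np.linalg.solve(sigma, mu)      # Sigma^{-1} mu
    sinv_one = np.linalg.solve(sigma, ones)   # Sigma^{-1} 1
    lam = (ones @ sinv_mu - 2.0 * gamma) / (ones @ sinv_one)
    return (sinv_mu - lam * sinv_one) / (2.0 * gamma)

# Invented estimates for three stocks (illustration only).
mu = np.array([0.010, 0.020, 0.015])
sigma = np.array([[0.040, 0.006, 0.002],
                  [0.006, 0.090, 0.010],
                  [0.002, 0.010, 0.060]])

gammas = (np.arange(10) / 5) + 1              # 1.0, 1.2, ..., 2.8

# weights[k, i] is the fraction of stock i in the portfolio for gammas[k].
weights = np.array([optimal_weights(mu, sigma, g) for g in gammas])
means = weights @ mu                          # mu_w for each gamma
variances = np.array([w @ sigma @ w for w in weights])  # sigma_w^2 for each gamma
```

Each column of weights can then be plotted against gammas to show how the share of one stock changes, and each pair (means[k], variances[k]) defines the Gaussian whose pdf is drawn for gammas[k].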