Quantum Finance Application on Portfolio Optimization
Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved.
Overview
Quantum algorithms relevant to finance fall mainly into three areas: quantum simulation, quantum optimization, and quantum machine learning [1,2]. Many financial problems are essentially combinatorial optimization problems, and the corresponding classical algorithms usually have high time complexity and are difficult to implement. Quantum algorithms are expected to help solve some of these complex problems in the future.
The Quantum Finance module of Paddle Quantum focuses on quantum optimization: how to apply quantum algorithms in real finance optimization problems. This tutorial focuses on how to use quantum algorithms to solve the portfolio optimization problem.
Portfolio Optimization Problem
A portfolio is a collection of financial investments, such as stocks, bonds, cash, etc. Many investment managers face the portfolio optimization problem, in which practitioners allocate investments across various projects according to their target returns and risks. The aim is to minimize the risk for a given return or to maximize the return for a given risk.
A detailed description of portfolio optimization is as follows: If you are an active investment manager who wants to invest $K$ dollars in $N$ projects, each with its own return and risk, your goal is to find an optimal way to invest in the projects, taking into account the market impact and transaction costs.
Encoding Portfolio Optimization Problem
To transform the portfolio optimization problem into a form suitable for parameterized quantum circuits, we need to encode it into a Hamiltonian. To keep the model easy to formulate, two assumptions are made to constrain the problem:
- Each asset is invested with an equal amount of money.
- Budget is a multiple of each investment amount and must be fully spent.
In this model we unitize the investment amount, i.e., if the budget is $3$, then the manager should invest in $3$ assets. Since the actual investment budget is limited and there are many investable assets, the number of investable assets should be set larger than the budget.
In the theory of portfolio optimization, the overall risk of a portfolio is related to the covariance between assets, which is proportional to the correlation coefficients of any two assets. The smaller the correlation coefficients, the smaller the covariance, and then the smaller the overall risk of the portfolio [3].
Here we use the mean-variance approach to model this problem:
$$ \omega = \max_{x \in \{0,1\}^{n}} \mu^{T} x - q x^{T} S x \quad \text{subject to: } \mathbb{1}^{T} x = B, \tag{1} $$
where each symbol has the following meaning:
- $x \in \{0,1\}^n$ denotes the vector of binary decision variables, which indicate whether each asset is picked ($x_i=1$) or not ($x_i = 0$),
- $\mu \in \mathbb{R}^n$ defines the expected returns for the assets,
- $S \in \mathbb{R}^{n \times n}$ represents the covariances between the assets,
- $q > 0$ represents the risk factor of investment decision making,
- $\mathbb{1}$ denotes a vector with all values of $1$,
- $B$ denotes the budget, i.e. the number of assets to be selected out of the $n$ available.
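For a handful of assets, Eq. (1) can be solved classically by enumerating every selection of exactly $B$ assets. The following is a minimal sketch of that brute-force check, not part of Paddle Quantum; the function name `brute_force_portfolio` and the toy values of $\mu$ and $S$ are hypothetical and chosen only for illustration.

```python
from itertools import combinations

import numpy as np


def brute_force_portfolio(mu, S, q, budget):
    """Enumerate every selection of `budget` assets and return the best one."""
    n = len(mu)
    best_x, best_value = None, -np.inf
    for chosen in combinations(range(n), budget):
        x = np.zeros(n)
        x[list(chosen)] = 1.0
        value = mu @ x - q * (x @ S @ x)  # objective of Eq. (1)
        if value > best_value:
            best_x, best_value = x.astype(int), value
    return best_x, best_value


# Hypothetical toy data (not taken from the tutorial's data set)
mu_toy = np.array([0.10, 0.20, 0.15])
S_toy = np.array([[0.05, 0.01, 0.00],
                  [0.01, 0.06, 0.02],
                  [0.00, 0.02, 0.04]])

x_best, w_best = brute_force_portfolio(mu_toy, S_toy, q=0.5, budget=2)
print(x_best, w_best)
```

The number of candidate selections grows combinatorially with the number of assets, which is why such an enumeration is only practical as a sanity check on small instances.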
According to the model equation, we can define the loss function:
$$ C_x = q \sum_i \sum_j s_{ij}x_ix_j - \sum_{i}x_i \mu_i + A \left(B - \sum_i x_i\right)^2, \tag{2} $$
where $s_{ij}$ denotes the elements of the covariance matrix $S$.
Because the loss function is minimized by gradient descent, Eq. (2) modifies Eq. (1) in two ways: the sign of the objective is flipped, so that maximizing $\omega$ becomes minimizing $C_x$, and the budget constraint is moved into the objective as a penalty term. The first term represents the risk of the investment. The second term represents the expected return on this investment. The third term penalizes any selection whose number of chosen assets differs from the budget $B$. $A$ is the penalty parameter, usually set to a large number.
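To make the three terms of Eq. (2) concrete, the sketch below evaluates $C_x$ for a selection that meets the budget and for one that violates it. The helper `portfolio_cost` and the toy numbers are hypothetical and are not taken from the tutorial's data.

```python
import numpy as np


def portfolio_cost(x, mu, S, q, budget, penalty):
    """Evaluate C_x of Eq. (2) for a binary selection vector x."""
    x = np.asarray(x, dtype=float)
    risk = q * (x @ S @ x)                               # first term
    expected_return = mu @ x                             # second term
    budget_violation = penalty * (budget - x.sum()) ** 2  # third term
    return risk - expected_return + budget_violation


# Hypothetical toy data
mu_toy = np.array([0.10, 0.20, 0.15])
S_toy = np.array([[0.05, 0.01, 0.00],
                  [0.01, 0.06, 0.02],
                  [0.00, 0.02, 0.04]])

# Budget of 2: the first selection is feasible, the second picks one asset too many
cost_ok = portfolio_cost([0, 1, 1], mu_toy, S_toy, q=0.5, budget=2, penalty=3.0)
cost_bad = portfolio_cost([1, 1, 1], mu_toy, S_toy, q=0.5, budget=2, penalty=3.0)
print(cost_ok, cost_bad)
```

With a sufficiently large penalty, every selection that violates the budget costs more than every feasible one, so the minimizer of $C_x$ respects the constraint.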
We now need to transform the cost function $C_x$ into a Hamiltonian to realize the encoding of the portfolio optimization problem. Each variable $x_{i}$ has two possible values, $0$ and $1$, corresponding to quantum states $|0\rangle$ and $|1\rangle$. Note that every variable corresponds to a qubit, so $n$ qubits are needed for solving the portfolio optimization problem. The Pauli $Z$ operator has two eigenstates, $|0\rangle$ and $|1\rangle$, with corresponding eigenvalues $1$ and $-1$, respectively. So we consider encoding the cost function as a Hamiltonian using the Pauli $Z$ matrix.
Now we would like to consider the mapping $$ x_{i} \mapsto \frac{I-Z_{i}}{2}, \tag{3} $$
where $Z_{i} = I \otimes I \otimes \ldots \otimes Z \otimes \ldots \otimes I$ with $Z$ acting on the qubit at position $i$. Under this mapping, the value of $x_i$ can be read off from the state of qubit $i$. If qubit $i$ is in state $|1\rangle$, then $x_{i} |1\rangle = \frac{I-Z_{i}}{2} |1\rangle = 1|1\rangle $, which means that stock $i$ is included in the portfolio. Likewise, for qubit $i$ in state $|0\rangle$, $x_{i}|0\rangle = \frac{I-Z_{i}}{2} |0\rangle = 0 |0\rangle $, which means that stock $i$ is not included.
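As a short worked step, the products appearing in the cost function transform as follows under this mapping, using $Z_i^2 = I$:

$$ x_i x_j \mapsto \frac{(I-Z_i)(I-Z_j)}{4} = \frac{I - Z_i - Z_j + Z_i Z_j}{4} \quad (i \neq j), \qquad x_i^2 \mapsto \frac{(I-Z_i)^2}{4} = \frac{I - Z_i}{2}. $$

The second identity mirrors the classical fact that $x_i^2 = x_i$ for a binary variable, so the resulting Hamiltonian contains only identity, single-$Z$, and $Z_iZ_j$ terms.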
Thus, using the above mapping, we can transform the cost function $C_x$ into a Hamiltonian $H_C$ for the system of $n$ qubits and obtain a quantum formulation of the portfolio optimization problem. The ground state of $H_C$ then encodes the optimal solution to the portfolio optimization problem. In the following section, we will show how to use a parameterized quantum circuit to find the ground state, i.e., the eigenvector with the smallest eigenvalue.
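The encoding can be checked numerically on a small instance. The sketch below builds $H_C$ as a dense matrix with plain NumPy by substituting $(I - Z_i)/2$ for each $x_i$ in Eq. (2), then confirms that each diagonal entry equals $C_x$ for the matching bit string. It does not use Paddle Quantum; the helper names, the toy data, and the convention that qubit 0 is the leftmost tensor factor are all assumptions made for illustration.

```python
from functools import reduce
from itertools import product

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])


def number_operator(i, n):
    """Return (I - Z_i) / 2 on n qubits, with qubit 0 as the leftmost factor."""
    factors = [(I2 - Z) / 2 if k == i else I2 for k in range(n)]
    return reduce(np.kron, factors)


def cost_hamiltonian(mu, S, q, budget, penalty):
    """Build H_C by substituting x_i -> (I - Z_i) / 2 into Eq. (2)."""
    n = len(mu)
    X = [number_operator(i, n) for i in range(n)]
    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n):
        H -= mu[i] * X[i]
        for j in range(n):
            H += q * S[i, j] * (X[i] @ X[j])
    deviation = budget * np.eye(2 ** n) - sum(X)
    return H + penalty * (deviation @ deviation)


def portfolio_cost(x, mu, S, q, budget, penalty):
    """Classical cost C_x of Eq. (2), used here only for comparison."""
    x = np.asarray(x, dtype=float)
    return q * (x @ S @ x) - mu @ x + penalty * (budget - x.sum()) ** 2


# Hypothetical toy data
mu_toy = np.array([0.10, 0.20, 0.15])
S_toy = np.array([[0.05, 0.01, 0.00],
                  [0.01, 0.06, 0.02],
                  [0.00, 0.02, 0.04]])
H = cost_hamiltonian(mu_toy, S_toy, q=0.5, budget=2, penalty=3.0)

# Every computational basis state |x> is an eigenstate with eigenvalue C_x
for bits in product([0, 1], repeat=3):
    index = int("".join(map(str, bits)), 2)
    assert np.isclose(H[index, index],
                      portfolio_cost(bits, mu_toy, S_toy, 0.5, 2, 3.0))

ground_bits = format(int(np.argmin(np.diag(H))), "03b")
print(ground_bits)
```

Because $H_C$ contains only $I$ and $Z$ operators it is diagonal in the computational basis, so its ground state is a single bit string rather than a superposition.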
Paddle Quantum Implementation
To investigate the portfolio optimization problem using Paddle Quantum, there are some required packages to import, which are shown below.
# Import packages needed
import numpy as np
import pandas as pd
import datetime
# Import related modules from Paddle Quantum and PaddlePaddle
import paddle
import paddle_quantum
from paddle_quantum.ansatz import Circuit
from paddle_quantum.finance import DataSimulator, portfolio_optimization_hamiltonian
Prepare experimental data
In this tutorial, we choose stocks as the investment assets. Two options are provided for the data used in the experimental tests:
- The first method is to generate random data according to certain requirements, e.g. number of assets.
If the user prepares data using this method, then when initializing the data, it is necessary to give the list of parameters: a list of names of investable stocks (assets), the start date, and the end date of the trading data.
num_assets = 7 # Number of investable projects
stocks = [("STOCK%s" % i) for i in range(num_assets)]
data = DataSimulator( stocks = stocks, start = datetime.datetime(2016, 1, 1), end = datetime.datetime(2016, 1, 30))
data.randomly_generate() # Generate random data
- The second method is for users to supply their own data, e.g. real stock data they have collected. Since the file may contain a large number of stocks, the user can specify how many of them to use in this experiment via num_assets, as initialized above.
We collect the closing prices of 12 stocks for 35 trading days in the file realStockData_12.csv, from which we choose to read only the first num_assets (here 7) stocks.
In this tutorial, we choose to read real data as experimental data.
df = pd.read_csv('realStockData_12.csv')
dt = []
for i in range(num_assets):
    mylist = df['closePrice'+str(i)].tolist()
    dt.append(mylist)
# Output the closing price of the seven stocks read from the file for the 35 trading days
print(dt)
# Specify the experimental data as a local file read by the user
data.set_data(dt)
[[16.87, 17.18, 17.07, 17.15, 16.66, 16.79, 16.69, 16.99, 16.76, 16.52, 16.33, 16.39, 16.45, 16.0, 16.09, 15.54, 13.99, 14.6, 14.63, 14.77, 14.62, 14.5, 14.79, 14.77, 14.65, 15.03, 15.37, 15.2, 15.24, 15.59, 15.58, 15.23, 15.04, 14.99, 15.11, 14.5], [32.56, 32.05, 31.51, 31.76, 31.68, 32.2, 31.46, 31.68, 31.39, 30.49, 30.53, 30.46, 29.87, 29.21, 30.11, 28.98, 26.63, 27.62, 27.64, 27.9, 27.5, 28.67, 29.08, 29.08, 29.95, 30.8, 30.42, 29.7, 29.65, 29.85, 29.25, 28.9, 29.33, 30.11, 29.67, 29.59], [5.4, 5.48, 5.46, 5.49, 5.39, 5.47, 5.46, 5.53, 5.5, 5.47, 5.39, 5.35, 5.37, 5.24, 5.26, 5.08, 4.57, 4.44, 4.5, 4.56, 4.52, 4.59, 4.66, 4.67, 4.66, 4.72, 4.84, 4.81, 4.84, 4.88, 4.89, 4.82, 4.74, 4.84, 4.79, 4.63], [3.71, 3.75, 3.73, 3.79, 3.72, 3.77, 3.76, 3.74, 3.78, 3.71, 3.61, 3.58, 3.61, 3.53, 3.5, 3.42, 3.08, 2.95, 3.04, 3.05, 3.05, 3.13, 3.12, 3.14, 3.11, 3.07, 3.23, 3.3, 3.31, 3.3, 3.33, 3.31, 3.22, 3.31, 3.25, 3.12], [5.72, 5.75, 5.74, 5.81, 5.69, 5.79, 5.77, 5.8, 5.89, 5.78, 5.7, 5.69, 5.75, 5.7, 5.71, 5.54, 4.99, 4.89, 4.94, 5.08, 5.39, 5.35, 5.23, 5.26, 5.19, 5.18, 5.31, 5.33, 5.31, 5.38, 5.39, 5.41, 5.28, 5.3, 5.38, 5.12], [7.62, 7.56, 7.68, 7.75, 7.79, 7.84, 7.82, 7.8, 7.92, 7.96, 7.93, 7.87, 7.86, 7.82, 7.9, 7.7, 6.93, 6.91, 7.18, 7.31, 7.35, 7.53, 7.47, 7.48, 7.35, 7.33, 7.46, 7.47, 7.39, 7.47, 7.48, 8.06, 8.02, 8.01, 8.11, 7.87], [3.7, 3.7, 3.68, 3.7, 3.63, 3.66, 3.63, 3.63, 3.66, 3.63, 3.6, 3.59, 3.63, 3.6, 3.61, 3.54, 3.19, 3.27, 3.27, 3.31, 3.3, 3.32, 3.33, 3.38, 3.36, 3.34, 3.39, 3.39, 3.37, 3.42, 3.43, 3.37, 3.32, 3.36, 3.37, 3.3]]
Encoding Hamiltonian
Here we construct the Hamiltonian $H_C$ of Eq. (2) with the replacement in Eq. (3).
To encode the Hamiltonian, we first need to calculate the covariance matrix $S$ between the returns of each stock. This calculation is available in the finance module and can be called directly.
s = data.get_asset_return_covariance_matrix()
The second step is to compute the expected return vector $\mu$ for each stock. Paddle Quantum provides a function for this as well.
mu = data.get_asset_return_mean_vector()
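For intuition about what these two quantities measure, the sketch below estimates a mean-return vector and a return covariance matrix from a price table using simple period-over-period returns. It is an illustrative assumption about the kind of computation involved, not a statement of how the `DataSimulator` methods are implemented (the return convention and normalization the library uses may differ); `simple_returns` and the toy prices are hypothetical.

```python
import numpy as np


def simple_returns(prices):
    """Convert a (num_assets, num_days) price array into period-over-period returns."""
    prices = np.asarray(prices, dtype=float)
    return (prices[:, 1:] - prices[:, :-1]) / prices[:, :-1]


# Hypothetical closing prices for two assets over three days
prices_toy = [[10.0, 11.0, 12.1],
              [20.0, 19.0, 19.0]]

returns = simple_returns(prices_toy)
mu_est = returns.mean(axis=1)  # one expected return per asset
S_est = np.cov(returns)        # rows are treated as variables (assets)
print(mu_est)
print(S_est)
```

Each row of `returns` holds one asset's return series, so `np.cov` yields an $n \times n$ symmetric matrix whose diagonal entries are the per-asset variances.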
Based on the provided and calculated parameters, the Hamiltonian is constructed below. Here we set the penalty parameter to the number of investable stocks.
q = 0.5 # risk appetite of the decision maker
budget = num_assets // 2 # budget
penalty = num_assets # penalty parameter
hamiltonian = portfolio_optimization_hamiltonian(penalty, mu, s, q, budget)
Calculating the loss function
We adopt a parameterized quantum circuit consisting of $U_3(\vec{\theta})$ and $\text{CNOT}$ gates. It can be constructed by calling the built-in method complex_entangled_layer().
After running the quantum circuit, we obtain the circuit output $|\vec{\theta }\rangle$. From the output state of the circuit, we can calculate the loss function of the portfolio optimization under the classical-quantum hybrid model:
$$ L(\vec{\theta}) = \langle\vec{\theta}|H_C|\vec{\theta}\rangle. \tag{4} $$
We then use a classical optimization algorithm to minimize this function and find the optimal parameters $\vec{\theta}^*$. The following code shows a complete network built with Paddle Quantum and PaddlePaddle.
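Minimizing Eq. (4) is justified by the variational principle: for any normalized state, the expectation value of $H_C$ is bounded below by its smallest eigenvalue. The NumPy sketch below illustrates this on a hypothetical two-qubit diagonal Hamiltonian; the matrix entries and the helper `expectation` are made up for illustration and are unrelated to the tutorial's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical diagonal cost Hamiltonian on 2 qubits (illustrative values only)
H_toy = np.diag([3.0, -0.2, -0.5, 2.0])


def expectation(state, hamiltonian):
    """Return <state|H|state> for a normalized state vector."""
    return float(np.real(np.vdot(state, hamiltonian @ state)))


# A random normalized state stands in for the circuit output
state = rng.normal(size=4) + 1j * rng.normal(size=4)
state /= np.linalg.norm(state)

loss_value = expectation(state, H_toy)
ground_energy = np.linalg.eigvalsh(H_toy)[0]
print(loss_value, ground_energy)
```

The bound is attained exactly when the state is a ground state, which is why driving the loss toward its minimum steers the circuit output toward the optimal bit string.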
class PONet(paddle.nn.Layer):
    def __init__(self, num_qubits, p, dtype="float64"):
        super(PONet, self).__init__()
        self.depth = p
        self.num_qubits = num_qubits
        self.cir = Circuit(self.num_qubits)
        self.cir.complex_entangled_layer(depth=self.depth)

    def forward(self):
        """
        Forward propagation
        """
        state = self.cir(init_state)
        loss = loss_func(state)
        return loss, self.cir
Training the quantum neural network
After defining the quantum neural network, we use the gradient descent method to update the parameters to minimize the expectation value in Eq. (4).
SEED = 1000 # Set a global RNG seed
p = 2 # Number of layers in the quantum circuit
ITR = 600 # Number of training iterations
LR = 0.4 # Learning rate of the optimization method based on gradient descent
Here, we optimize the network defined above in PaddlePaddle.
# number of qubits
num_qubits = len(mu)
# Fix paddle random seed
paddle.seed(SEED)
# Building Quantum Neural Networks
net = PONet(num_qubits, p)
# Define initial state
init_state = paddle_quantum.state.zero_state(num_qubits)
# Define loss function
loss_func = paddle_quantum.loss.ExpecVal(hamiltonian)
# Use Adam optimizer
opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())
# Gradient descent iteration
for itr in range(1, ITR + 1):
    # Run the network defined above
    loss, cir = net()
    # Calculate the gradient and optimize
    loss.backward()
    opt.minimize(loss)
    opt.clear_grad()
    if itr % 50 == 0:
        print("iter: ", itr, " loss: ", "%.7f"% loss.numpy())
iter: 50 loss: 0.0399189
iter: 100 loss: 0.0098760
iter: 150 loss: 0.0085572
iter: 200 loss: 0.0074596
iter: 250 loss: 0.0066504
iter: 300 loss: 0.0061929
iter: 350 loss: 0.0059874
iter: 400 loss: 0.0059097
iter: 450 loss: 0.0058763
iter: 500 loss: 0.0058761
iter: 550 loss: 0.0058756
iter: 600 loss: 0.0058689
Theoretical minimum loss value
The theoretical minimum value of $C_x$ corresponds to the minimum eigenvalue of the Hamiltonian constructed above. We would therefore like the loss value found by the parameterized circuit optimization to be close to this theoretical minimum. For smaller num_assets, we can verify this with the following code.
H_C_matrix = hamiltonian.construct_h_matrix()
print("Theoretical minimum loss value: ", np.linalg.eigvalsh(H_C_matrix)[0])
print("Practical minimum loss value: ", float(loss.numpy()))
Theoretical minimum loss value: 0.0058722496
Practical minimum loss value: 0.0058689117431640625
In this case, the minimum loss from the parameterized circuit optimization agrees with the theoretical minimum loss to within about $3 \times 10^{-6}$, which indicates that the investment solution found is optimal or very close to it. If the two values do not match well, we can adjust parameters such as the random seed SEED, the number of layers of the quantum circuit p, the number of iterations ITR, and the learning rate LR, and then rerun the optimization to approach the optimal solution.
Decoding the quantum solution
After obtaining the minimum value of the loss function and the corresponding set of parameters $\vec{\theta}^*$, our task has not been completed. To obtain an approximate solution to the portfolio optimization problem, it is necessary to decode the solution to the classical optimization problem from the quantum state $|\vec{\theta}^*\rangle$ output by the circuit. Physically, to decode a quantum state, we need to measure it and then calculate the probability distribution of the measurement results:
$$ p(z) = |\langle z|\vec{\theta}^*\rangle|^2. \tag{5} $$
When the parameterized quantum circuit is sufficiently expressive, the greater the probability of a certain bit string, the more likely it is to correspond to an optimal solution to the portfolio optimization problem.
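The sampling described by Eq. (5) can be imitated with NumPy alone. The sketch below draws computational-basis outcomes from a state vector and picks the most frequent bit string. It only illustrates the idea and is not the Paddle Quantum measurement routine; `sample_bitstrings` and the toy state are hypothetical, and the leftmost bit is assumed to correspond to qubit 0.

```python
from collections import Counter

import numpy as np


def sample_bitstrings(state, shots, seed=0):
    """Sample computational-basis outcomes with probabilities |<z|state>|^2."""
    state = np.asarray(state, dtype=complex)
    num_qubits = int(np.log2(len(state)))
    probabilities = np.abs(state) ** 2
    probabilities /= probabilities.sum()  # guard against rounding drift
    rng = np.random.default_rng(seed)
    outcomes = rng.choice(len(state), size=shots, p=probabilities)
    return Counter(format(int(k), f"0{num_qubits}b") for k in outcomes)


# Hypothetical 2-qubit state concentrated on |10>
state_toy = np.array([0.1, 0.1, np.sqrt(0.97), 0.1])
counts = sample_bitstrings(state_toy, shots=2048)
most_likely = max(counts, key=counts.get)
print(most_likely, dict(counts))
```

With a finite number of shots the observed frequencies only approximate $p(z)$, so a clearly dominant bit string is more trustworthy than one that wins by a narrow margin.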
Paddle Quantum provides a function to read the probability distribution of the measurement results of the state output by the quantum circuit:
# Repeat the simulated measurement of the circuit output state 2048 times
final_state = cir(init_state)
prob_measure = final_state.measure(shots=2048)
investment = max(prob_measure, key=prob_measure.get)
print("The bit string form of the solution: ", investment)
The bit string form of the solution: 0100110
The result of our measurement is a bit string that represents the solution to the portfolio optimization problem: a $1$ at the $i$th bit indicates that the $i$th asset was selected for investment. For example, the result 0100110 above indicates that the second, fifth, and sixth stocks were selected out of the seven available investments. The number of $1$s in the string should equal the budget $B$. If it does not, users can adjust the training parameters or the structure of the parameterized quantum circuit to obtain a better result.
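A small helper can turn the measured bit string into asset names and check the budget condition in one step. The function `decode_selection` below is a hypothetical convenience written for this illustration, not a Paddle Quantum API; it assumes the leftmost bit corresponds to the first stock in the list.

```python
def decode_selection(bitstring, names, budget):
    """Map a measured bit string to the selected asset names and check the budget."""
    if len(bitstring) != len(names):
        raise ValueError("bit string length must match the number of assets")
    selected = [name for bit, name in zip(bitstring, names) if bit == "1"]
    return selected, len(selected) == budget


# Stock names follow the same pattern used when the data was initialized
names = ["STOCK%s" % i for i in range(7)]
selected, budget_ok = decode_selection("0100110", names, budget=3)
print(selected, budget_ok)
```

Since the names are numbered from zero, STOCK1, STOCK4, and STOCK5 are the second, fifth, and sixth stocks.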
Conclusion
In this tutorial, the optimal solution to the portfolio optimization is approximated through the Variational Quantum Eigensolver (VQE) based on the mean-variance approach. Given the budget, available assets, and investment risks, the parameterized quantum circuits are applied to find the optimal portfolio by calculating the returns of investment projects and the covariance matrix between the returns of each investment project.
References
[1] Orus, Roman, Samuel Mugel, and Enrique Lizaso. "Quantum computing for finance: Overview and prospects." Reviews in Physics 4 (2019): 100028.
[2] Egger, Daniel J., et al. "Quantum computing for Finance: state of the art and future prospects." IEEE Transactions on Quantum Engineering (2020).
[3] Markowitz, H.M. (March 1952). "Portfolio Selection". The Journal of Finance. 7 (1): 77–91. doi:10.2307/2975974. JSTOR 2975974.