{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Supervised Learning with a Parameterized Quantum Circuit\n", "\n", "Note: Recall that supervised learning, a branch of machine learning, aims to learn a function that maps an input to an output from example input-output pairs (data).\n", "\n", "\n", "In the course, you have seen the following diagram, which represents a typical quantum machine learning setup with classical data:\n", "\n", "
[Figure \"qml\": diagram of a typical quantum machine learning setup with classical data]\n", "\n", "Source: https://dkopczyk.quantee.co.uk/wp-content/uploads/2018/11/outline-768x346.png\n", "
\n", "\n", "The quantum circuit is defined by a \"data preparation\" quantum operator (i.e., the feature map), followed by an operator that depends on real parameters, which are tuned by classical optimization. We will walk through an example of this setup, showing and explaining each step of the process." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import minimize\n", "\n", "import cirq\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import mean_squared_error, accuracy_score\n", "\n", "%matplotlib notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## I. Get Data\n", "\n", "We will generate a random 2D binary classification dataset, labeling each point 1 if it lies inside a circle of a given radius and 0 otherwise, using the following functions." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Set a random seed\n", "np.random.seed(42)\n", "\n", "# Make a dataset of points inside and outside of a circle\n", "def circle(samples, center=(0.0, 0.0), radius=np.sqrt(2 / np.pi)):\n", "    \"\"\"\n", "    Generates a dataset of points labeled 1 inside a given radius and 0 outside.\n", "\n", "    Args:\n", "        samples (int): number of samples to generate\n", "        center (tuple): center of the circle\n", "        radius (float): radius of the circle\n", "\n", "    Returns:\n", "        Xvals (array[tuple]): coordinates of points\n", "        yvals (array[int]): classification labels\n", "    \"\"\"\n", "    Xvals, yvals = [], []\n", "\n", "    for i in range(samples):\n", "        x = 2 * (np.random.rand(2)) - 1\n", "        y = 0\n", "        if np.linalg.norm(x - center) < radius:\n", "            y = 1\n", "        Xvals.append(x)\n", "        yvals.append(y)\n", "    return np.array(Xvals), np.array(yvals)\n", "\n", "\n", "def plot_data(x, y, fig=None, ax=None):\n", "    \"\"\"\n", "    Plot data with red/blue values for a binary classification.\n", "\n", "    Args:\n", "        x (array): array of 
data points\n", "        y (array[int]): array of data point labels\n", "    \"\"\"\n", "    if fig is None:\n", "        fig, ax = plt.subplots(1, 1, figsize=(5, 5))\n", "    reds = y == 0\n", "    blues = y == 1\n", "    ax.scatter(x[reds, 0], x[reds, 1], c=\"red\", s=20, edgecolor=\"k\")\n", "    ax.scatter(x[blues, 0], x[blues, 1], c=\"blue\", s=20, edgecolor=\"k\")\n", "    ax.set_xlabel(\"$x_1$\")\n", "    ax.set_ylabel(\"$x_2$\")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "X, Y = circle(200)\n", "fig, ax = plt.subplots(1, 1, figsize=(4, 4))\n", "plot_data(X, Y, fig=fig, ax=ax)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## II. Dataset transformation \n", "\n", "Often in Machine Learning, we transform the data before feeding to a model. For instances, adding new features, or filling missing values. More on this topic can be found at https://scikit-learn.org/stable/data_transforms.html\n", "\n", "Here, we will add another feature as the product of the x1 and x2 features." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "data = np.empty((X.shape[0],X.shape[1]+1))\n", "data[:,:2] = X\n", "data[:,-1] = X[:,0]*X[:,1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## III. Split into train/validation\n", "\n", "In Machine Learning, the goal is to build a model that will generalize well. That means performance should be equivalent on unseen data. One way to train is to split into a training set and a validation set." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "data_train, data_test, Y_train, Y_test = train_test_split(data, Y, test_size=0.5, random_state=42, stratify=Y)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(1, 2, figsize=(10, 4))\n", "plot_data(data_train, Y_train, fig=fig, ax=ax[0])\n", "plot_data(data_test, Y_test, fig=fig, ax=ax[1])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## IV. Setting the model\n", "\n", "We will prepare the quantum circuit preparation mapping input-output pairs. \n", "\n", "### 1. Set a few hyperparameters \n", "\n", "We will work on 3 qubits and in a noiseless environment. \n", "n_layers is for the number of times we will apply the parameterized circuit $U(\\theta)$.\n", "If n_measurements is specified as 0, we will use the amplitudes of the final state as the output.\n", "Otherwise, you will use the results after measurements." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of parameters to optimize: 15\n" ] } ], "source": [ "n_qubits = 3\n", "n_layers = 5\n", "n_measurements = 0\n", "\n", "# Create a register of qubits\n", "qubits = cirq.LineQubit.range(n_qubits) \n", "\n", "# Initialize simulator\n", "simulator = cirq.Simulator()\n", "\n", "# initial parameters \n", "n_params = n_qubits * n_layers\n", "\n", "np.random.seed(42)\n", "theta0 = np.random.uniform(-2*np.pi, 2*np.pi, size=n_params)\n", "\n", "print(\"Number of parameters to optimize: \",n_params)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Create the function yielding the circuit\n", "\n", "We will set the data preparation circuit to be a layer of RZ-RY-RZ operations and input the data as angles of the layer.\n", "\n", "For the parameterized circuit, we will use a layer of RX rotation followed by an entangling layer of CZs. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First instance : \n", "[-0.25091976 0.90142861 -0.22618625]\n", "Divided by pi: [-0.07987024 0.28693364 -0.07199732] \n", "\n", "Initial params : \n", "[-1.57657536 5.66384302 2.91532185 1.23977908 -4.32259725 -4.32290035\n", " -5.55328511 4.60150516 1.27064871 2.61471713 -6.02451292 5.90506136\n", " 4.17759743 -3.61485335 -3.99830538]\n", "Divided by pi: [-0.50183952 1.80285723 0.92797577 0.39463394 -1.37592544 -1.37602192\n", " -1.76766555 1.46470458 0.40446005 0.83229031 -1.91766202 1.87963941\n", " 1.32977056 -1.15064356 -1.27270013] \n", "\n", "0 1 2\n", "│ │ │\n", "Rz(-0.08π) Rz(0.287π) Rz(-0.072π)\n", "│ │ │\n", "Ry(-0.08π) Ry(0.287π) Ry(-0.072π)\n", "│ │ │\n", "Rz(-0.08π) Rz(0.287π) Rz(-0.072π)\n", "│ │ │\n", "Rx(-0.502π) Rx(1.803π) Rx(0.928π)\n", "│ │ │\n", "@───────────@ │\n", "│ │ │\n", "Rx(-0.502π) @───────────@\n", "│ │ │\n", "│ Rx(0.928π) Rx(-1.376π)\n", "│ │ │\n", "@───────────@ │\n", "│ │ │\n", "Rx(-0.502π) @───────────@\n", "│ │ │\n", "│ Rx(0.395π) Rx(-1.768π)\n", "│ │ │\n", "@───────────@ │\n", "│ │ │\n", "Rx(-0.502π) @───────────@\n", "│ │ │\n", "│ Rx(-1.376π) Rx(0.404π)\n", "│ │ │\n", "@───────────@ │\n", "│ │ │\n", "Rx(-0.502π) @───────────@\n", "│ │ │\n", "│ Rx(-1.376π) Rx(-1.918π)\n", "│ │ │\n", "@───────────@ │\n", "│ │ │\n", "│ @───────────@\n", "│ │ │\n" ] } ], "source": [ "def data_preparation(x_i):\n", " \n", " # input x_i as angles of RY operations\n", " yield (cirq.ops.Rz(x_i[j]).on(qubits[j]) for j in range(len(x_i)))\n", " yield (cirq.ops.Ry(x_i[j]).on(qubits[j]) for j in range(len(x_i)))\n", " yield (cirq.ops.Rz(x_i[j]).on(qubits[j]) for j in range(len(x_i)))\n", "\n", "def pqc(params):\n", " \n", " for l in range(n_layers):\n", " yield (cirq.ops.Rx(params[(l+1)*j]).on(qubits[j]) for j in range(n_qubits))\n", " yield (cirq.ops.CZ(qubits[j],qubits[j+1]) for j in range(n_qubits-1))\n", " \n", " \n", 
"def qml_classifier_circuit(params, x_i):\n", " \n", " if n_measurements > 0:\n", " return cirq.Circuit(\n", " data_preparation(x_i),\n", " pqc(params),\n", " cirq.measure(*qubits, key='x'))\n", " else:\n", " return cirq.Circuit(\n", " data_preparation(x_i),\n", " pqc(params))\n", " \n", "initial_circuit = qml_classifier_circuit(theta0, data[0,:])\n", "print(\"First instance : \")\n", "print(data[0,:])\n", "print(\"Divided by pi: \",data[0,:] / np.pi,\"\\n\")\n", "\n", "print(\"Initial params : \")\n", "print(theta0)\n", "print(\"Divided by pi: \",theta0 / np.pi,\"\\n\")\n", "\n", "print(initial_circuit.to_text_diagram(transpose=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Specify the output for binary classification\n", "\n", "We will do many measurements when running the circuit. To associate a probability to predict 1, we will take the probability of outputing the bitstring with only 1s. That means, if the output of the parameterized circuit is a quantum state $\\psi_{x_i,\\theta}$, we have:\n", "$$ p(x_i = 1) = |\\langle 111 | \\psi_{x_i,\\theta} \\rangle|^2 $$\n", "\n", "which, when we do many measurements, is estimated by giving the frequence of the bitstring with all bits equal 1, divided by the number of measurements." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.0077321352" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def run_without_measurements(circuit):\n", "    results = simulator.simulate(circuit)\n", "    # NOTE: this is the magnitude of the |111> amplitude (the last entry of the state vector);\n", "    # squaring it would give the probability defined above\n", "    return abs(results.final_state[-1])\n", "\n", "def run_with_measurements(circuit):\n", "    results = simulator.run(circuit, repetitions=n_measurements)\n", "    counter_measurements = results.histogram(key='x')\n", "    probability_being_1 = 0.\n", "    # key 7 is the bitstring 111 read as an integer\n", "    if 7 in counter_measurements.keys():\n", "        probability_being_1 = float(counter_measurements[7]) / n_measurements\n", "    return probability_being_1\n", "\n", "def run_circuit(params, x_i):\n", "\n", "    circuit = qml_classifier_circuit(params,x_i)\n", "\n", "    if n_measurements > 0:\n", "        probability_being_1 = run_with_measurements(circuit)\n", "    else:\n", "        probability_being_1 = run_without_measurements(circuit)\n", "    return probability_being_1\n", "\n", "\n", "run_circuit(theta0, data[0,:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. Set the loss function to optimize\n", "\n", "Our circuit processes the data rows one at a time. After sweeping the entire dataset, we need a loss function that quantifies how well the model is learning. We will use the square loss as the cost passed to a classical optimizer."
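, "\n", "\n", "Written out, and assuming $N$ data rows with labels $y_i$ and circuit outputs $\\hat{p}_i$ for parameters $\\theta$ (notation introduced here for illustration), the square loss averaged over the dataset is the mean squared error:\n", "$$ L(\\theta) = \\frac{1}{N} \\sum_{i=1}^{N} \\left( y_i - \\hat{p}_i \\right)^2 $$\n", "\n", "As a small made-up example, labels $(1, 0)$ with outputs $(0.8, 0.3)$ give $L = \\frac{(1 - 0.8)^2 + (0 - 0.3)^2}{2} = \\frac{0.04 + 0.09}{2} = 0.065$."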
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def sweep_data(params,X):\n", "\n", "    # run the circuit once per data row and collect the outputs\n", "    probas = [run_circuit(params,X[i,:]) for i in range(X.shape[0])]\n", "    return probas\n", "\n", "def compute_loss(params, X, labels):\n", "    predictions = sweep_data(params,X)\n", "    return mean_squared_error(labels, predictions)\n", "\n", "# stores the cost at each evaluation of the cost function\n", "tracking_cost = []\n", "\n", "\n", "def cost_to_optimize(params):\n", "    cost = compute_loss(params, data_train, Y_train)\n", "    tracking_cost.append(cost)\n", "    return cost\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## V. Learning\n", "\n", "It is now time to optimize the parameters of the model for classification. We will use COBYLA, a simple derivative-free optimizer from SciPy." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "32.46814274787903\n" ] } ], "source": [ "from time import time\n", "start_time = time()\n", "final_params = minimize(cost_to_optimize,theta0,method=\"COBYLA\",options={\"maxiter\":80})\n", "end_time = time()\n", "# elapsed optimization time in seconds\n", "print(end_time-start_time)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " fun: 0.17432712074953746\n", " maxcv: 0.0\n", " message: 'Maximum number of function evaluations has been exceeded.'\n", " nfev: 80\n", " status: 2\n", " success: False\n", " x: array([-1.35375219, 6.52793509, 4.19245231, 1.28112633, -3.76285872,\n", " -4.38606526, -6.46510572, 4.54199861, 1.39333321, 2.69138756,\n", " -5.80072732, 5.84681918, 4.23858995, -3.5551252 , -3.98503786])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "final_params" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## VI. 
Performance\n", "\n", "Let us evaluate the accuracy by defining a predict function that converts probabilities into binary labels." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def predict(params,X):\n", "    probas = sweep_data(params,X)\n", "    # threshold the circuit outputs at 0.5 to obtain 0/1 labels\n", "    return np.array([1 if p > .5 else 0 for p in probas])" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(list(range(len(tracking_cost))), tracking_cost)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy on training set: 0.5\n", "Accuracy on validation set: 0.51\n" ] } ], "source": [ "# before training\n", "\n", "print(\"Accuracy on training set: \", accuracy_score(Y_train,predict(theta0,data_train)))\n", "print(\"Accuracy on validation set: \", accuracy_score(Y_test,predict(theta0,data_test)))" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy on training set: 0.94\n", "Accuracy on validation set: 0.92\n" ] } ], "source": [ "# after training\n", "\n", "print(\"Accuracy on training set: \", accuracy_score(Y_train,predict(final_params.x,data_train)))\n", "print(\"Accuracy on validation set: \", accuracy_score(Y_test,predict(final_params.x,data_test)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may have different results at each run. Normally, one should improve the results. There are many ways to do so (more optimization steps, better optimizer, different circuit architecture...)." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, axes = plt.subplots(1, 3, figsize=(10, 3))\n", "plot_data(data_test, predict(theta0,data_test), fig, axes[0])\n", "plot_data(data_test, predict(final_params.x,data_test), fig, axes[1])\n", "plot_data(data_test, Y_test, fig, axes[2])\n", "axes[0].set_title(\"Predictions with random weights\")\n", "axes[1].set_title(\"Predictions after training\")\n", "axes[2].set_title(\"True test data\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "References:\n", "\n", "https://pennylane.ai/qml/app/tutorial_data_reuploading_classifier.html\n", "\n", "https://cirq.readthedocs.io/en/stable/tutorial.html\n", "\n", "https://github.com/Qiskit/qiskit-aqua/blob/master/qiskit/aqua/components/variational_forms/ry.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }