Tutorials
=====

.. container:: cell code

   .. code:: python

      import jax.numpy as jnp
      import numpy as np
      import matplotlib.pyplot as plt
      import scienceplots
      plt.style.use(['science', 'no-latex', 'notebook', 'grid'])

.. container:: cell markdown

   .. rubric:: Welcome to CARPoolGP!
      :name: welcome-to-carpoolgp

.. container:: cell markdown

   This is an emulator developed in **Zooming by in the CARPoolGP lane:
   new CAMELS-TNG simulations of zoomed-in massive halos**. The goal of
   CARPoolGP is to emulate a value of parameter space where we are
   limited to the number of samples we can obtain at each parameter
   space location. There is a general structure to CARPoolGP which I
   will outline below, we can use this as a guide throughout this
   notebook.

.. container:: cell markdown

   #. Generate a set of parameters, :math:`B`, at which to extract Base
      sample quantities :math:`\tilde{Q}`.
   #. Generate a set :math:`S` of Parameter Islands, :math:`\theta_{S}`
      at which to extract Surrogate sample quantities,
      :math:`\tilde{Q}^S`. Ensure that Base-Surrogate pairs have some
      level of correlation between them.
   #. Determine the noise kernels and associated hyperparameters for
      Base, Surrogate, and Combined quantities: :math:`C_{ij}`,
      :math:`D_{ij}`, :math:`X_{ij}`.
   #. Maximize the likelihood function to obtain the optimal set of
      hyperparameters: :math:`\hat{\tau}`.
   #. Emulate to find $ Q(\\theta_{p, i}^\\ast)$, and
      :math:`\sigma^2(\theta_{p, i}^\ast, \theta_{p,j}^\ast)`.

.. container:: cell markdown

   .. rubric:: A Toy model to get started
      :name: a-toy-model-to-get-started

.. container:: cell markdown

   Before even starting with CARPoolGP we need to outline some of the
   preliminary information... like the model that we are going to use.
   In practice we are not lucky enough to know the model, otherwise that
   would defeat the whole purpose... But here as a toy example to get us
   started we can make this simple.

   We consider some mean quantity, $ Q$, which has a functional form,

   \\begin{equation} \\begin{split} Q(\\theta_{i}) &= a \\theta_{i} +
   b\\theta_{i}^3\\sin(\\theta_{i})\\ \\tilde{Q}(\\theta_{i}) &=
   Q(\\theta_{i}) + \\epsilon, \\end{split} \\end{equation}

   Where a and b are some constants, and :math:`\theta_i` is our
   independent parameter, and :math:`\epsilon` is gaussian noise where
   we take :math:`\epsilon` to have :math:`\langle\epsilon\rangle=0` and
   :math:`\langle\epsilon^2\rangle=\sigma_Q^2(\theta_{i})`. We choose
   the dimensionality of our parameter to be 1. We define this model
   below

.. container:: cell code

   .. code:: python

      def get_Q(theta, A=-0.02, B=0.19):
          """
          This is the model which we ultimitely want to predict, represents the 
          smooth variation on some function. 
          """
          return (A * theta  +  theta**3*B**4 * np.sin(theta))

.. container:: cell markdown

   We need to define some global variables that we will use through out
   the toy model as well.

.. container:: cell code

   .. code:: python

      sigma_Q  = 0.1       # jitter ontop of smooth variation
      N        =50         # Number of Base data samples
      Domain   = [-10,10]  # Domain of parameters
      LR       = 1e-2      # Optimization learning rate 
      ITERS    = 5000      # maximum no of iteration

.. container:: cell markdown

   Lets make a set of "True" values that we can test our emulator
   against in the future. This will also show us how the functoin moves
   through spae

.. container:: cell code

   .. code:: python

      # underlying truth: 
      plt.figure(figsize=(12, 6))
      theta=np.linspace(*np.array(Domain)+0.001, 1001, endpoint=True)
      Y = get_Q(theta) 
      plt.plot(theta, Y, 'k--', label='Truth')
      plt.xlabel(r'$\theta$')
      plt.ylabel(r'$Q$');

   .. container:: output display_data

      .. image:: 15cd34a90a316fc60a9e4c9c95eaff9d3d2ce59f.png

.. container:: cell markdown

   .. rubric:: 1. Generate a set of parameters, :math:`B`, at which to
      extract Base sample quantities :math:`\tilde{Q}`
      :name: 1-generate-a-set-of-parameters-b-at-which-to-extract-base-sample-quantities-tildeq

.. container:: cell markdown

   We build the Base quantities by sampling a uniformly random
   distribution within the domain :math:`[-10, 10]` to generate a set of
   parameters :math:`B\equiv\{{\bf \theta}_{i}| i=1, 2, ..., N\}`, and
   quantities, :math:`\tilde{Q}({\theta}_i)`. We choose :math:`N=50` and
   draw :math:`\epsilon_{Q,i}` from a normal distribution with
   :math:`\langle\epsilon_{Q}\rangle = 0` and
   :math:`\langle\epsilon_Q^2\rangle = \sigma_Q^2`.

.. container:: cell code

   .. code:: python

      def model_data(theta, seed=1993, noise=None):
          """
          Args:
              theta (array) : the set of parameters B
              noise (array)  : an array of noise values to add to the raw Q
              seed (int) : seed used to generate random numbers

          Returns:
              \tilde{Q}: noisey data
              noise : noise added to data. 
          """
          Q = get_Q(theta)
          if noise is not None:
              return Q + noise
          np.random.seed(seed)
          noise = np.random.normal(0, sigma_Q, len(theta))
          return Q+noise, noise

.. container:: cell code

   .. code:: python

      # Sample Data
      np.random.seed(194)
      theta_B = np.random.uniform(Domain[0], Domain[1], N)
      Q_B, intrinsic_noise = model_data(theta_B, 194, None)
      plt.plot(theta, Y, 'k--', label='Truth')
      plt.plot(theta_B, Q_B, '.', label='Base')
      plt.legend();

   .. container:: output display_data

      .. image:: 8de8b253c36b46a7cbcd0f63c1aed40716a39cfe.png

.. container:: cell markdown

   .. rubric:: 2. Generate a set :math:`S` of Parameter Islands,
      :math:`\theta_{S}` at which to extract Surrogate sample
      quantities, :math:`\tilde{Q}^S`. Ensure that Base-Surrogate pairs
      have some level of correlation between them.
      :name: 2-generate-a-set-s-of-parameter-islands-theta_s-at-which-to-extract-surrogate-sample-quantities-tildeqs-ensure-that-base-surrogate-pairs-have-some-level-of-correlation-between-them

.. container:: cell markdown

   We then generate parameter islands in the set
   :math:`S\equiv\{\theta_{i}| i=1, 2, ..., N_S\}` by linearly spacing
   :math:`N_S=5` points in the range :math:`[-8, 8]` with the same
   process as defined above. For each base sample, the island closest to
   the parameter is identified, and a surrogate sample is drawn at this
   island location, :math:`\theta_i`, to generate
   :math:`\tilde{Q}^S(\theta_i)` where the noise, :math:`\epsilon_s`, is
   perfectly correlated with the noise of the base simulation (i.e., the
   same amplitude of the noise is used
   :math:`\epsilon_{s, i} = \epsilon{i}`).

.. container:: cell code

   .. code:: python

      def match_surrogates(theta_Q, intrinsic_noise, Groups, f=0.0):
          """
          Generate the correlated surrogate samples given the set of groups, the set of base samples and the noise

          Args:
              theta_Q (array): base samples
              intrinsic_noise (array): noise associated with tilde{Q}
              Groups (int): number of groups
              f (float: _description_. Defaults to 0.

          Returns:
              theta_S, and S
          """
          Surrogate_locs = np.linspace(-8, 8, Groups, endpoint=True)
          nearest_island = np.zeros_like(theta_Q)
          for i, pi in enumerate(theta_Q):
              nearest_island[i] = Surrogate_locs[np.argmin((pi - Surrogate_locs)**2)]
          S_raw = get_Q(nearest_island)
          S = S_raw + intrinsic_noise*(1-f) + f*np.random.normal(0, sigma_Q, len(S_raw))
          return nearest_island, S

.. container:: cell code

   .. code:: python

      theta_S, Q_S = match_surrogates(theta_B, intrinsic_noise, Groups=5, f=0)

.. container:: cell code

   .. code:: python

      fig, axs = plt.subplots(ncols=1, sharex=True, sharey=True, figsize=(12,6), gridspec_kw={'wspace':0.02})
      axs.plot(theta_B, Q_B, 'o', color='r', markersize=5, label='Base Samples')
      axs.plot(theta_S, Q_S, 's', color='b', label='Surrogate Samples')
      axs.plot(theta, Y, 'k', label='True Variation')
      axs.set_title('CARPoolGP Sampling')
      axs.set_xlabel(r'$\theta$')
      axs.set_ylabel('$Q$')
      axs.legend(ncol=2, loc='upper center');

   .. container:: output display_data

      .. image:: bf6f18a661b92e39ec304e5e3bddc88c4a6dcfa8.png

.. container:: cell markdown

   .. rubric:: 3. Determine the noise kernels and associated
      hyperparameters for Base, Surrogate, and Combined quantities:
      :math:`C_{ij}`, :math:`D_{ij}`, :math:`X_{ij}`.
      :name: 3-determine-the-noise-kernels-and-associated-hyperparameters-for-base-surrogate-and-combined-quantities-c_ij-d_ij-x_ij

.. container:: cell markdown

   For both base and surrogate samples, we use a radial basis function
   defined in Eq.~\\ref{eq:rbf} to describe the smooth varying component
   of the covariance. Base and surrogates are drawn from the same
   underlying process and with the same level of sample variance, so the
   hyperparameters, :math:`\bm{\tau}`, are shared across both matrices.

   \\begin{equation} \\begin{split} V_{ij} = \\alpha\\exp\\left(-\\gamma
   , d_E(\\theta_{i} - \\theta_{j})^2\\right)\\ W_{ij} =
   \\alpha\\exp\\left(-\\gamma , d_E(\\theta_{i} -
   \\theta_{j})^2\\right). \\end{split} \\end{equation}

   The only difference between the two matrices is the parameters that
   are used to generate them. :math:`V`, uses the base samples, while
   :math:`W` uses the surrogate samples. The full covariance for the
   base samples and the surrogate samples can be written following
   covariance functions

   \\begin{equation} \\begin{split} C_{ij} &=
   \\alpha\\exp\\left(-\\gamma , d_E(\\theta_{i} -
   \\theta_{j})^2\\right) + \\sigma_Q^2\\mathcal{I}\\ D_{ij} &=
   \\alpha\\exp\\left(-\\gamma , d_E(\\theta_{i} -
   \\theta_{j})^2\\right) + \\sigma_Q^2\\mathcal{I}. \\end{split}
   \\end{equation}

   We choose the kernel that describes the smooth covariance between the
   base and surrogate samples to be an RBF, but we set the additional
   parameter, :math:`\Delta q_{BS} =0`, as the processes between the
   base and surrogates are the same. We use the same scale and amplitude
   parameters for the :math:`V_{ij}` and :math:`W_{ij}` matrices to
   define the covariance between base and surrogate samples,

   | \\begin{equation} Y_{ij} = \\alpha\\exp\\left(-\\gamma ,
     \\left(d_E(\\theta_{i}, \\theta_{j})^2\\right)\\right).\\
   | \\end{equation}

   To relate the base samples to the surrogates, we use the fact that we
   have set a perfect correlation between the sample fluctuations and,
   therefore, set the :math:`M` matrix to \\begin{equation} M_{ij} =
   \\sigma_Q^2\\delta_{ij}, \\end{equation} where the
   :math:`\delta_{ij}` is a delta function that is :math:`1` at
   locations of base-surrogate pairs, and :math:`0` elsewhere. Recall
   that the distance between parameter space locations in :math:`Y_{ij}`
   and :math:`M_{ij}` are evaluated between base and surrogate samples.
   Following Eq.~\\ref{eq:cov_X}, we then have \\begin{equation}
   \\begin{split} X_{ij} = &\\alpha\\exp\\left(-\\gamma ,
   \\left(d_E(\\theta_{i} - \\theta_{j})^2\\right)\\right) +
   \\sigma_Q^2.\\ \\end{split} \\end{equation} We can now build the
   block covariance matrix containing all of these components following
   Eq.~\\ref{eq:sigma} where :math:`\bm{\tau}` is the vector of
   hyperparameters, :math:`\bm{\tau}=(\alpha, \gamma, \sigma^2_Q)`

.. container:: cell markdown

   All of these kernels are taken care of internally by CARPoolGP. So
   what we have to do is define the simulations and surrogates which we
   can do as follows:

.. container:: cell code

   .. code:: python

      from src import CARPoolSimulations
      from src import CARPoolEmulator


      # Create simulation objects and surrogate objects
      sims = CARPoolSimulations.Simulation()
      surrs = CARPoolSimulations.Simulation()

      # Set the parameters and quantities for these simulations
      sims.parameters  = theta_B  ;  sims.quantities  = Q_B
      surrs.parameters = theta_S  ;  surrs.quantities = Q_S

.. container:: cell markdown

   .. rubric:: 4. Maximize the likelihood function (minimize inverse
      Wishart function) to obtain the optimal set of hyperparameters:
      :math:`\hat{\tau}`.
      :name: 4-maximize-the-likelihood-function-minimize-inverse-wishart-function-to-obtain-the-optimal-set-of-hyperparameters-hattau

.. container:: cell markdown

   We use the Gaussian likelihood function as defined below and choose
   uninformative priors for :math:`\mu_B` and :math:`\mu_S`, but allow
   them to be learned as additional hyperparameters in the regression.
   We then minimize the negative log of the likelihood function to
   obtain an optimal set of hyperparameters, :math:`\hat{\tau}` using
   Stochastic Gradient Descent (SGD).

   \\begin{equation} \\begin{split} \\mathcal{L}(\\tau) =
   &\\frac{1}{(2\\pi)^{N/2}} \|\\Sigma(\\tau)|^{-1/2},\\times
   \\exp\\left(-\\frac{1}{2}\\begin{pmatrix} \\tilde{Q}-\\mu_Q\\
   \\tilde{Q}^S-\\mu_S \\end{pmatrix}^T
   \\Sigma(\\tau)^{-1}\\begin{pmatrix} \\tilde{Q}-\\mu_Q\\
   \\tilde{Q}^S-\\mu_S \\end{pmatrix}\\right) \\end{split}
   \\end{equation}

.. container:: cell code

   .. code:: python

      #Build an emulator object (this generates the kernels which you can find in the CARPoolKernels file) 
      emu = CARPoolEmulator.Emulator(sims, surrs)

      params = {"log_scaleV":3.0, "log_ampV":0.1,
                "log_scaleM":1.0,  "log_jitterV":-1.0, "log_mean":0.0}

      # Train the emulator
      best_params = emu.train(params, learning_rate=0.01, max_iterations=ITERS)

.. container:: cell code

   .. code:: python

      plt.semilogx(np.diff(emu.losses))
      plt.xlabel('Iterations')
      plt.ylabel(r'$\Delta\mathcal{L}$');

   .. container:: output display_data

      .. image:: 7354b0b696bff5df5cf064e142a2bfa6c51acf0d.png

.. container:: cell markdown

   .. rubric:: 5. Emulate to find $ Q(\\theta_{p, i}')$, and
      :math:`\sigma^2(\theta_{p, i}', \theta_{p,j}')`.
      :name: 5-emulate-to-find--qtheta_p-i-and-sigma2theta_p-i-theta_pj

.. container:: cell markdown

   We now have all we need to perform an emulation at sample points from
   the set :math:`T` using:

.. container:: cell markdown

   \\begin{equation} \\begin{split} Q(\\theta_{p, i}') & =
   \\text{K}\ *s(\\hat{\\tau}), \\Sigma^{-1}*\ {ij}(\\hat{\\tau})
   \\begin{pmatrix} \\tilde{Q}-\\mu_Q\\ \\tilde{Q}^S-\\mu_S
   \\end{pmatrix} + \\begin{pmatrix} \\mu_Q\\ \\mu_S \\end{pmatrix}\\
   \\sigma^2(\\theta_{p, i}', \\theta_{p, j}') & =
   \\text{K}_{tt}(\\hat{\\tau}) -
   \\text{K}\ *t(\\hat{\\tau})\\Sigma*\ {ij}^{-1}(\\hat{\\tau})\\text{K}_t^T(\\hat{\\tau})
   \\end{split} \\end{equation}

.. container:: cell markdown

   With

   \\begin{equation} \\begin{split} \\text{K}\ *t(\\hat{\\tau}) &=
   \\Sigma(\\theta*\ {p,i}', \\theta_{p,j} ; \\hat{\\tau})\\
   \\text{K}\ *{tt}(\\hat{\\tau}) &= \\begin{pmatrix}
   V(\\theta*\ {p,i}', \\theta_{p,j}' ; \\hat{\\tau}) ,,,
   Y(\\theta_{p,i}', \\theta_{p,j}' ; \\hat{\\tau}) \\
   Y^T(\\theta_{p,i}', \\theta_{p,j}' ; \\hat{\\tau}) ,
   W(\\theta_{p,i}', \\theta_{p,j}' ; \\hat{\\tau}) \\end{pmatrix}\\
   \\end{split} \\end{equation}

.. container:: cell markdown

   All of this is taken care by the package

.. container:: cell code

   .. code:: python

      # now emulate! 
      pred_mean, pred_var = emu.predict(theta)

.. container:: cell code

   .. code:: python

      fig, axs = plt.subplots(nrows=2, sharex=True, figsize=(12, 8))
      std = np.sqrt(np.diag(pred_var))
      axs[0].fill_between(theta, pred_mean - 2*std, pred_mean+2*std, label='98% Confidence interval')
      axs[0].plot(theta, Y, 'k', label='True Evolution')

      axs[1].fill_between(theta, (pred_mean-Y) - 2*std, (pred_mean-Y)+2*std, label='98% Confidence interval')
      axs[1].plot(theta, (pred_mean - Y), 'k', label='Residual')
      axs[0].set_ylabel('Q')
      axs[1].set_xlabel(r'$\theta$')
      axs[1].set_ylabel(r'$Q_{\rm pred} - Q_{\rm True}$')
      axs[0].legend();

   .. container:: output display_data

      .. image:: ca612b0fda49340e82b8858935a30a0151ae95e2.png

.. container:: cell markdown

   .. rubric:: Active learning with CARPoolGP
      :name: active-learning-with-carpoolgp

.. container:: cell markdown

   We introduce an active learning method to predict the best next
   places to sample in parameter space. This is all taken care of in
   CARPoolGP!

.. container:: cell code

   .. code:: python

      from src import CARPoolEmulator

      # Generate an active learning model
      model = CARPoolEmulator.ActiveLearning(sims, surrs, theta, Domain[0], Domain[1])

      # Initialize the training
      best_params = model.train(params, learning_rate=LR, max_iterations=ITERS)

.. container:: cell code

   .. code:: python

      # Run an active learning step to find the next state (Ngrid is for 2**N)
      num_new = 10 # Number of new points to sample
      Ngrid   = 7  # The number of locations to test at in base 2, (eg, 2^7)
      next_thetas, next_surrogates = model.active_learning_step(num_new=10, Ngrid=7, normalize=False)

.. container:: cell code

   .. code:: python

      print('Next base samples:', [i[0] for i in next_thetas])
      print('Next surrogate samples:', [i[0] for i in next_surrogates])

   .. container:: output stream stdout

      ::

         Next base samples: [-5.848567672073841, 9.901670515537262, 2.133758831769228, -6.054342966526747, -2.0871826633810997, -5.965835005044937, 2.1017765067517757, -2.064566109329462, -6.006976924836636, 5.986908171325922]
         Next surrogate samples: [-4.0, 8.0, 4.0, -8.0, -4.0, -4.0, 4.0, -4.0, -8.0, 4.0]