# How it works

"Modern Evolution Strategies for Creativity" {cite}`tian2021` uses Policy Gradients with Parameter-based Exploration (PGPE) {cite}`sehnke2010` with the ClipUp extension {cite}`toklu2020`. It helps to understand how PGPE works before moving on to the shape-fitting algorithm.

:::{note}
The examples in this section use the same [Mona Lisa](https://github.com/google/brain-tokyo-workshop/blob/67c73645121d599f855714788db8a5c44e329c29/es-clip/assets/monalisa.png) image as {cite}`tian2021`.
:::

## PGPE

The PGPE algorithm can be summarized as follows:

1. Select a starting solution $\vec{x}$.
2. For $n$ iterations:
   1. Sample $k$ solutions from a normal distribution $N(\vec{x}, \Sigma)$.
   2. Compute the relative fitness $f(\vec{x}_k)$ of each sample $\vec{x}_k$.
   3. Compute an update $\Delta \vec{x}$ so that $f(\vec{x} + \Delta \vec{x}) > f(\vec{x})$.
   4. Let the next iteration solution $\vec{x}' = \vec{x} + \Delta \vec{x}$.

The specific details on how to implement PGPE, along with how to interpret the various parameters, can be found in the ClipUp paper {cite}`toklu2020`. The [What is PGPE?](https://github.com/nnaisense/pgpelib/blob/d00907fe4b237e875ac0cc5b9d9c22591e4a7206/README.md#what-is-pgpe) section of the [`pgpelib`](https://github.com/nnaisense/pgpelib) README has a good example of how the algorithm works.

:::{seealso}
See the [](api/pgpe.rst) in the API documentation for how 'abstractions' implements the optimizer.
:::

## Fitting shapes to images

The algorithm is initialized by randomly generating $S$ shapes that cover the frame, all with random colours and dimensions. The PGPE optimizer then generates sets of $k$ candidate solutions, each of which is rendered by [Blend2D](https://blend2d.com/), a general-purpose 2D vector graphics engine. The negative $L_2$-norm of the difference between the image rendered from $\vec{x}_k$ and the original is then used as the fitness (it is negated because PGPE searches for a maximum).
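
As a rough illustration of the fitness measure and the "sample, score, update" loop summarized in the PGPE section, here is a minimal NumPy sketch. It is not the 'abstractions' or `pgpelib` implementation: the names `fitness` and `search_step` are hypothetical, the sampling distribution is assumed to be isotropic with a fixed standard deviation, and the update is a plain fitness-weighted average of the perturbations rather than the PGPE/ClipUp update. It only estimates an improving direction and does not guarantee that every step increases the fitness.

```python
import numpy as np


def fitness(rendered: np.ndarray, original: np.ndarray) -> float:
    """Negative L2 norm of the pixel difference; values closer to 0 are better."""
    diff = rendered.astype(np.float64) - original.astype(np.float64)
    return -float(np.linalg.norm(diff))


def search_step(x, sigma, k, evaluate, rng, step_size=1.0):
    """One simplified 'sample, score, update' iteration (not full PGPE/ClipUp)."""
    # Step 2.1: draw k perturbations from an isotropic normal distribution
    # (assumption: fixed sigma instead of an adapted covariance).
    noise = rng.normal(0.0, sigma, size=(k, x.size))
    # Step 2.2: score each candidate, then standardize to get a relative fitness.
    scores = np.array([evaluate(x + eps) for eps in noise])
    spread = scores.std()
    relative = (scores - scores.mean()) / spread if spread > 0 else np.zeros(k)
    # Step 2.3: use the fitness-weighted average of the perturbations as the update.
    delta = step_size * (relative @ noise) / k
    # Step 2.4: the solution for the next iteration.
    return x + delta


# Toy problem: the "renderer" is just a reshape, so the vector is the image itself.
rng = np.random.default_rng(0)
target = rng.random((4, 4))
render = lambda v: v.reshape(target.shape)

x = np.zeros(target.size)
start = fitness(render(x), target)
for _ in range(200):
    x = search_step(x, 0.1, 32, lambda v: fitness(render(v), target), rng)
end = fitness(render(x), target)
```

In the real pipeline the `render` step would rasterize the shapes described by the solution vector. Here it is replaced by a reshape so the sketch stays self-contained.
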
The example below shows what this looks like. The optimizer will often converge to a "good enough" solution very quickly and spend the bulk of its time "fine-tuning".

:::{table} Generating an abstraction over 3000 iterations

| $n=1$ | $n=500$ | $n=3000$ | Original |
|-------|---------|----------|----------|
| ![](examples/monalisa-00000.png) | ![](examples/monalisa-00500.png) | ![](examples/monalisa-02999.png) | ![](examples/original/monalisa.png) |
:::

```{admonition} See the full optimization
:class: dropdown

:::{video} examples/monalisa-a1.mp4
:align: center
:width: 400
:::
```

### Other shape types

A key contribution by {cite:p}`tian2021` is recognizing that a set of shapes can be described as a fixed-size vector. The format for a single shape is

:::{math}
\vec{x} = \begin{bmatrix}
    \vec{s} & \vec{c} \\
\end{bmatrix},
:::

where $\vec{s}$ is the set of shape parameters and

:::{math}
\vec{c} = \begin{bmatrix}
    r & g & b & a
\end{bmatrix}
:::

is the shape's RGBA colour vector. For example, in {cite}`tian2021` (and in 'abstractions') triangles are represented by a triplet of 2D coordinates

:::{math}
\vec{s} = \begin{bmatrix}
    x_1 & y_1 & x_2 & y_2 & x_3 & y_3
\end{bmatrix}
:::

representing the vertices of a single triangle.

Since Blend2D is a general-purpose vector engine, it can quickly render a wide variety of shapes, not just triangles. 'abstractions' takes advantage of this to support circles and rectangles in addition to triangles.

:::{table} Generating an abstraction with different shapes. All other parameters are the same.

| Rectangles | Circles |
|------------|---------|
| ![](examples/monalisa-rects.png) | ![](examples/monalisa-circs.png) |
:::

```{admonition} See what the optimization looks like for the two examples
:class: dropdown

:::{video} examples/monalisa-rects.mp4
:align: center
:width: 400
:::

:::{video} examples/monalisa-circs.mp4
:align: center
:width: 400
:::
```

Shapes can also be mixed together.
This can produce different effects because it allows the optimizer to place shapes that better match different regions of the image.

:::{figure} examples/monalisa-circ-tri.png
An abstraction constructed using both circles and triangles.
:::

```{admonition} See the "circles + triangles" optimization
:class: dropdown

:::{video} examples/monalisa-circ-tri.mp4
:align: center
:width: 400
:::
```

## Rendering

The renderer converts a solution vector $\vec{x}$ into a set of Blend2D operations. Blend2D uses floating-point colour values, so the colour components $\vec{c}$ are always clamped to $[0, 1]$ prior to rendering.

The shape vector uses scale-free coordinate axes so that the top-left corner is $(0,0)$ and the bottom-right corner is $(1,1)$. These are converted into image coordinates in a two-step process. First, all coordinates are rescaled so that the minimum of a coordinate column is $0$ and the maximum is $1$. Then they are transformed by

:::{math}
\mathbf{T} = 1.2
\begin{bmatrix}
    W-1 & 0 & 0 \\
    0 & H - 1 & 0 \\
    0 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
    1 & 0 & -0.1 \\
    0 & 1 & -0.1 \\
    0 & 0 & 1
\end{bmatrix}
:::

where $W$ and $H$ are the image width and height in pixels. The scaling by $1.2$ and shift by $-0.1$ allow shapes to be *slightly* larger than the image.

### Alpha scaling

'abstractions' supports applying an alpha scale value $0 < \alpha_s \le 1$. This is applied to the alpha channel just before it is clamped, i.e.,

:::{math}
a' = \mathrm{clamp}(\alpha_s a, 0, 1)
:::

This makes it harder for shapes to become completely opaque, since the unscaled alpha $a$ has to reach $1/\alpha_s$ before the clamp saturates, and it has the effect of "softening" the image. The table below compares two values of $\alpha_s$ *for the same PRNG seed*.
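
The colour clamping and alpha scaling described in this section can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `prepare_colours` and the one-RGBA-row-per-shape layout are assumptions made for the example, not the 'abstractions' API.

```python
import numpy as np


def prepare_colours(colours: np.ndarray, alpha_scale: float = 1.0) -> np.ndarray:
    """Scale the alpha channel, then clamp every RGBA component to [0, 1]."""
    if not 0.0 < alpha_scale <= 1.0:
        raise ValueError("alpha_scale must be in (0, 1]")
    out = np.asarray(colours, dtype=np.float64).copy()
    out[:, 3] *= alpha_scale       # scale alpha before clamping
    return np.clip(out, 0.0, 1.0)  # clamp r, g, b and the scaled alpha


# Two hypothetical shapes with unclamped optimizer outputs (one row per shape).
raw = np.array([[0.2, 1.4, -0.3, 0.8],
                [0.5, 0.5, 0.5, 12.0]])
soft = prepare_colours(raw, alpha_scale=0.1)
```

With `alpha_scale=0.1` the first shape's alpha drops from 0.8 to roughly 0.08, while the second shape's very large raw alpha still saturates at 1 after clamping.
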

:::{table}

| $\alpha_s = 0.1$ | $\alpha_s = 1.0$ |
|------------------|------------------|
| ![](examples/monalisa-a01.png) | ![](examples/monalisa-a1.png) |
:::

```{admonition} See the optimization when $\alpha_s = 0.1$
:class: dropdown

:::{video} examples/monalisa-a01.mp4
:align: center
:width: 400
:::
```

### Limitations

There are some limitations when using multiple shapes in a single abstraction:

* The shapes are always rendered in the following order: circles, then rectangles, and then triangles.
* There is no way to individually control the number of each shape type, e.g., 10 circles and 20 triangles. Instead, the same number is used for all shape types.