Merge pull request #21 from albarji/feauture/multiresolution

Gatys multiresolution and automatic tile adjustment based on GPU model
albarji · Dec 8, 2017 · f5ee2d5
2 parents a78399a + 427ebdd
commit f5ee2d5

Showing 9 changed files with 188 additions and 40 deletions.
diff --git a/Dockerfile b/Dockerfile
@@ -29,6 +29,13 @@ RUN curl -o Miniconda3-latest-Linux-x86_64.sh https://repo.continuum.io/minicond
   && chmod +x Miniconda3-latest-Linux-x86_64.sh \
   && ./Miniconda3-latest-Linux-x86_64.sh -b -p "${MINICONDA_HOME}" \
 && rm Miniconda3-latest-Linux-x86_64.sh
+COPY conda.txt conda.txt
+RUN conda install -y --file=conda.txt
+RUN conda clean -y -i -l -p -t && \
+    rm -f conda.txt
+COPY pip.txt pip.txt
+RUN pip install -r pip.txt && \
+    rm -f pip.txt
 
 # Clone neural-style app
 WORKDIR /app
@@ -58,9 +65,11 @@ RUN ln -s /app/neural-style/models /app/style-swap/models
 # Add precomputed inverse network model
 ADD models/dec-tconv-sigmoid.t7 /app/style-swap/models/dec-tconv-sigmoid.t7
 
-# Copy wrapper scripts
+# Copy wrapper scripts and config files
 COPY ["entrypoint.py" ,"/app/entrypoint/"]
 COPY ["/neuralstyle/*.py", "/app/entrypoint/neuralstyle/"]
+COPY ["gpuconfig.json", "/app/entrypoint/"]
 
+WORKDIR /app/entrypoint
 ENTRYPOINT ["python", "/app/entrypoint/entrypoint.py"]
 
diff --git a/README.md b/README.md
@@ -22,7 +22,7 @@ A dockerized version of neural style transfer algorithms.
 
 * [docker](https://www.docker.com/)
 * [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)
-* Appropriate nvidia drivers for your GPU
+* Appropriate [nvidia drivers](http://www.nvidia.es/Download/index.aspx) for your GPU
 
 ### Installation
 
@@ -79,14 +79,16 @@ Better results can be attained by modifying some of the transfer parameters.
 The --alg parameter allows changing the neural style transfer algorithm to use.
 
 * **gatys**: highly detailed transfer, slow processing times (default)
+* **gatys-multiresolution**: multipass version of Gatys method, provides even better quality, but is also much slower
 * **chen-schmidt**: fast patch-based style transfer
 * **chen-schmidt-inverse**: even faster approximation to chen-schmidt through the use of an inverse network
 
 The following example illustrates the kind of results to be expected from these different algorithms
 
 | Content image | Algorithm | Style image |
 | ------------- | --------- | ----------- |
-| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) | 
+| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) |
+| ![Content](./doc/avila-walls.jpg) | Gatys Multiresolution ![Gatys-Multiresolution](./doc/avila-walls_broca_gatys-multiresolution_ss1.0_sw3.0.jpg) | ![Style](./doc/broca.jpg) |
 | ![Content](./doc/avila-walls.jpg) | Chen-Schmidt ![Chen-Schmidt](./doc/avila-walls_broca_chen-schmidt_ss1.0.jpg) | ![Style](./doc/broca.jpg) | 
 | ![Content](./doc/avila-walls.jpg) | Chen-Schmidt Inverse ![Chen-Schmidt Inverse](./doc/avila-walls_broca_chen-schmidt-inverse_ss1.0.jpg) | ![Style](./doc/broca.jpg) | 
 
@@ -102,28 +104,37 @@ of the target image, the height being scaled accordingly to keep proportion.
 
 If the image to be generated is large, a tiling strategy will be used, applying the neural style transfer method
 to small tiles of the image and stitching them together. Tiles overlap to provide some guarantees on overall
-consistency.
+consistency, though results might vary depending on the algorithm used.
 
 ![Tiling](./doc/tiling.png)
 
-You can control the size of these tiles through the --tilesize parameter.
-Higher values will generally produce better quality results and faster rendering times, but they will also incur in
-larger memory consumption.
-Note also that since the full style image is applied to each tile, as a result the style features will appear
+The size of these tiles is defined through the configuration file **gpuconfig.json** inside the container.
+This file contains dictionary keys for different GPU models and each neural style algorithm. Your GPU will be 
+automatically checked against the registered configurations and the appropriate tile size will be selected. These values
+have been chosen to maximize the use of the available GPU memory, assuming the whole GPU is available for the style
+transfer task.
+
+If your GPU is not included in the configuration file, the *default* values will be used instead, though to obtain
+better performance you might want to edit this file and rebuild the docker images.
+
+Note also that since the full style image is applied to each tile separately, the style features will appear
 smaller in the rendered image.
 
 #### Style weight
 
-Gatys algorithm allows to adjust the amount of style imposed over the content image, by means of the --sw parameter.
-By default a value of **5** is used, meaning the importance of the style is 5 times the importance of the content.
-Smaller weight values result in the transfer of colors, while higher values transfer textures and even objects of the 
-style.
+The Gatys and Gatys Multiresolution algorithms allow adjusting the amount of style imposed over the content image by means
+of the --sw parameter. By default a value of **5** is used, meaning the importance of the style is 5 times the
+importance of the content. Smaller weight values result in the transfer of colors, while higher values transfer textures
+and even objects of the style.
 
 If several weight values are provided, all combinations will be generated. For instance, to generate the same
 style transfer with three different weights, use
 
     nvidia-docker run --rm -v $(pwd):/images albarji/neural-style --content contents/docker.png --style styles/vangogh.png --sw 5 10 20
-
+
+Note also that the Gatys Multiresolution algorithm tends to produce a stronger style imprint, so you might want
+to use weight values smaller than the default (e.g. 3).
+
 #### Style scale
 
 If the transferred style results in too large or too small features, the scaling can be modified through the --ss 
@@ -145,5 +156,7 @@ logo example above the transparent background is not transformed.
 * [Gatys et al method](https://arxiv.org/abs/1508.06576), [implementation by jcjohnson](https://github.com/jcjohnson/neural-style)
 * [Chen-Schmidt method](https://arxiv.org/pdf/1612.04337.pdf), [implementation](https://github.com/rtqichen/style-swap)
 * [A review on style transfer methods](https://arxiv.org/pdf/1705.04058.pdf)
+* [Controlling Perceptual Factors in Neural Style Transfer](https://arxiv.org/abs/1611.07865)
 * [Neural-tiling method](https://github.com/ProGamerGov/Neural-Tile)
+* [Multiresolution strategy](https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db)
 * [The Wikipedia logo](https://en.wikipedia.org/wiki/Wikipedia_logo)
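
The tiling behaviour described in the README changes above can be sketched in a few lines. This is a minimal, standalone illustration, assuming a square tile budget compared by pixel count, in the spirit of `fitsingletile` and `tilegeometry` from this commit. The names `fits_single_tile` and `tile_geometry` and the example numbers are hypothetical, and the tile side is passed in by hand rather than looked up from the GPU.

```python
from math import ceil


def fits_single_tile(imshape, maxtilesize):
    """True if the image has no more pixels than a square tile of side maxtilesize."""
    width, height = imshape
    return maxtilesize * maxtilesize >= width * height


def tile_geometry(imshape, maxtilesize, overlap=100):
    """Number of tiles along X and Y needed to cover the image with overlapping tiles."""
    stride = maxtilesize - overlap  # fresh pixels contributed by each extra tile
    xtiles = ceil((imshape[0] - maxtilesize) / stride + 1)
    ytiles = ceil((imshape[1] - maxtilesize) / stride + 1)
    return xtiles, ytiles


# Example: a 3000x2000 target with a tile side of 1300 pixels (assumed values)
print(fits_single_tile((3000, 2000), 1300))
print(tile_geometry((3000, 2000), 1300, overlap=100))
```

Because the single-tile check compares pixel counts, a non-square image such as 1600x1000 would still pass for a tile side of 1300, even though one of its sides is longer than the tile side.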
diff --git a/conda.txt b/conda.txt
@@ -0,0 +1 @@
+numpy
diff --git a/doc/avila-walls_broca_gatys-multiresolution_ss1.0_sw3.0.jpg b/doc/avila-walls_broca_gatys-multiresolution_ss1.0_sw3.0.jpg
diff --git a/entrypoint.py b/entrypoint.py
@@ -19,11 +19,11 @@
     --ss STYLE_SCALE (default 1.0): scaling or list of scaling factors for the style images
     --alg ALGORITHM: style-transfer algorithm to use. Must be one of the following:
         gatys                   Highly detailed transfer, slow processing times (default)
+        gatys-multiresolution   Multipass version of Gatys method, provides even better quality
         chen-schmidt            Fast patch-based style transfer
         chen-schmidt-inverse    Even faster approximation to chen-schmidt through the use of an inverse network
-    --tilesize TILE_SIZE: maximum size of each tile in the style transfer.
-        If your GPU runs out of memory you should try reducing this value. Default: 400
-    --tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. Default: 100
+    --tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. If you experience
+        artifacts in the image you should try increasing this. Default: 100
 
     Additionally provided parameters are carried on to the underlying algorithm.
     
@@ -42,7 +42,6 @@ def main(argv=None):
         alg = "gatys"
         weights = None
         stylescales = None
-        tilesize = None
         tileoverlap = None
         otherparams = []
 
@@ -72,9 +71,6 @@ def main(argv=None):
             elif argv[i] == "--ss":
                 stylescales = [float(x) for x in sublist(argv[i+1:], stopper="-")]
                 i += len(stylescales) + 1
-            elif argv[i] == "--tilesize":
-                tilesize = int(argv[i+1])
-                i += 2
             elif argv[i] == "--tileoverlap":
                 tileoverlap = int(argv[i+1])
                 i += 2
@@ -100,10 +96,8 @@ def main(argv=None):
         LOGGER.info("\tStyle weights = %s" % str(weights))
         LOGGER.info("\tStyle scales = %s" % str(stylescales))
         LOGGER.info("\tSize = %s" % str(size))
-        LOGGER.info("\tTile size = %s" % str(tilesize))
         LOGGER.info("\tTile overlap = %s" % str(tileoverlap))
-        styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tilesize, tileoverlap,
-                      algparams=otherparams)
+        styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tileoverlap, algparams=otherparams)
         return 1
 
     except Exception:

diff --git a/gpuconfig.json b/gpuconfig.json
@@ -0,0 +1,26 @@
+{
+  "GeForce GTX 970M": {
+    "gatys": 512,
+    "gatys-multiresolution": 750,
+    "chen-schmidt": 750,
+    "chen-schmidt-inverse": 400
+  },
+  "Tesla K80": {
+    "gatys": 1300,
+    "gatys-multiresolution": 1300,
+    "chen-schmidt": 1500,
+    "chen-schmidt-inverse": 800
+  },
+  "Tesla P100-PCIE-16GB": {
+    "gatys": 1300,
+    "gatys-multiresolution": 1300,
+    "chen-schmidt": 2048,
+    "chen-schmidt-inverse": 900
+  },
+  "default": {
+    "gatys": 512,
+    "gatys-multiresolution": 750,
+    "chen-schmidt": 750,
+    "chen-schmidt-inverse": 400
+  }
+}
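
As a rough illustration of how such a file could be consumed, the sketch below looks up a tile size for a GPU name and algorithm, falling back to the `default` entry for unknown models. The inline `CONFIG` dictionary copies two entries from the file so the example is self-contained. `lookup_tile_size` is a hypothetical helper name, and the GPU name is supplied by hand rather than detected.

```python
import json

# Two entries copied from gpuconfig.json, inlined so the sketch is self-contained
CONFIG = json.loads("""
{
  "Tesla K80": {"gatys": 1300, "gatys-multiresolution": 1300,
                "chen-schmidt": 1500, "chen-schmidt-inverse": 800},
  "default": {"gatys": 512, "gatys-multiresolution": 750,
              "chen-schmidt": 750, "chen-schmidt-inverse": 400}
}
""")


def lookup_tile_size(config, gpu_name, alg):
    """Return the tile size for gpu_name and alg, using "default" for unknown GPUs."""
    entry = config.get(gpu_name, config["default"])
    return entry[alg]


print(lookup_tile_size(CONFIG, "Tesla K80", "chen-schmidt"))
print(lookup_tile_size(CONFIG, "Some Other GPU", "gatys"))
```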
diff --git a/neuralstyle/algorithms.py b/neuralstyle/algorithms.py
@@ -5,6 +5,9 @@
 from shutil import copyfile
 import logging
 from math import ceil
+import numpy as np
+import json
+import GPUtil
 from neuralstyle.utils import filename, fileext
 from neuralstyle.imagemagick import (convert, resize, shape, assertshape, choptiles, feather, smush, composite,
                                      extractalpha, mergealpha)
@@ -29,6 +32,7 @@
             "-num_iterations", "500"
         ]
     },
+    "gatys-multiresolution": {},
     "chen-schmidt": {
         "folder": "/app/style-swap",
         "command": "th style-swap.lua",
@@ -46,16 +50,20 @@
     }
 }
 
+# Load file with GPU configuration
+with open("gpuconfig.json", "r") as f:
+    GPUCONFIG = json.load(f)
+
 
 def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=None, stylescales=None,
-                  maxtilesize=400, tileoverlap=100, algparams=None):
+                  tileoverlap=100, algparams=None):
     """General style transfer routine over multiple sets of options"""
     # Check arguments
     if alg not in ALGORITHMS.keys():
         raise ValueError("Unrecognized algorithm %s, must be one of %s" % (alg, str(list(ALGORITHMS.keys()))))
 
     # Plug default options
-    if alg != "gatys":
+    if alg != "gatys" and alg != "gatys-multiresolution":
         if weights is not None:
             LOGGER.warning("Only gatys algorithm accepts style weights. Ignoring style weight parameters")
         weights = [None]
@@ -64,8 +72,6 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
             weights = [5.0]
     if stylescales is None:
         stylescales = [1.0]
-    if maxtilesize is None:
-        maxtilesize = 400
     if tileoverlap is None:
         tileoverlap = 100
     if algparams is None:
@@ -75,13 +81,13 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
     for content, style, weight, scale in product(contents, styles, weights, stylescales):
         outfile = outname(savefolder, content, style, alg, scale, weight)
         # If the desired size is smaller than the maximum tile size, use a direct neural style
-        if fitsingletile(targetshape(content, size), maxtilesize):
+        if fitsingletile(targetshape(content, size), alg):
             styletransfer_single(content=content, style=style, outfile=outfile, size=size, alg=alg, weight=weight,
                                  stylescale=scale, algparams=algparams)
         # Else use a tiling strategy
         else:
-            neuraltile(content=content, style=style, outfile=outfile, size=size, maxtilesize=maxtilesize,
-                       overlap=tileoverlap, alg=alg, weight=weight, stylescale=scale, algparams=algparams)
+            neuraltile(content=content, style=style, outfile=outfile, size=size, overlap=tileoverlap, alg=alg,
+                       weight=weight, stylescale=scale, algparams=algparams)
 
 
 def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight=5.0, stylescale=1.0, algparams=None):
@@ -101,6 +107,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
     algfile = workdir.name + "/" + "algoutput.png"
     if alg == "gatys":
         gatys(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
+    elif alg == "gatys-multiresolution":
+        gatys_multiresolution(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
     elif alg in ["chen-schmidt", "chen-schmidt-inverse"]:
         chenschmidt(alg, rgbfile, stylepng, algfile, size, stylescale, algparams)
     # Enforce correct size
@@ -111,8 +119,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
     mergealpha(algfile, alphafile, outfile)
 
 
-def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100, alg="gatys", weight=5.0,
-               stylescale=1.0, algparams=None):
+def neuraltile(content, style, outfile, size=None, overlap=100, alg="gatys", weight=5.0, stylescale=1.0,
+               algparams=None):
     """Strategy to generate a high resolution image by running style transfer on overlapping image tiles"""
     LOGGER.info("Starting tiling strategy")
     if algparams is None:
@@ -123,7 +131,7 @@ def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100,
     fullshape = targetshape(content, size)
 
     # Compute number of tiles required to map all the image
-    xtiles, ytiles = tilegeometry(fullshape, maxtilesize, overlap)
+    xtiles, ytiles = tilegeometry(fullshape, alg, overlap)
 
     # First scale image to target resolution
     firstpass = workdir.name + "/" + "lowres.png"
@@ -187,6 +195,69 @@ def gatys(content, style, outfile, size, weight, stylescale, algparams):
     tmpout.close()
 
 
+def gatys_multiresolution(content, style, outfile, size, weight, stylescale, algparams, startres=256):
+    """Runs a multiresolution version of Gatys et al method
+
+    The multiresolution strategy starts by generating a small image, then using that image as initializer
+    for higher resolution images. This procedure is repeated up to the tilesize.
+
+    Once the maximum tile size attainable by L-BFGS is reached, more iterations are run by using Adam. This allows
+    producing larger images with this method than with the basic Gatys method.
+
+    References:
+        * Gatys et al - Controlling Perceptual Factors in Neural Style Transfer (https://arxiv.org/abs/1611.07865)
+        * https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db
+    """
+    # Multiresolution strategy: list of rounds, each round composed of an optimization method and a number of
+    # upresolution steps.
+    # Using "adam" as optimizer means that Adam will be used when necessary to attain higher resolutions
+    strategy = [
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7],
+        ["lbfgs", 7]
+    ]
+    LOGGER.info("Starting gatys-multiresolution with strategy " + str(strategy))
+
+    # Initialization
+    workdir = TemporaryDirectory()
+    maxres = targetshape(content, size)[0]
+    if maxres < startres:
+        LOGGER.warning("Target resolution (%d) might be too small for the multiresolution method to work well" % maxres)
+        startres = maxres / 2.0
+    seed = None
+    tmpout = workdir.name + "/tmpout.png"
+
+    # Iterate over rounds
+    for roundnumber, (optimizer, steps) in enumerate(strategy):
+        LOGGER.info("gatys-multiresolution round %d with %s optimizer and %d steps" % (roundnumber, optimizer, steps))
+        roundmax = min(maxtile("gatys"), maxres) if optimizer == "lbfgs" else maxres
+        resolutions = np.linspace(startres, roundmax, steps, dtype=int)
+        iters = 1000
+        for stepnumber, res in enumerate(resolutions):
+            stepopt = "adam" if res > maxtile("gatys") else "lbfgs"
+            LOGGER.info("Step %d, resolution %d, optimizer %s" % (stepnumber, res, stepopt))
+            passparams = algparams[:]
+            passparams.extend([
+                "-num_iterations", iters,
+                "-tv_weight", "0",
+                "-print_iter", "0",
+                "-optimizer", stepopt
+            ])
+            if seed is not None:
+                passparams.extend([
+                    "-init", "image",
+                    "-init_image", seed
+                ])
+            gatys(content, style, tmpout, res, weight, stylescale, passparams)
+            seed = workdir.name + "/seed.png"
+            copyfile(tmpout, seed)
+            iters = max(iters/2.0, 100)
+
+    convert(tmpout, outfile)
+
+
 def chenschmidt(alg, content, style, outfile, size, stylescale, algparams):
     """Runs Chen and Schmidt fast style-transfer algorithm
 
@@ -250,16 +321,20 @@ def correctshape(result, original, size=None):
     assertshape(result, targetshape(original, size))
 
 
-def tilegeometry(imshape, maxtilesize=400, overlap=50):
+def tilegeometry(imshape, alg, overlap=50):
     """Given the shape of an image, computes the number of X and Y tiles to cover it"""
+    maxtilesize = maxtile(alg)
     xtiles = ceil(float(imshape[0] - maxtilesize) / float(maxtilesize - overlap) + 1)
     ytiles = ceil(float(imshape[1] - maxtilesize) / float(maxtilesize - overlap) + 1)
     return xtiles, ytiles
 
 
-def fitsingletile(imshape, maxtilesize):
-    """Returns whether a given image shape will fit in a single tile or not"""
-    return all([x <= maxtilesize for x in imshape])
+def fitsingletile(imshape, alg):
+    """Returns whether a given image shape will fit in a single tile or not.
+
+    This depends on the algorithm used and the GPU available in the system"""
+    mx = maxtile(alg)
+    return mx*mx >= np.prod(imshape)
 
 
 def targetshape(content, size=None):
@@ -272,3 +347,24 @@ def targetshape(content, size=None):
         return contentshape
     else:
         return [size, int(size * contentshape[1] / contentshape[0])]
+
+
+def gpuname():
+    """Returns the model name of the first available GPU"""
+    gpus = GPUtil.getGPUs()
+    if len(gpus) == 0:
+        raise ValueError("No GPUs detected in the system")
+    return gpus[0].name
+
+
+def maxtile(alg="gatys"):
+    """Returns the recommended configuration maximum tile size, based on the available GPU and algorithm to be run
+
+    The size returned should be understood as the maximum tile size for a square tile. If non-square tiles are used,
+    a maximum tile of the same number of pixels should be used.
+    """
+    gname = gpuname()
+    if gname not in GPUCONFIG:
+        LOGGER.warning("Unknown GPU model %s, will use default tiling parameters" % gname)
+        gname = "default"
+    return GPUCONFIG[gname][alg]
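
To make the schedule inside `gatys_multiresolution` easier to picture, here is a small sketch of a single round: evenly spaced resolutions from a start value up to a maximum, with the iteration count halved at each step down to a floor of 100. It roughly mirrors the `np.linspace(..., dtype=int)` call in plain Python so it runs without dependencies, uses integer halving where the commit uses float division, and does not run any style transfer. `round_schedule` is a hypothetical name and the example values are assumptions chosen for illustration.

```python
def round_schedule(startres, roundmax, steps, first_iters=1000, min_iters=100):
    """List (resolution, iterations) pairs for one multiresolution round."""
    schedule = []
    iters = first_iters
    for i in range(steps):
        if steps > 1:
            # Evenly spaced resolutions, truncated to integers
            res = int(startres + i * (roundmax - startres) / (steps - 1))
        else:
            res = int(startres)
        schedule.append((res, iters))
        # Halve the iteration budget, never going below the floor
        iters = max(iters // 2, min_iters)
    return schedule


# Example: from 256 pixels up to a tile side of 1300 in 7 steps
for res, iters in round_schedule(256, 1300, 7):
    print(res, iters)
```

In the commit each step's output is reused as the initial image for the next, higher resolution step, which is why later steps can get away with fewer iterations.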