Merge pull request #21 from albarji/feauture/multiresolution
Gatys multiresolution and automatic tile adjustment based on GPU model
albarji authored Dec 8, 2017
2 parents a78399a + 427ebdd commit f5ee2d5
Showing 9 changed files with 188 additions and 40 deletions.
11 changes: 10 additions & 1 deletion Dockerfile
@@ -29,6 +29,13 @@ RUN curl -o Miniconda3-latest-Linux-x86_64.sh https://repo.continuum.io/minicond
&& chmod +x Miniconda3-latest-Linux-x86_64.sh \
&& ./Miniconda3-latest-Linux-x86_64.sh -b -p "${MINICONDA_HOME}" \
&& rm Miniconda3-latest-Linux-x86_64.sh
COPY conda.txt conda.txt
RUN conda install -y --file=conda.txt
RUN conda clean -y -i -l -p -t && \
rm -f conda.txt
COPY pip.txt pip.txt
RUN pip install -r pip.txt && \
rm -f pip.txt

# Clone neural-style app
WORKDIR /app
@@ -58,9 +65,11 @@ RUN ln -s /app/neural-style/models /app/style-swap/models
# Add precomputed inverse network model
ADD models/dec-tconv-sigmoid.t7 /app/style-swap/models/dec-tconv-sigmoid.t7

# Copy wrapper scripts
# Copy wrapper scripts and config files
COPY ["entrypoint.py" ,"/app/entrypoint/"]
COPY ["/neuralstyle/*.py", "/app/entrypoint/neuralstyle/"]
COPY ["gpuconfig.json", "/app/entrypoint/"]

WORKDIR /app/entrypoint
ENTRYPOINT ["python", "/app/entrypoint/entrypoint.py"]

37 changes: 25 additions & 12 deletions README.md
@@ -22,7 +22,7 @@ A dockerized version of neural style transfer algorithms.

* [docker](https://www.docker.com/)
* [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)
* Appropriate nvidia drivers for your GPU
* Appropriate [nvidia drivers](http://www.nvidia.es/Download/index.aspx) for your GPU

### Installation

@@ -79,14 +79,16 @@ Better results can be attained by modifying some of the transfer parameters.
The --alg parameter allows changing the neural style transfer algorithm to use.

* **gatys**: highly detailed transfer, slow processing times (default)
* **gatys-multiresolution**: multipass version of Gatys method, provides even better quality, but is also much slower
* **chen-schmidt**: fast patch-based style transfer
* **chen-schmidt-inverse**: even faster approximation to chen-schmidt through the use of an inverse network

The following example illustrates the kind of results to be expected from these different algorithms

| Content image | Algorithm | Style image |
| ------------- | --------- | ----------- |
| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) |
| ![Content](./doc/avila-walls.jpg) | Gatys ![Gatys](./doc/avila-walls_broca_gatys_ss1.0_sw10.0.jpg) | ![Style](./doc/broca.jpg) |
| ![Content](./doc/avila-walls.jpg) | Gatys Multiresolution ![Gatys-Multiresolution](./doc/avila-walls_broca_gatys-multiresolution_ss1.0_sw3.0.jpg) | ![Style](./doc/broca.jpg) |
| ![Content](./doc/avila-walls.jpg) | Chen-Schmidt ![Chen-Schmidt](./doc/avila-walls_broca_chen-schmidt_ss1.0.jpg) | ![Style](./doc/broca.jpg) |
| ![Content](./doc/avila-walls.jpg) | Chen-Schmidt Inverse ![Chen-Schmidt Inverse](./doc/avila-walls_broca_chen-schmidt-inverse_ss1.0.jpg) | ![Style](./doc/broca.jpg) |

@@ -102,28 +104,37 @@ of the target image, the height being scaled accordingly to keep proportion.

If the image to be generated is large, a tiling strategy will be used, applying the neural style transfer method
to small tiles of the image and stitching them together. Tiles overlap to provide some guarantees on overall
consistency.
consistency, though results might vary depending on the algorithm used.

![Tiling](./doc/tiling.png)

You can control the size of these tiles through the --tilesize parameter.
Higher values will generally produce better quality results and faster rendering times, but they will also incur in
larger memory consumption.
Note also that since the full style image is applied to each tile, as a result the style features will appear
The size of these tiles is defined through the configuration file **gpuconfig.json** inside the container.
This file contains dictionary keys for different GPU models and each neural style algorithm. Your GPU will be
automatically checked against the registered configurations and the appropriate tile size will be selected. These values
have been chosen to maximize the use of the available GPU memory, assuming the whole GPU is available for the style
transfer task.

If your GPU is not included in the configuration file, the *default* values will be used instead, though to obtain
better performance you might want to edit this file and rebuild the docker images.

Note also that since the full style image is applied to each tile separately, the style features will appear
as smaller in the rendered image.
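
As a rough illustration of the tiling arithmetic described above, the sketch below mirrors the formulas used by `tilegeometry` and `fitsingletile` in `neuralstyle/algorithms.py` from this commit. The function names `tile_counts` and `fits_single_tile` and the sample image sizes are hypothetical, chosen only for this example; the tile size 512 is the *default* value for gatys in gpuconfig.json and 100 is the default tile overlap.

```python
from math import ceil


def tile_counts(imshape, maxtilesize, overlap):
    """Number of tiles along X and Y needed to cover an image.

    Each new tile advances by (maxtilesize - overlap) pixels, so
    neighbouring tiles share `overlap` pixels.
    """
    stride = maxtilesize - overlap
    xtiles = ceil((imshape[0] - maxtilesize) / stride + 1)
    ytiles = ceil((imshape[1] - maxtilesize) / stride + 1)
    return xtiles, ytiles


def fits_single_tile(imshape, maxtilesize):
    """Area-based check: the image fits if it has no more pixels than a square tile."""
    return maxtilesize * maxtilesize >= imshape[0] * imshape[1]


# Hypothetical 2000x1500 target with the default gatys tile size
print(tile_counts((2000, 1500), maxtilesize=512, overlap=100))
# A 700x300 image is wider than 512 pixels but has fewer pixels than a 512x512 tile
print(fits_single_tile((700, 300), maxtilesize=512))
print(fits_single_tile((2000, 1500), maxtilesize=512))
```

Because the check compares pixel counts rather than each side, a wide but short image can still be processed in a single pass.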

#### Style weight

Gatys algorithm allows to adjust the amount of style imposed over the content image, by means of the --sw parameter.
By default a value of **5** is used, meaning the importance of the style is 5 times the importance of the content.
Smaller weight values result in the transfer of colors, while higher values transfer textures and even objects of the
style.
Gatys and Gatys Multiresolution algorithms allow adjusting the amount of style imposed on the content image, by means
of the --sw parameter. By default a value of **5** is used, meaning the importance of the style is 5 times the
importance of the content. Smaller weight values result in the transfer of colors, while higher values transfer textures
and even objects of the style.

If several weight values are provided, all combinations will be generated. For instance, to generate the same
style transfer with three different weights, use

nvidia-docker run --rm -v $(pwd):/images albarji/neural-style --content contents/docker.png --style styles/vangogh.png --sw 5 10 20


Note also that the Gatys Multiresolution algorithm tends to produce a stronger style imprint, so you might want
to use weight values smaller than the default (e.g. 3).

#### Style scale

If the transferred style results in too large or too small features, the scaling can be modified through the --ss
@@ -145,5 +156,7 @@ logo example above the transparent background is not transformed.
* [Gatys et al method](https://arxiv.org/abs/1508.06576), [implementation by jcjohnson](https://github.com/jcjohnson/neural-style)
* [Chen-Schmidt method](https://arxiv.org/pdf/1612.04337.pdf), [implementation](https://github.com/rtqichen/style-swap)
* [A review on style transfer methods](https://arxiv.org/pdf/1705.04058.pdf)
* [Controlling Perceptual Factors in Neural Style Transfer](https://arxiv.org/abs/1611.07865)
* [Neural-tiling method](https://github.com/ProGamerGov/Neural-Tile)
* [Multiresolution strategy](https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db)
* [The Wikipedia logo](https://en.wikipedia.org/wiki/Wikipedia_logo)
1 change: 1 addition & 0 deletions conda.txt
@@ -0,0 +1 @@
numpy
14 changes: 4 additions & 10 deletions entrypoint.py
@@ -19,11 +19,11 @@
--ss STYLE_SCALE (default 1.0): scaling or list of scaling factors for the style images
--alg ALGORITHM: style-transfer algorithm to use. Must be one of the following:
gatys Highly detailed transfer, slow processing times (default)
gatys-multiresolution Multipass version of Gatys method, provides even better quality
chen-schmidt Fast patch-based style transfer
chen-schmidt-inverse Even faster approximation to chen-schmidt through the use of an inverse network
--tilesize TILE_SIZE: maximum size of each tile in the style transfer.
If your GPU runs out of memory you should try reducing this value. Default: 400
--tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. Default: 100
--tileoverlap TILE_OVERLAP: overlap of tiles in the style transfer, measured in pixels. If you experience
artifacts in the image you should try increasing this. Default: 100
Additionally provided parameters are carried on to the underlying algorithm.
@@ -42,7 +42,6 @@ def main(argv=None):
alg = "gatys"
weights = None
stylescales = None
tilesize = None
tileoverlap = None
otherparams = []

@@ -72,9 +71,6 @@ def main(argv=None):
elif argv[i] == "--ss":
stylescales = [float(x) for x in sublist(argv[i+1:], stopper="-")]
i += len(stylescales) + 1
elif argv[i] == "--tilesize":
tilesize = int(argv[i+1])
i += 2
elif argv[i] == "--tileoverlap":
tileoverlap = int(argv[i+1])
i += 2
@@ -100,10 +96,8 @@ def main(argv=None):
LOGGER.info("\tStyle weights = %s" % str(weights))
LOGGER.info("\tStyle scales = %s" % str(stylescales))
LOGGER.info("\tSize = %s" % str(size))
LOGGER.info("\tTile size = %s" % str(tilesize))
LOGGER.info("\tTile overlap = %s" % str(tileoverlap))
styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tilesize, tileoverlap,
algparams=otherparams)
styletransfer(contents, styles, savefolder, size, alg, weights, stylescales, tileoverlap, algparams=otherparams)
return 1

except Exception:
26 changes: 26 additions & 0 deletions gpuconfig.json
@@ -0,0 +1,26 @@
{
"GeForce GTX 970M": {
"gatys": 512,
"gatys-multiresolution": 750,
"chen-schmidt": 750,
"chen-schmidt-inverse": 400
},
"Tesla K80": {
"gatys": 1300,
"gatys-multiresolution": 1300,
"chen-schmidt": 1500,
"chen-schmidt-inverse": 800
},
"Tesla P100-PCIE-16GB": {
"gatys": 1300,
"gatys-multiresolution": 1300,
"chen-schmidt": 2048,
"chen-schmidt-inverse": 900
},
"default": {
"gatys": 512,
"gatys-multiresolution": 750,
"chen-schmidt": 750,
"chen-schmidt-inverse": 400
}
}
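
A minimal sketch of how a configuration like the one above could be queried, falling back to the "default" entry when the GPU model is not listed. The helper name `lookup_tile_size` is hypothetical and only a subset of the file is reproduced inline; the commit itself loads the JSON from disk and detects the GPU name through GPUtil.

```python
# Illustrative subset of gpuconfig.json; values copied from the file in this commit.
GPUCONFIG = {
    "Tesla K80": {
        "gatys": 1300,
        "gatys-multiresolution": 1300,
        "chen-schmidt": 1500,
        "chen-schmidt-inverse": 800,
    },
    "default": {
        "gatys": 512,
        "gatys-multiresolution": 750,
        "chen-schmidt": 750,
        "chen-schmidt-inverse": 400,
    },
}


def lookup_tile_size(gpu_name, alg, config=GPUCONFIG):
    """Hypothetical helper: tile size for a GPU/algorithm pair.

    Unknown GPU models fall back to the "default" entry.
    """
    entry = config.get(gpu_name, config["default"])
    return entry[alg]


print(lookup_tile_size("Tesla K80", "chen-schmidt"))
print(lookup_tile_size("Some Unlisted GPU", "gatys"))
```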
124 changes: 110 additions & 14 deletions neuralstyle/algorithms.py
@@ -5,6 +5,9 @@
from shutil import copyfile
import logging
from math import ceil
import numpy as np
import json
import GPUtil
from neuralstyle.utils import filename, fileext
from neuralstyle.imagemagick import (convert, resize, shape, assertshape, choptiles, feather, smush, composite,
extractalpha, mergealpha)
@@ -29,6 +32,7 @@
"-num_iterations", "500"
]
},
"gatys-multiresolution": {},
"chen-schmidt": {
"folder": "/app/style-swap",
"command": "th style-swap.lua",
@@ -46,16 +50,20 @@
}
}

# Load file with GPU configuration
with open("gpuconfig.json", "r") as f:
GPUCONFIG = json.load(f)


def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=None, stylescales=None,
maxtilesize=400, tileoverlap=100, algparams=None):
tileoverlap=100, algparams=None):
"""General style transfer routine over multiple sets of options"""
# Check arguments
if alg not in ALGORITHMS.keys():
raise ValueError("Unrecognized algorithm %s, must be one of %s" % (alg, str(list(ALGORITHMS.keys()))))

# Plug default options
if alg != "gatys":
if alg != "gatys" and alg != "gatys-multiresolution":
if weights is not None:
LOGGER.warning("Only gatys and gatys-multiresolution algorithms accept style weights. Ignoring style weight parameters")
weights = [None]
@@ -64,8 +72,6 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
weights = [5.0]
if stylescales is None:
stylescales = [1.0]
if maxtilesize is None:
maxtilesize = 400
if tileoverlap is None:
tileoverlap = 100
if algparams is None:
@@ -75,13 +81,13 @@ def styletransfer(contents, styles, savefolder, size=None, alg="gatys", weights=
for content, style, weight, scale in product(contents, styles, weights, stylescales):
outfile = outname(savefolder, content, style, alg, scale, weight)
# If the desired size is smaller than the maximum tile size, use a direct neural style
if fitsingletile(targetshape(content, size), maxtilesize):
if fitsingletile(targetshape(content, size), alg):
styletransfer_single(content=content, style=style, outfile=outfile, size=size, alg=alg, weight=weight,
stylescale=scale, algparams=algparams)
# Else use a tiling strategy
else:
neuraltile(content=content, style=style, outfile=outfile, size=size, maxtilesize=maxtilesize,
overlap=tileoverlap, alg=alg, weight=weight, stylescale=scale, algparams=algparams)
neuraltile(content=content, style=style, outfile=outfile, size=size, overlap=tileoverlap, alg=alg,
weight=weight, stylescale=scale, algparams=algparams)


def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight=5.0, stylescale=1.0, algparams=None):
@@ -101,6 +107,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
algfile = workdir.name + "/" + "algoutput.png"
if alg == "gatys":
gatys(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
elif alg == "gatys-multiresolution":
gatys_multiresolution(rgbfile, stylepng, algfile, size, weight, stylescale, algparams)
elif alg in ["chen-schmidt", "chen-schmidt-inverse"]:
chenschmidt(alg, rgbfile, stylepng, algfile, size, stylescale, algparams)
# Enforce correct size
@@ -111,8 +119,8 @@ def styletransfer_single(content, style, outfile, size=None, alg="gatys", weight
mergealpha(algfile, alphafile, outfile)


def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100, alg="gatys", weight=5.0,
stylescale=1.0, algparams=None):
def neuraltile(content, style, outfile, size=None, overlap=100, alg="gatys", weight=5.0, stylescale=1.0,
algparams=None):
"""Strategy to generate a high resolution image by running style transfer on overlapping image tiles"""
LOGGER.info("Starting tiling strategy")
if algparams is None:
@@ -123,7 +131,7 @@ def neuraltile(content, style, outfile, size=None, maxtilesize=400, overlap=100,
fullshape = targetshape(content, size)

# Compute number of tiles required to map all the image
xtiles, ytiles = tilegeometry(fullshape, maxtilesize, overlap)
xtiles, ytiles = tilegeometry(fullshape, alg, overlap)

# First scale image to target resolution
firstpass = workdir.name + "/" + "lowres.png"
@@ -187,6 +195,69 @@ def gatys(content, style, outfile, size, weight, stylescale, algparams):
tmpout.close()


def gatys_multiresolution(content, style, outfile, size, weight, stylescale, algparams, startres=256):
"""Runs a multiresolution version of Gatys et al method
The multiresolution strategy starts by generating a small image, then using that image as initializer
for higher resolution images. This procedure is repeated up to the tilesize.
Once the maximum tile size attainable by L-BFGS is reached, more iterations are run by using Adam. This allows
producing larger images with this method than with basic Gatys.
References:
* Gatys et al - Controlling Perceptual Factors in Neural Style Transfer (https://arxiv.org/abs/1611.07865)
* https://gist.github.com/jcjohnson/ca1f29057a187bc7721a3a8c418cc7db
"""
# Multiresolution strategy: list of rounds, each round composed of an optimization method and a number of
# upresolution steps.
# Using "adam" as optimizer means that Adam will be used when necessary to attain higher resolutions
strategy = [
["lbfgs", 7],
["lbfgs", 7],
["lbfgs", 7],
["lbfgs", 7],
["lbfgs", 7]
]
LOGGER.info("Starting gatys-multiresolution with strategy " + str(strategy))

# Initialization
workdir = TemporaryDirectory()
maxres = targetshape(content, size)[0]
if maxres < startres:
LOGGER.warning("Target resolution (%d) might be too small for the multiresolution method to work well" % maxres)
startres = maxres / 2.0
seed = None
tmpout = workdir.name + "/tmpout.png"

# Iterate over rounds
for roundnumber, (optimizer, steps) in enumerate(strategy):
LOGGER.info("gatys-multiresolution round %d with %s optimizer and %d steps" % (roundnumber, optimizer, steps))
roundmax = min(maxtile("gatys"), maxres) if optimizer == "lbfgs" else maxres
resolutions = np.linspace(startres, roundmax, steps, dtype=int)
iters = 1000
for stepnumber, res in enumerate(resolutions):
stepopt = "adam" if res > maxtile("gatys") else "lbfgs"
LOGGER.info("Step %d, resolution %d, optimizer %s" % (stepnumber, res, stepopt))
passparams = algparams[:]
passparams.extend([
"-num_iterations", str(int(iters)),
"-tv_weight", "0",
"-print_iter", "0",
"-optimizer", stepopt
])
if seed is not None:
passparams.extend([
"-init", "image",
"-init_image", seed
])
gatys(content, style, tmpout, res, weight, stylescale, passparams)
seed = workdir.name + "/seed.png"
copyfile(tmpout, seed)
iters = max(iters/2.0, 100)

convert(tmpout, outfile)


def chenschmidt(alg, content, style, outfile, size, stylescale, algparams):
"""Runs Chen and Schmidt fast style-transfer algorithm
@@ -250,16 +321,20 @@ def correctshape(result, original, size=None):
assertshape(result, targetshape(original, size))


def tilegeometry(imshape, maxtilesize=400, overlap=50):
def tilegeometry(imshape, alg, overlap=50):
"""Given the shape of an image, computes the number of X and Y tiles to cover it"""
maxtilesize = maxtile(alg)
xtiles = ceil(float(imshape[0] - maxtilesize) / float(maxtilesize - overlap) + 1)
ytiles = ceil(float(imshape[1] - maxtilesize) / float(maxtilesize - overlap) + 1)
return xtiles, ytiles


def fitsingletile(imshape, maxtilesize):
"""Returns whether a given image shape will fit in a single tile or not"""
return all([x <= maxtilesize for x in imshape])
def fitsingletile(imshape, alg):
"""Returns whether a given image shape will fit in a single tile or not.
This depends on the algorithm used and the GPU available in the system"""
mx = maxtile(alg)
return mx*mx >= np.prod(imshape)


def targetshape(content, size=None):
@@ -272,3 +347,24 @@ def targetshape(content, size=None):
return contentshape
else:
return [size, int(size * contentshape[1] / contentshape[0])]


def gpuname():
"""Returns the model name of the first available GPU"""
gpus = GPUtil.getGPUs()
if len(gpus) == 0:
raise ValueError("No GPUs detected in the system")
return gpus[0].name


def maxtile(alg="gatys"):
"""Returns the recommended configuration maximum tile size, based on the available GPU and algorithm to be run
The size returned should be understood as the maximum tile size for a square tile. If non-square tiles are used,
a maximum tile of the same number of pixels should be used.
"""
gname = gpuname()
if gname not in GPUCONFIG:
LOGGER.warning("Unknown GPU model %s, will use default tiling parameters" % gname)
gname = "default"
return GPUCONFIG[gname][alg]
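
To make the schedule inside `gatys_multiresolution` easier to picture, here is a small sketch of the resolutions and iteration counts produced within one round. It is an approximation under stated assumptions: the helper names `resolution_schedule` and `iteration_schedule` are hypothetical, the resolutions are computed in plain Python instead of `np.linspace(..., dtype=int)`, and the iteration count is halved with integer division, whereas the commit uses `iters/2.0`. The inputs (start resolution 256, 7 steps, maximum of 1300 as listed for the Tesla K80 with gatys) come from the code and configuration in this commit.

```python
def resolution_schedule(startres, roundmax, steps):
    """Evenly spaced resolutions from startres to roundmax, truncated to integers."""
    span = roundmax - startres
    return [int(startres + i * span / (steps - 1)) for i in range(steps)]


def iteration_schedule(steps, first=1000, floor=100):
    """Iteration count per step: halved after every step, never below `floor`."""
    iters = first
    schedule = []
    for _ in range(steps):
        schedule.append(iters)
        iters = max(iters // 2, floor)  # integer variant of iters/2.0
    return schedule


# One round with the values used in this commit for a Tesla K80 running gatys
for res, iters in zip(resolution_schedule(256, 1300, 7), iteration_schedule(7)):
    print("resolution %d, iterations %d" % (res, iters))
```

Low resolutions get the most iterations, since they are cheap and set the overall composition; later steps only refine the seeded image.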
