Monday, November 25, 2019

Thoughts about how to keep the original colors of the content image when using Gatys' Neural Style

In this post, I'm going to present three ways to preserve the colors of the original content image when using Gatys' Neural Style algorithm.

Here are the content and style images:


Content image.


Style image.

Hmm, I didn't even realize there was a watermark on the style image. It should really have been cropped out, but it doesn't matter much here.

Experiment 1:

Let's use the content and style images as-is and ask Gatys' Neural Style (Justin Johnson's implementation on GitHub) to keep the original colors of the content image. Note that I am using a content weight of 1 and a style weight of 5. The rest is pretty standard.

Here are the Neural Style parameters used:

#!/bin/bash
th ../../neural_style.lua \
-style_image style_image.jpg \
-style_blend_weights nil \
-content_image content_image.jpg \
-image_size 512 \
-gpu -1 \
-content_weight 1. \
-style_weight 5. \
-tv_weight 1e-5 \
-num_iterations 1000 \
-normalize_gradients \
-init image \
-init_image content_image.jpg \
-optimizer lbfgs \
-learning_rate 1e1 \
-lbfgs_num_correction 0 \
-print_iter 50 \
-save_iter 100 \
-output_image new_content_image.jpg \
-style_scale 1.0 \
-original_colors 1 \
-pooling max \
-proto_file ../../models/VGG_ILSVRC_19_layers_deploy.prototxt \
-model_file ../../models/VGG_ILSVRC_19_layers.caffemodel \
-backend nn \
-seed -1 \
-content_layers relu4_2 \
-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1


Resulting image (512 pixels).

Yeah, it's not that great. I think we can do better.
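For reference: as far as I can tell from Johnson's code, the -original_colors 1 option is a simple post-processing step. It converts both images to YUV and combines the luminance of the stylized result with the chrominance of the original content image. Here is a rough Python equivalent of that idea, using Pillow's YCbCr space as a stand-in for the YUV conversion (the input filenames are the ones from the script above; the output name is made up):

from PIL import Image

# Stylized output and original content image (names from the script above).
stylized = Image.open("new_content_image.jpg").convert("YCbCr")
content = Image.open("content_image.jpg").convert("YCbCr")
content = content.resize(stylized.size)

# Keep the luminance (Y) of the stylized output, and take the
# chrominance (Cb, Cr) channels from the original content image.
y, _, _ = stylized.split()
_, cb, cr = content.split()
Image.merge("YCbCr", (y, cb, cr)).convert("RGB").save("recolored_image.jpg")  # hypothetical output name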

Experiment 2:

Let's color-match the style image to the content image and ask Gatys' Neural Style not to use the original colors. Note that I am using a content weight of 1 and a style weight of 5. The rest is pretty standard.

To color-match the style image to the content image, I use my own software called thecolormatcher. It's very basic stuff, roughly along the lines of the sketch below.
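Since thecolormatcher isn't public, here is a minimal Python stand-in that shifts and scales each RGB channel of the style image so its mean and standard deviation match the content image's. This is just an illustration, not the actual tool; the filenames match the script below:

import numpy as np
from PIL import Image

style = np.asarray(Image.open("style_image.jpg"), dtype=np.float64)
content = np.asarray(Image.open("content_image.jpg"), dtype=np.float64)

# Shift/scale each RGB channel of the style image so that its mean and
# standard deviation match those of the content image.
matched = (style - style.mean(axis=(0, 1))) / (style.std(axis=(0, 1)) + 1e-8)
matched = matched * content.std(axis=(0, 1)) + content.mean(axis=(0, 1))

Image.fromarray(np.clip(matched, 0, 255).astype(np.uint8)).save("new_style_image.jpg")

Fancier approaches match the full color covariance or the color histograms, but per-channel statistics are often good enough.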


New style image (color matched to the content image).

Here are the Neural Style parameters used:

#!/bin/bash
th ../../neural_style.lua \
-style_image new_style_image.jpg \
-style_blend_weights nil \
-content_image content_image.jpg \
-image_size 512 \
-gpu -1 \
-content_weight 1. \
-style_weight 5. \
-tv_weight 1e-5 \
-num_iterations 1000 \
-normalize_gradients \
-init image \
-init_image content_image.jpg \
-optimizer lbfgs \
-learning_rate 1e1 \
-lbfgs_num_correction 0 \
-print_iter 50 \
-save_iter 100 \
-output_image new_content_image.jpg \
-style_scale 1.0 \
-original_colors 0 \
-pooling max \
-proto_file ../../models/VGG_ILSVRC_19_layers_deploy.prototxt \
-model_file ../../models/VGG_ILSVRC_19_layers.caffemodel \
-backend nn \
-seed -1 \
-content_layers relu4_2 \
-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1


Resulting image (512 pixels).


Resulting image (768 pixels).

Let's raise the style weight from 5 to 20 to get more of the style image's texture into the resulting image.

Here are the Neural Style parameters used:

#!/bin/bash
th ../../neural_style.lua \
-style_image new_style_image.jpg \
-style_blend_weights nil \
-content_image content_image.jpg \
-image_size 512 \
-gpu -1 \
-content_weight 1. \
-style_weight 20. \
-tv_weight 1e-5 \
-num_iterations 1000 \
-normalize_gradients \
-init image \
-init_image content_image.jpg \
-optimizer lbfgs \
-learning_rate 1e1 \
-lbfgs_num_correction 0 \
-print_iter 50 \
-save_iter 100 \
-output_image new_content_image.jpg \
-style_scale 1.0 \
-original_colors 0 \
-pooling max \
-proto_file ../../models/VGG_ILSVRC_19_layers_deploy.prototxt \
-model_file ../../models/VGG_ILSVRC_19_layers.caffemodel \
-backend nn \
-seed -1 \
-content_layers relu4_2 \
-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1


Resulting image (512 pixels).


Resulting image (768 pixels).

Yeah, this method is pretty good at keeping the original colors, but some colors are lost, like the red of the lips.

Experiment 3:

Let's create luminance images for both the content image and the style image and ask Gatys' Neural Style not to use the original colors. Note that I am using a content weight of 1 and a style weight of 5. The rest is pretty standard.

To extract the luminance of the content and style images, I use my own software called theluminanceextracter. To inject the resulting luminance back into the content image, I use another of my own tools called theluminanceswapper. It's very basic stuff, roughly along the lines of the sketch below.
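Neither tool is public, but both operations are easy to reproduce with Pillow's YCbCr space. Here is a minimal stand-in (the filenames match the scripts below; the final output name is made up, and the resize is an assumption about how the differing sizes are reconciled):

from PIL import Image

def extract_luminance(in_path, out_path):
    # Keep only the Y (luminance) channel, saved as a grayscale image.
    y, _, _ = Image.open(in_path).convert("YCbCr").split()
    y.save(out_path)

def swap_luminance(color_path, luminance_path, out_path):
    # Replace the luminance of the color image with the stylized luminance,
    # keeping the original chrominance (Cb, Cr) channels.
    _, cb, cr = Image.open(color_path).convert("YCbCr").split()
    y = Image.open(luminance_path).convert("L").resize(cb.size)
    Image.merge("YCbCr", (y, cb, cr)).convert("RGB").save(out_path)

extract_luminance("content_image.jpg", "content_luminance_image.jpg")
extract_luminance("style_image.jpg", "style_luminance_image.jpg")
# ...run Neural Style on the luminance images, then:
swap_luminance("content_image.jpg", "new_content_luminance_image.jpg",
               "final_image.jpg")  # hypothetical output name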


Content luminance image.


Style luminance image.

Here are the Neural Style parameters used:

#!/bin/bash
th ../../neural_style.lua \
-style_image style_luminance_image.jpg \
-style_blend_weights nil \
-content_image content_luminance_image.jpg \
-image_size 512 \
-gpu -1 \
-content_weight 1. \
-style_weight 5. \
-tv_weight 1e-5 \
-num_iterations 1000 \
-normalize_gradients \
-init image \
-init_image content_luminance_image.jpg \
-optimizer lbfgs \
-learning_rate 1e1 \
-lbfgs_num_correction 0 \
-print_iter 50 \
-save_iter 100 \
-output_image new_content_luminance_image.jpg \
-style_scale 1.0 \
-original_colors 0 \
-pooling max \
-proto_file ../../models/VGG_ILSVRC_19_layers_deploy.prototxt \
-model_file ../../models/VGG_ILSVRC_19_layers.caffemodel \
-backend nn \
-seed -1 \
-content_layers relu4_2 \
-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1


Resulting luminance image (512 pixels).


Resulting image (512 pixels).


Resulting luminance image (768 pixels).


Resulting image (768 pixels).

Let's raise the style weight from 5 to 20 to get more of the style image's texture into the resulting image.

Here are the Neural Style parameters used:

#!/bin/bash
th ../../neural_style.lua \
-style_image style_luminance_image.jpg \
-style_blend_weights nil \
-content_image content_luminance_image.jpg \
-image_size 512 \
-gpu -1 \
-content_weight 1. \
-style_weight 20. \
-tv_weight 1e-5 \
-num_iterations 1000 \
-normalize_gradients \
-init image \
-init_image content_luminance_image.jpg \
-optimizer lbfgs \
-learning_rate 1e1 \
-lbfgs_num_correction 0 \
-print_iter 50 \
-save_iter 100 \
-output_image new_content_luminance_image.jpg \
-style_scale 1.0 \
-original_colors 0 \
-pooling max \
-proto_file ../../models/VGG_ILSVRC_19_layers_deploy.prototxt \
-model_file ../../models/VGG_ILSVRC_19_layers.caffemodel \
-backend nn \
-seed -1 \
-content_layers relu4_2 \
-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1


Resulting luminance image (512 pixels).


Resulting image (512 pixels).


Resulting luminance image (768 pixels).


Resulting image (768 pixels).

Yeah, this is probably the best method for preserving the original colors of the image. Method 2 is pretty good as well, and probably easier to use since you don't have to bother with extracting and swapping the luminance images.

If you are interested in understanding how Justin Johnson's Torch implementation of Gatys' Neural Style on GitHub works, I wrote a post about it: Justin Johnson's Neural Style Torch Implementation Explained.
