Style transfer using only the colors from a specific part of an image

Problem description

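I have a neural style transfer model. I am currently trying to use different parts of an image to transfer different pictures. I would like to know how to make the model use only the colors that are present in the image. Here is an example: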

[Images: the thresholded style image and the original image]

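The images above are the style image I obtained by thresholding and the original image. The transferred picture currently looks like this: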

[Image: the transferred result]

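Clearly it transferred some of the black parts of the image, but I only want the non-black colors that are present to be transferred. Here is my model code: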

import torch
import torch.nn as nn
import torch.optim as optim
from PIL import Image
import torchvision.transforms as transforms
import torchvision.models as models
from torchvision.utils import save_image


class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()

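        # Indices of conv1_1, conv2_1, conv3_1, conv4_1 and conv5_1 in vgg19.features
        self.chosen_features = ["0", "5", "10", "19", "28"]

        # Layers after index 28 are never used, so keep only features[0..28]
        self.model = models.vgg19(pretrained=True).features[:29]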

    def forward(self, x):
        # Store relevant features
        features = []

        for layer_num, layer in enumerate(self.model):
            x = layer(x)

            if str(layer_num) in self.chosen_features:
                features.append(x)

        return features


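device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def load_image(image_name):
    # Force 3 channels so grayscale or RGBA files (e.g. a thresholded image) still fit VGG
    image = Image.open(image_name).convert("RGB")
    image = loader(image).unsqueeze(0)
    return image.to(device)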

imsize = 384

loader = transforms.Compose(
    [
        transforms.Resize((imsize, imsize)),
        transforms.ToTensor(),
        # transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)

original_img = load_image("Content Image.jpg")
style_img = load_image("Adaptive Image 2.jpg")

# initialized generated as white noise or clone of original image.
# Clone seemed to work better for me.

generated = original_img.clone().requires_grad_(True)
# generated = load_image("20epoctom.png")
model = VGG().to(device).eval()


# Hyperparameters
total_steps = 10000
learning_rate = 0.001
alpha = 1
beta = 0.01
optimizer = optim.Adam([generated], lr=learning_rate)

for step in range(total_steps):
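    # Obtain the convolution features in specifically chosen layers
    generated_features = model(generated)
    # The content and style images are fixed, so no gradients are needed for them
    with torch.no_grad():
        original_img_features = model(original_img)
        style_features = model(style_img)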

    # Loss is 0 initially
    style_loss = original_loss = 0

    # iterate through all the features for the chosen layers
    for gen_feature, orig_feature, style_feature in zip(
        generated_features, original_img_features, style_features
    ):

        # batch_size will just be 1
        batch_size, channel, height, width = gen_feature.shape
        original_loss += torch.mean((gen_feature - orig_feature) ** 2)
        # Compute Gram Matrix of generated
        G = gen_feature.view(channel, height * width).mm(
            gen_feature.view(channel, height * width).t()
        )
        # Compute Gram Matrix of Style
        A = style_feature.view(channel, height * width).mm(
            style_feature.view(channel, height * width).t()
        )
        style_loss += torch.mean((G - A) ** 2)

    total_loss = alpha * original_loss + beta * style_loss
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

    if step % 500 == 0:
        # .item() prints the plain loss value instead of the tensor repr
        print(total_loss.item())
        save_image(generated, f"Generated Pictures/{step//500} Iterations Generated Picture.png")

Any ideas on where to go from here would be greatly appreciated!

Tags: python, opencv, pytorch, image-thresholding

Solution


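If you want a way to preserve the non-black colors inside your style transfer model, I suggest looking at the GitHub repo here. It contains .ipynb notebooks with the entire training pipeline, model weights, a good README and more for reference. According to the README, it tries to implement this paper on preserving color in neural artistic style transfer, which should help you. You can also consult other repos and run some of them from the paper's code repo, although I do recommend looking at the first repo first.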
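For illustration, one idea associated with color-preserving style transfer is luminance-only transfer: the stylized result supplies the brightness and the content image supplies the color. The sketch below is not taken from the linked repo. It is a minimal post-processing variant that assumes both images are float RGB arrays of shape (H, W, 3) with values in [0, 1]; the function name `keep_content_colors` and the choice of the YIQ color space are assumptions made for this example.

```python
import numpy as np

# RGB -> YIQ matrix (NTSC coefficients). Rows produce Y (luminance), I and Q (color).
RGB_TO_YIQ = np.array([
    [0.299, 0.587, 0.114],
    [0.5959, -0.2746, -0.3213],
    [0.2115, -0.5227, 0.3112],
])
YIQ_TO_RGB = np.linalg.inv(RGB_TO_YIQ)


def keep_content_colors(stylized_rgb, content_rgb):
    """Return an image with the luminance of `stylized_rgb` and the
    chrominance of `content_rgb` (hypothetical helper for illustration).

    Both inputs are float arrays of shape (H, W, 3) in [0, 1] and must
    have the same height and width.
    """
    stylized_yiq = stylized_rgb @ RGB_TO_YIQ.T
    content_yiq = content_rgb @ RGB_TO_YIQ.T

    # Start from the content image's I and Q channels, then swap in
    # the Y channel of the stylized image.
    combined = content_yiq.copy()
    combined[..., 0] = stylized_yiq[..., 0]

    # Back to RGB; clip because the recombined values can leave [0, 1].
    return np.clip(combined @ YIQ_TO_RGB.T, 0.0, 1.0)
```

With the tensors from the question, the inputs could be obtained with something like `generated.detach().squeeze(0).permute(1, 2, 0).cpu().numpy()`, and likewise for `original_img`. Note that this only recombines channels after an ordinary transfer; it does not change what the optimization itself picks up from the style image.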

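If you would rather do the color transfer outside your style transfer model, that is, transfer colors between two images with the help of some library functions, then I suggest taking a look at this tutorial.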
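As a rough idea of what such a color transfer can look like, here is a minimal sketch that matches the per-channel mean and standard deviation of one image to another. It is not the code from the linked tutorial. It works directly in RGB for simplicity, whereas color-transfer write-ups often do the same statistics matching in the Lab color space. The function name `match_color_stats` and the `black_threshold` parameter are assumptions made for this example; the threshold is there so that near-black pixels of the source image are ignored when its statistics are computed.

```python
import numpy as np


def match_color_stats(target, source, black_threshold=0.05):
    """Shift and scale the colors of `target` so that their per-channel
    mean and standard deviation match the non-black pixels of `source`
    (hypothetical helper for illustration).

    Both inputs are float RGB arrays of shape (H, W, 3) in [0, 1].
    """
    # A source pixel counts as non-black if any channel exceeds the threshold.
    keep = source.max(axis=-1) > black_threshold
    src_pixels = source[keep]  # shape (N, 3)
    if src_pixels.size == 0:
        raise ValueError("source contains no pixels above black_threshold")

    src_mean = src_pixels.mean(axis=0)
    src_std = src_pixels.std(axis=0)

    tgt_pixels = target.reshape(-1, 3)
    tgt_mean = tgt_pixels.mean(axis=0)
    tgt_std = tgt_pixels.std(axis=0)

    # Guard against division by zero for perfectly flat channels.
    scale = src_std / np.maximum(tgt_std, 1e-8)
    out = (target - tgt_mean) * scale + src_mean
    return np.clip(out, 0.0, 1.0)
```

Here `target` would be the image whose colors should change and `source` the thresholded style image, so the black background left by thresholding does not drag the transferred colors toward black.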

Sarthak Jain

