Author | Li Qiujian
Editor | Liu Jing
Produced by | CSDN (ID: CSDNnews)
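As most of us know, Along the River During the Qingming Festival (清明上河图) is one of the representative works of Chinese painting and one of China's ten great paintings handed down through the ages. A genre painting of the Northern Song dynasty, it is the only surviving masterpiece by the Northern Song painter Zhang Zeduan. It is a national-treasure-level cultural relic and is now held in the Palace Museum in Beijing.
The painting is 24.8 cm wide and 528.7 cm long, in ink and color on silk. In handscroll form and composed with scattered-point perspective, it vividly records the appearance of the Northern Song capital Dongjing (also called Bianjing, today's Kaifeng in Henan) in twelfth-century China and the living conditions of people from every social class at the time. It bears witness to the prosperity of Bianjing and portrays the state of the urban economy in the Northern Song.
This is unique in the history of Chinese and even world painting. Within a scroll more than five meters long, it depicts a huge number of varied figures; livestock such as oxen, mules, and donkeys; carts, sedan chairs, and boats large and small; and houses, bridges, and gate towers, each with its own character, reflecting the features of Song-dynasty architecture. It has great historical and artistic value. Although its scenes are lively, what Along the River During the Qingming Festival portrays is not a flourishing market but a "portrait of danger in a golden age" with a sense of looming crisis: the soldiers are lax and the taxes are heavy.
Today's project adapts an existing algorithm to produce our own version of Along the River During the Qingming Festival.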
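Below we use the VGG19 model to generate the painting. The steps are as follows, and I have commented each piece of code to make it easy to follow.
First, import the required libraries. The code targets the TensorFlow 1.x API and an older SciPy release that still provides scipy.misc.imread, imresize, and imsave; these functions were removed from later SciPy versions.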
import tensorflow as tf
import numpy as np
import scipy.io
import scipy.misc
import os
import time
Next, define a few variables so they are easy to reference:
CONTENT_IMG = '1.png'
STYLE_IMG = 'sty.jpg'
OUTPUT_DIR = 'neural_style_transfer_tensorflow/'
Then create a directory for saving the images:
if not os.path.exists(OUTPUT_DIR):
    os.mkdir(OUTPUT_DIR)
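Define the width, height, and channel count of the generated image, along with the noise ratio and the loss weights: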
IMAGE_W = 400
IMAGE_H = 300
COLOR_C = 3
NOISE_RATIO = 0.7
BETA = 5
ALPHA = 100
Next, define the path to the model file:
VGG_MODEL = 'imagenet-vgg-verydeep-19.mat'
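Create the array of per-channel mean pixel values used in image preprocessing; it is subtracted from the pixel values when an image is loaded and added back when it is saved: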
MEAN_VALUES = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))
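To see why the array is reshaped to (1, 1, 1, 3), here is a small NumPy sketch. The tiny 2x2 image is made up purely for illustration. The reshaped array broadcasts against a batch of shape (1, H, W, 3), so each color channel has its own mean subtracted.

```python
import numpy as np

# Per-channel means, shaped to broadcast over (batch, height, width, channel).
MEAN_VALUES = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))

# Hypothetical 2x2 RGB image in which every pixel is (200, 100, 50).
image = np.tile(np.array([200.0, 100.0, 50.0]), (1, 2, 2, 1))

centered = image - MEAN_VALUES      # shape stays (1, 2, 2, 3)
restored = centered + MEAN_VALUES   # adding the means back recovers the pixels
```

The same subtraction and addition appear later in the article's load_image and save_image functions.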
Next, define the function that loads the model; I have annotated it below:
def load_vgg_model(path):
    '''
    Details of the VGG19 model:
    - 0 is conv1_1 (3, 3, 3, 64)
    - 1 is relu
    - 2 is conv1_2 (3, 3, 64, 64)
    - 3 is relu
    - 4 is maxpool
    - 5 is conv2_1 (3, 3, 64, 128)
    - 6 is relu
    - 7 is conv2_2 (3, 3, 128, 128)
    - 8 is relu
    - 9 is maxpool
    - 10 is conv3_1 (3, 3, 128, 256)
    - 11 is relu
    - 12 is conv3_2 (3, 3, 256, 256)
    - 13 is relu
    - 14 is conv3_3 (3, 3, 256, 256)
    - 15 is relu
    - 16 is conv3_4 (3, 3, 256, 256)
    - 17 is relu
    - 18 is maxpool
    - 19 is conv4_1 (3, 3, 256, 512)
    - 20 is relu
    - 21 is conv4_2 (3, 3, 512, 512)
    - 22 is relu
    - 23 is conv4_3 (3, 3, 512, 512)
    - 24 is relu
    - 25 is conv4_4 (3, 3, 512, 512)
    - 26 is relu
    - 27 is maxpool
    - 28 is conv5_1 (3, 3, 512, 512)
    - 29 is relu
    - 30 is conv5_2 (3, 3, 512, 512)
    - 31 is relu
    - 32 is conv5_3 (3, 3, 512, 512)
    - 33 is relu
    - 34 is conv5_4 (3, 3, 512, 512)
    - 35 is relu
    - 36 is maxpool
    - 37 is fullyconnected (7, 7, 512, 4096)
    - 38 is relu
    - 39 is fullyconnected (1, 1, 4096, 4096)
    - 40 is relu
    - 41 is fullyconnected (1, 1, 4096, 1000)
    - 42 is softmax
    '''
    vgg = scipy.io.loadmat(path)
    vgg_layers = vgg['layers']

    # Fetch the weights, biases, and name of one layer from the loaded VGG model
    def _weights(layer, expected_layer_name):
        W = vgg_layers[0][layer][0][0][2][0][0]
        b = vgg_layers[0][layer][0][0][2][0][1]
        layer_name = vgg_layers[0][layer][0][0][0][0]
        assert layer_name == expected_layer_name
        return W, b

    # Wrap the loaded weights as TF constants; returns the ReLU output of the convolution
    def _conv2d_relu(prev_layer, layer, layer_name):
        W, b = _weights(layer, layer_name)
        W = tf.constant(W)
        b = tf.constant(np.reshape(b, (b.size)))
        return tf.nn.relu(tf.nn.conv2d(prev_layer, filter=W, strides=[1, 1, 1, 1], padding='SAME') + b)

    # Pooling layer; note that average pooling is used where the original VGG19 uses max pooling
    def _avgpool(prev_layer):
        return tf.nn.avg_pool(prev_layer, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Store every layer's output tensor in a dict so it can be looked up by name
    graph = {}
    graph['input'] = tf.Variable(np.zeros((1, IMAGE_H, IMAGE_W, COLOR_C)), dtype='float32')
    # conv1_1 is layer 0 of the VGG model; its input is graph['input']
    graph['conv1_1'] = _conv2d_relu(graph['input'], 0, 'conv1_1')
    graph['conv1_2'] = _conv2d_relu(graph['conv1_1'], 2, 'conv1_2')
    graph['avgpool1'] = _avgpool(graph['conv1_2'])
    graph['conv2_1'] = _conv2d_relu(graph['avgpool1'], 5, 'conv2_1')
    graph['conv2_2'] = _conv2d_relu(graph['conv2_1'], 7, 'conv2_2')
    graph['avgpool2'] = _avgpool(graph['conv2_2'])
    graph['conv3_1'] = _conv2d_relu(graph['avgpool2'], 10, 'conv3_1')
    graph['conv3_2'] = _conv2d_relu(graph['conv3_1'], 12, 'conv3_2')
    graph['conv3_3'] = _conv2d_relu(graph['conv3_2'], 14, 'conv3_3')
    graph['conv3_4'] = _conv2d_relu(graph['conv3_3'], 16, 'conv3_4')
    graph['avgpool3'] = _avgpool(graph['conv3_4'])
    graph['conv4_1'] = _conv2d_relu(graph['avgpool3'], 19, 'conv4_1')
    graph['conv4_2'] = _conv2d_relu(graph['conv4_1'], 21, 'conv4_2')
    graph['conv4_3'] = _conv2d_relu(graph['conv4_2'], 23, 'conv4_3')
    graph['conv4_4'] = _conv2d_relu(graph['conv4_3'], 25, 'conv4_4')
    graph['avgpool4'] = _avgpool(graph['conv4_4'])
    graph['conv5_1'] = _conv2d_relu(graph['avgpool4'], 28, 'conv5_1')
    graph['conv5_2'] = _conv2d_relu(graph['conv5_1'], 30, 'conv5_2')
    graph['conv5_3'] = _conv2d_relu(graph['conv5_2'], 32, 'conv5_3')
    graph['conv5_4'] = _conv2d_relu(graph['conv5_3'], 34, 'conv5_4')
    graph['avgpool5'] = _avgpool(graph['conv5_4'])
    return graph
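As a rough check on the tensor shapes this graph produces, the sketch below computes the feature-map resolution seen by each convolutional block. It relies only on the facts that the convolutions use stride 1 with 'SAME' padding (so they keep the resolution) and that each 2x2, stride-2 'SAME' pooling layer outputs ceil(size / 2). The helper names are hypothetical and not part of the article's code.

```python
import math

def same_pool_size(size, stride=2):
    """Output length of a 'SAME'-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)

def vgg_feature_sizes(height, width, blocks=5):
    """Return {block: (h, w)}, the resolution seen by conv{block}_x.

    Only the pooling layers shrink the feature map in this graph.
    """
    sizes = {}
    h, w = height, width
    for block in range(1, blocks + 1):
        sizes[block] = (h, w)
        h, w = same_pool_size(h), same_pool_size(w)
    return sizes

sizes = vgg_feature_sizes(300, 400)  # IMAGE_H, IMAGE_W from the article
print(sizes[4])  # resolution of conv4_2, the content layer -> (38, 50)
```

With the article's 300x400 input, conv4_2 therefore has shape (1, 38, 50, 512).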
To get the effect we want, define the loss functions:
# Content loss. Arguments: the TF session and the VGG model dict; returns the loss tensor
def content_loss_func(sess, model):
    # p is the conv4_2 activation evaluated for the image currently assigned to the input
    # (a fixed array); x is the symbolic conv4_2 tensor
    def _content_loss(p, x):
        # For a 300x400 input, p has shape (1, 38, 50, 512), so N = 512 is the number of
        # feature maps (filters) and M = 38 * 50 is the size of each feature map
        N = p.shape[3]
        M = p.shape[1] * p.shape[2]
        return (1 / (4 * N * M)) * tf.reduce_sum(tf.pow(x - p, 2))
    return _content_loss(sess.run(model['conv4_2']), model['conv4_2'])

# Layers used for the style loss, each paired with its weight
STYLE_LAYERS = [('conv1_1', 0.5), ('conv2_1', 1.0), ('conv3_1', 1.5), ('conv4_1', 3.0), ('conv5_1', 4.0)]

# Style loss: the sum of the per-layer style losses, each multiplied by its weight in STYLE_LAYERS
def style_loss_func(sess, model):
    # Gram matrix (N x N) of a feature map with N channels and M spatial positions
    def _gram_matrix(F, N, M):
        Ft = tf.reshape(F, (M, N))
        return tf.matmul(tf.transpose(Ft), Ft)

    # a is the evaluated activation of a style layer (a fixed array); x is the symbolic tensor
    def _style_loss(a, x):
        # N and M are defined as in the content loss
        N = a.shape[3]
        M = a.shape[1] * a.shape[2]
        A = _gram_matrix(a, N, M)
        G = _gram_matrix(x, N, M)
        return (1 / (4 * N ** 2 * M ** 2)) * tf.reduce_sum(tf.pow(G - A, 2))
    return sum([_style_loss(sess.run(model[layer_name]), model[layer_name]) * w for layer_name, w in STYLE_LAYERS])
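The two losses can be mirrored in plain NumPy to see what they measure. This is only a sketch on made-up random feature maps, with function names chosen for illustration; the scaling factors follow the formulas above. Because the Gram matrix sums over all spatial positions, rearranging the features spatially leaves the style loss at essentially zero while the content loss becomes positive.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (1, H, W, N) feature map: N x N channel correlations."""
    n = features.shape[3]
    m = features.shape[1] * features.shape[2]
    flat = features.reshape(m, n)  # one row per spatial position
    return flat.T @ flat

def style_loss(a, x):
    """Squared Gram-matrix difference scaled by 1 / (4 * N^2 * M^2)."""
    n = a.shape[3]
    m = a.shape[1] * a.shape[2]
    diff = gram_matrix(x) - gram_matrix(a)
    return float(np.sum(diff ** 2) / (4 * n ** 2 * m ** 2))

def content_loss(p, x):
    """Squared activation difference scaled by 1 / (4 * N * M)."""
    n = p.shape[3]
    m = p.shape[1] * p.shape[2]
    return float(np.sum((x - p) ** 2) / (4 * n * m))

rng = np.random.default_rng(0)
feat = rng.normal(size=(1, 4, 5, 3))   # hypothetical feature map
flipped = feat[:, ::-1, ::-1, :]       # same features, spatially flipped

style_value = style_loss(feat, flipped)      # ~0: Gram matrix ignores position
content_value = content_loss(feat, flipped)  # > 0: positions no longer match
```

This is why the style term captures texture and color statistics while the content term preserves layout.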
Next, define functions for generating, loading, and saving images:
# Generate the noisy starting image
def generate_noise_image(content_image, noise_ratio=NOISE_RATIO):
    # Random image whose values are drawn uniformly from [-20, 20)
    noise_image = np.random.uniform(-20, 20, (1, IMAGE_H, IMAGE_W, COLOR_C)).astype('float32')
    # Weighted blend of the noise and the content image
    input_image = noise_image * noise_ratio + content_image * (1 - noise_ratio)
    return input_image

# Load an image, resize it, add a batch dimension, and subtract the mean values
def load_image(path):
    # Assumes a 3-channel RGB file; an image with an alpha channel would not
    # broadcast against MEAN_VALUES
    image = scipy.misc.imread(path)
    image = scipy.misc.imresize(image, (IMAGE_H, IMAGE_W))
    # image.shape is (300, 400, 3), so (1,) + image.shape is (1, 300, 400, 3)
    image = np.reshape(image, ((1, ) + image.shape))
    # MEAN_VALUES has shape (1, 1, 1, 3), so it broadcasts across the three channels
    image = image - MEAN_VALUES
    return image

# Save an image
def save_image(path, image):
    # Undo the mean subtraction applied in load_image
    image = image + MEAN_VALUES
    # Drop the batch dimension that load_image added
    image = image[0]
    # Clip to 0-255: pixel values must lie in this range, and optimization can push them outside it
    image = np.clip(image, 0, 255).astype('uint8')
    # Write the file
    scipy.misc.imsave(path, image)
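The blending step in generate_noise_image can be tried on its own. The sketch below is a minimal stand-in that assumes a small, flat, made-up content array and a seeded generator (the seed is there only to make the example repeatable). With noise in [-20, 20) and a ratio of 0.7, the result can differ from 0.3 times the content by at most 14.

```python
import numpy as np

NOISE_RATIO = 0.7

def blend_with_noise(content, noise_ratio=NOISE_RATIO, seed=0):
    """Mix uniform noise in [-20, 20) with a mean-subtracted content image."""
    rng = np.random.default_rng(seed)  # seeded only for repeatability
    noise = rng.uniform(-20, 20, content.shape).astype('float32')
    return noise * noise_ratio + content * (1 - noise_ratio)

# Hypothetical flat "image": every value is 100.
content = np.full((1, 3, 4, 3), 100.0, dtype='float32')
mixed = blend_with_noise(content)

# Largest deviation from the scaled content; bounded by 0.7 * 20 = 14.
max_deviation = float(np.max(np.abs(mixed - 30.0)))
```

Starting from a blend rather than pure noise gives the optimizer a head start on the content layout.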
Here is the training procedure:
# Start a session for the computation graph
with tf.Session() as sess:
    # Load the images; each is a mean-subtracted array of shape (1, 300, 400, 3)
    content_image = load_image(CONTENT_IMG)
    style_image = load_image(STYLE_IMG)
    # Load VGG19; returns a dict mapping layer names to their output tensors
    model = load_vgg_model(VGG_MODEL)
    # Generate the starting image: random noise blended with the content image
    input_image = generate_noise_image(content_image)
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    # Feed the content image to the input layer and build the content loss
    sess.run(model['input'].assign(content_image))
    content_loss = content_loss_func(sess, model)
    # Feed the style image to the input layer and build the style loss
    sess.run(model['input'].assign(style_image))
    style_loss = style_loss_func(sess, model)
    # Total loss: weighted sum of the content loss and the style loss
    total_loss = BETA * content_loss + ALPHA * style_loss
    # Create the optimizer (Adam with learning rate 2.0)
    optimizer = tf.train.AdamOptimizer(2.0)
    # model['input'] is the only tf.Variable in the graph, so minimizing the loss updates the image pixels
    train = optimizer.minimize(total_loss)
    # Run the initializer again so the optimizer's own variables are initialized too
    sess.run(tf.global_variables_initializer())
    # Start the optimization from the noisy image
    sess.run(model['input'].assign(input_image))
    ITERATIONS = 2000
    # Run 2000 iterations
    for i in range(ITERATIONS):
        sess.run(train)
        print('Iteration %d' % i)
        print('Cost: ', sess.run(total_loss))
        if i % 100 == 0:
            # Every 100 iterations, fetch the current image and save it
            output_image = sess.run(model['input'])
            save_image(os.path.join(OUTPUT_DIR, 'output_%d.jpg' % i), output_image)
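The final result is shown in the figure:
On the left is an image taken from a TV show, and on the right is the generated imitation; as you can see, the result is reasonably good. The main idea of the program is to start from a randomly generated matrix, load the pretrained network weights, and pass the matrix through the network layers; once optimized, the matrix is saved as an image, which achieves the imitation effect. The adjustments are driven by image features extracted by the deep network and combined according to the loss formulas. Because the network weights are loaded as constants, what the optimizer actually updates is the input image itself, and as the number of iterations grows the image gradually improves until it reaches the look we want.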
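The idea of optimizing the image rather than the network can be shown with a toy example. The sketch below is not the article's method: it uses plain gradient descent instead of Adam, a three-value list instead of an image, and a simple squared-error loss instead of the VGG-based losses. All names and numbers are made up for illustration.

```python
def optimize_pixels(start, target, learning_rate=0.1, steps=200):
    """Gradient descent on the 'pixels' themselves, minimizing sum((x - t)^2)."""
    pixels = list(start)
    for _ in range(steps):
        # d/dx (x - t)^2 = 2 * (x - t); nothing else in this toy model is trainable.
        grads = [2.0 * (x - t) for x, t in zip(pixels, target)]
        pixels = [x - learning_rate * g for x, g in zip(pixels, grads)]
    return pixels

target = [0.2, 0.5, 0.9]        # stand-in for the features we want to match
noise_start = [0.0, 1.0, -1.0]  # stand-in for the noisy starting image
result = optimize_pixels(noise_start, target)
```

Each step shrinks the gap to the target by a constant factor, which mirrors how the generated image moves steadily toward lower loss over the iterations.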
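About the author: Li Qiujian is a CSDN blog expert and the author of a CSDN expert course. He is currently a master's student at China University of Mining and Technology. His projects include an Android wuxia game, a VIP video parser, and a paraphrasing writing bot; he has published several papers and won multiple awards in advanced mathematics competitions.
Statement: This article is an original submission by the author. Do not reproduce it without permission.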