TF Convolution Layer

tf.nn.conv2d() and tf.nn.bias_add()

# input/Image
input = tf.placeholder(tf.float32, shape=[None, image_height, image_width, color_channels])
# weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

def conv2d(x, W, b, stride=1):
    x = tf.nn.conv2d(x, W, strides=[1, stride, stride, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

conv_layer = conv2d(input, weight, bias, stride=2)

The padding argument can be 'SAME' or 'VALID'.
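
As a quick shape check (a small sketch added here, not part of the original snippet), the same 5x5 filter with stride 2 gives different output sizes under the two padding modes:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
w = tf.Variable(tf.truncated_normal([5, 5, 1, 32]))

# 'SAME' pads the input so the output height/width is ceil(28 / 2) = 14
same = tf.nn.conv2d(x, w, strides=[1, 2, 2, 1], padding='SAME')
# 'VALID' uses no padding, so the output height/width is ceil((28 - 5 + 1) / 2) = 12
valid = tf.nn.conv2d(x, w, strides=[1, 2, 2, 1], padding='VALID')

print(same.get_shape())   # (?, 14, 14, 32)
print(valid.get_shape())  # (?, 12, 12, 32)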

Pooling

Conceptually, the benefit of the max pooling operation is to reduce the size of the input (which can help prevent overfitting) and allow the neural network to focus on only the most important elements. Max pooling does this by retaining only the maximum value for each filtered area and removing the remaining values.
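
As a small numeric illustration (added here as a sketch, not part of the original code), 2x2 max pooling over a 4x4 single-channel input keeps only the largest value in each 2x2 window:

import tensorflow as tf

image = tf.constant([[ 1.,  2.,  5.,  6.],
                     [ 3.,  4.,  7.,  8.],
                     [ 9., 10., 13., 14.],
                     [11., 12., 15., 16.]])
image = tf.reshape(image, [1, 4, 4, 1])  # [batch, height, width, channels]
pooled = tf.nn.max_pool(image, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
with tf.Session() as sess:
    print(sess.run(tf.reshape(pooled, [2, 2])))  # [[ 4.  8.] [12. 16.]]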

def maxpool2d(x, k=2):
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')

The 4-element ksize and strides lists correspond to the dimensions of the input tensor ([batch, height, width, channels]); the batch and channel dimensions are typically set to 1, as in the shape check below.
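
A quick shape check (a sketch using the maxpool2d helper defined above, not from the original post) shows that k=2 halves the height and width while leaving the batch and channel dimensions untouched:

import tensorflow as tf

conv_out = tf.placeholder(tf.float32, [None, 28, 28, 32])
pooled = maxpool2d(conv_out, k=2)  # maxpool2d as defined above
print(pooled.get_shape())          # (?, 14, 14, 32)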

Recently, pooling layers have fallen out of favor. Some reasons are:

  • Recent datasets are so big and complex we’re more concerned about underfitting.
  • Dropout is a much better regularizer (see the sketch after this list).
  • Pooling results in a loss of information. Think about the max pooling operation as an example. We only keep the largest of n numbers, thereby disregarding n-1 numbers completely.
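
To illustrate the dropout point above, here is a minimal sketch (not from the original walkthrough) of how tf.nn.dropout is typically used in TF 1.x, with keep_prob fed as a placeholder so it can be raised to 1.0 at evaluation time:

import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)             # probability of keeping a unit
hidden = tf.placeholder(tf.float32, [None, 1024])  # hypothetical hidden-layer activations
regularized = tf.nn.dropout(hidden, keep_prob)     # kept units are scaled by 1/keep_prob
# Feed e.g. keep_prob=0.5 during training and keep_prob=1.0 during evaluation.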

Walk Through MNIST Again with a CNN

Define the model:

# Store layers weight & bias
# [filter_size_height, filter_size_width, color_channels, k_output]
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}
biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)
    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)
    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)
    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
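
To finish the walkthrough, here is a minimal sketch of how the graph could be wired up and trained; the placeholder shapes follow MNIST, but the learning rate, batch size, number of steps, and keep probability are assumptions, not values from the original:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('.', one_hot=True)
n_classes = 10         # MNIST has 10 digit classes (assumed by the weights above)
learning_rate = 0.001  # assumed hyperparameter

# MNIST images arrive flattened (784 pixels), so they are reshaped to 28x28x1 below
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

logits = conv_net(x, weights, biases, keep_prob)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        batch_x, batch_y = mnist.train.next_batch(128)
        batch_x = batch_x.reshape([-1, 28, 28, 1])
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: 0.75})
    # Evaluate with dropout disabled
    test_x = mnist.test.images[:256].reshape([-1, 28, 28, 1])
    test_acc = sess.run(accuracy,
                        feed_dict={x: test_x, y: mnist.test.labels[:256], keep_prob: 1.0})
    print('Test accuracy: {}'.format(test_acc))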