I have a neural network currently implemented in TensorFlow, but I am having a problem making predictions after training, because I have conv2d_transpose operations, and the shapes of these ops depend on the batch size. I have a layer that requires output_shape as an argument:
    def deconvLayer(input, filter_shape, output_shape, strides):
        W1_1 = weight_variable(filter_shape)
        output = tf.nn.conv2d_transpose(input, W1_1, output_shape, strides, padding="SAME")
        return output
That layer is used in a larger model I have constructed, like the following:
    conv3 = layers.convLayer(conv2['layer_output'], [3, 3, 64, 128], use_pool=False)
    conv4 = layers.deconvLayer(conv3['layer_output'],
                               filter_shape=[2, 2, 64, 128],
                               output_shape=[batch_size, 32, 40, 64],
                               strides=[1, 2, 2, 1])
The problem is that, if I go to make a prediction using the trained model, my test data has to have the same batch size, or else I get the following error:
tensorflow.python.framework.errors.InvalidArgumentError: Conv2DBackpropInput: input and out_backprop must have the same batch size
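For concreteness, here is a stripped-down, self-contained repro of the mismatch (the placeholder shape and inline weight are just illustrative stand-ins for my model):

    import numpy as np
    import tensorflow as tf

    batch_size = 50  # hard-coded at training time
    x = tf.placeholder(tf.float32, [None, 16, 20, 128])
    W = tf.Variable(tf.truncated_normal([2, 2, 64, 128], stddev=0.1))
    y = tf.nn.conv2d_transpose(x, W, [batch_size, 32, 40, 64], [1, 2, 2, 1], padding="SAME")

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        # Any feed whose leading dimension is not 50 raises the error above.
        sess.run(y, feed_dict={x: np.zeros((10, 16, 20, 128))})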
Is there some way that I can get a prediction for an input with variable batch size? When I look at the trained weights, nothing seems to depend on batch size, so I can't see why this would be a problem.
I came across a solution based on a thread in the TensorFlow issues forum: https://github.com/tensorflow/tensorflow/issues/833.
In my code
    conv4 = layers.deconvLayer(conv3['layer_output'],
                               filter_shape=[2, 2, 64, 128],
                               output_shape=[batch_size, 32, 40, 64],
                               strides=[1, 2, 2, 1])
the output shape that gets passed to deconvLayer was hard-coded with a predetermined batch size during training. By altering the layer to the following:
    def deconvLayer(input, filter_shape, output_shape, strides):
        W1_1 = weight_variable(filter_shape)
        # Read the batch size from the input tensor at run time rather than
        # baking it into the graph.
        dyn_input_shape = tf.shape(input)
        batch_size = dyn_input_shape[0]
        # tf.pack was renamed tf.stack in TensorFlow 1.0.
        output_shape = tf.pack([batch_size, output_shape[1], output_shape[2], output_shape[3]])
        output = tf.nn.conv2d_transpose(input, W1_1, output_shape, strides, padding="SAME")
        return output
the output shape is now inferred dynamically at run time, so the layer can handle a variable batch size.
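As a quick sanity check, here is a minimal sketch exercising the fixed layer above with several batch sizes (it assumes weight_variable is the usual truncated-normal helper, so I inline one here):

    import numpy as np
    import tensorflow as tf

    def weight_variable(shape):
        return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

    x = tf.placeholder(tf.float32, [None, 16, 20, 128])  # batch dimension left open
    out = deconvLayer(x, filter_shape=[2, 2, 64, 128],
                      output_shape=[None, 32, 40, 64],  # first entry is ignored; tf.shape(input) supplies it
                      strides=[1, 2, 2, 1])

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        for n in (1, 10, 50):
            print(sess.run(out, feed_dict={x: np.zeros((n, 16, 20, 128))}).shape)
        # prints (1, 32, 40, 64), (10, 32, 40, 64), (50, 32, 40, 64)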
Running the code, I no longer receive this error when passing in test data of any batch size. I believe this is necessary because shape inference for transpose ops is not as straightforward at the moment as it is for normal convolutional ops. So where we would usually use None for the batch_size in normal convolutional ops, here we must provide a full output shape, and since it can vary with the input, we have to go through the effort of determining it dynamically.
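To make that distinction concrete, here is a short sketch contrasting the static shape TensorFlow knows at graph-construction time with the dynamic shape the fix relies on:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 16, 20, 128])
    print(x.get_shape())    # (?, 16, 20, 128): batch size unknown when the graph is built
    batch = tf.shape(x)[0]  # a scalar tensor, resolved from the actual feed at run time

Normal conv ops can carry that ? through graph construction, but conv2d_transpose needs a concrete output_shape, which is why it has to come from tf.shape at run time.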