conv2d_transpose is dependent on batch_size when making predictions

2024/9/28 2:42:04

I have a neural network implemented in TensorFlow, but I am having a problem making predictions after training, because I have conv2d_transpose operations whose shapes depend on the batch size. I have a layer that requires output_shape as an argument:

def deconvLayer(input, filter_shape, output_shape, strides):
    W1_1 = weight_variable(filter_shape)
    output = tf.nn.conv2d_transpose(input, W1_1, output_shape, strides, padding="SAME")
    return output

That layer is used in a larger model I have constructed like the following:

conv3 = layers.convLayer(conv2['layer_output'], [3, 3, 64, 128], use_pool=False)
conv4 = layers.deconvLayer(conv3['layer_output'],
                           filter_shape=[2, 2, 64, 128],
                           output_shape=[batch_size, 32, 40, 64],
                           strides=[1, 2, 2, 1])

The problem is that when I make a prediction using the trained model, my test data has to have the same batch size as the training data, or else I get the following error:

tensorflow.python.framework.errors.InvalidArgumentError: Conv2DBackpropInput: input and out_backprop must have the same batch size

Is there some way that I can get a prediction for an input with variable batch size? When I look at the trained weights, nothing seems to depend on batch size, so I can't see why this would be a problem.

Answer

I found a solution in the TensorFlow issue tracker at https://github.com/tensorflow/tensorflow/issues/833.

In my code

conv4 = layers.deconvLayer(conv3['layer_output'],
                           filter_shape=[2, 2, 64, 128],
                           output_shape=[batch_size, 32, 40, 64],
                           strides=[1, 2, 2, 1])

the output shape passed to deconvLayer was hard-coded with the batch size used during training. I changed deconvLayer to the following:

def deconvLayer(input, filter_shape, output_shape, strides):
    W1_1 = weight_variable(filter_shape)
    # Read the batch size from the input tensor at run time
    dyn_input_shape = tf.shape(input)
    batch_size = dyn_input_shape[0]
    # tf.pack was renamed to tf.stack in TensorFlow 1.0
    output_shape = tf.pack([batch_size, output_shape[1], output_shape[2], output_shape[3]])
    output = tf.nn.conv2d_transpose(input, W1_1, output_shape, strides, padding="SAME")
    return output

This allows the shape to be dynamically inferred at run time and can handle a variable batch size.

With this change, I no longer receive the error, whatever batch size of test data I pass in. I believe this is necessary because shape inference for transpose ops is not as straightforward at the moment as it is for normal convolutional ops. Where we would usually use None for the batch size in normal convolutional ops, here we must provide a concrete shape, and since this can vary with the input, we have to determine it dynamically.
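To make the shape arithmetic concrete, here is a minimal sketch in plain Python (no TensorFlow) of how the output shape can be derived from the input's shape once the batch size is known. It assumes NHWC layout and SAME padding with the usual choice of scaling each spatial dimension by its stride. The helper name transpose_output_shape is hypothetical, and the 16x20 input size for conv3 is an assumption inferred from the 32x40 output and the stride of 2 in the question.

```python
def transpose_output_shape(input_shape, filter_shape, strides):
    """Output shape of a SAME-padded transposed convolution (NHWC layout).

    filter_shape follows the conv2d_transpose convention:
    [height, width, out_channels, in_channels].
    Hypothetical helper for illustration; not part of TensorFlow.
    """
    batch, in_h, in_w, in_c = input_shape
    _, _, out_c, filt_in_c = filter_shape
    if in_c != filt_in_c:
        raise ValueError(
            "input has %d channels but the filter expects %d" % (in_c, filt_in_c)
        )
    _, stride_h, stride_w, _ = strides
    # Only the batch entry depends on the data being fed in;
    # the other three entries are fixed by the architecture.
    return [batch, in_h * stride_h, in_w * stride_w, out_c]


# Assumed conv3 output: 128 channels at 16x20, for several batch sizes.
for batch in (1, 8, 50):
    print(transpose_output_shape([batch, 16, 20, 128],
                                 [2, 2, 64, 128],
                                 [1, 2, 2, 1]))
```

Only the first entry changes with the input, which is why the fix reads that single value from tf.shape(input) and keeps the remaining three entries hard-coded.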
