How to preprocess training set for VGG16 fine tuning in Keras?

2024/9/28 3:25:59

I have fine-tuned the Keras VGG16 model, but I'm unsure about the preprocessing during the training phase.

I create a train generator as follows:

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_folder,target_size=(IMAGE_SIZE, IMAGE_SIZE),batch_size=train_batchsize,class_mode="categorical")

Is the rescale enough, or do I have to apply other preprocessing functions?

When I use the network to classify an image I use this code:

from keras.models import load_model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)

I think this is the correct preprocessing and that I should apply it during training as well.

Thanks for your help.

Answer

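ImageDataGenerator has a preprocessing_function argument, which lets you pass the same preprocess_input function that you use during inference. That function applies all the normalization VGG16 expects (for VGG16 it converts RGB to BGR and zero-centers each channel rather than rescaling to [0, 1]), so you can omit the rescale argument: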

from keras.applications.vgg16 import preprocess_input
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

Most of the pretrained models in keras_applications use the same preprocessing function. You can inspect the docstring to see what it does:

def preprocess_input(x, data_format=None, mode='caffe', **kwargs):
    """Preprocesses a tensor or Numpy array encoding a batch of images.

    # Arguments
        x: Input Numpy or symbolic tensor, 3D or 4D.
            The preprocessed data is written over the input data
            if the data types are compatible. To avoid this
            behaviour, `numpy.copy(x)` can be used.
        data_format: Data format of the image tensor/array.
        mode: One of "caffe", "tf" or "torch".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.
            - torch: will scale pixels between 0 and 1 and then
                will normalize each channel with respect to the
                ImageNet dataset.

    # Returns
        Preprocessed tensor or Numpy array.
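
To make the "caffe" mode described in the docstring concrete, here is a minimal NumPy sketch of what it does to a channels-last batch. It is an illustration, not the library's actual implementation: the function name caffe_preprocess_sketch and the toy inputs are hypothetical, and the per-channel means are the ImageNet BGR values that Keras uses for this mode. It also shows why rescale=1./255 alone produces inputs in a very different range from what preprocess_input produces at inference time.

```python
import numpy as np

# ImageNet per-channel means in BGR order (values used by Keras "caffe" mode).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_preprocess_sketch(batch_rgb):
    """Illustrative stand-in for mode='caffe' on channels-last RGB input.

    Expects pixel values in [0, 255]. Returns a new array; the input is
    left unmodified.
    """
    x = np.asarray(batch_rgb, dtype=np.float32)
    x = x[..., ::-1]               # reverse the channel axis: RGB -> BGR
    return x - IMAGENET_MEAN_BGR   # zero-center each channel, no scaling


# Hypothetical toy batch: one 2x2 image of pure white pixels.
white = np.full((1, 2, 2, 3), 255.0, dtype=np.float32)

preprocessed = caffe_preprocess_sketch(white)  # roughly 131 to 151 per channel
rescaled = white / 255.0                       # what rescale=1./255 yields: 1.0

# A single pixel with distinct channels, to make the RGB -> BGR swap visible.
pixel = np.array([[[[10.0, 20.0, 30.0]]]], dtype=np.float32)
swapped = caffe_preprocess_sketch(pixel)

print(preprocessed[0, 0, 0], rescaled[0, 0, 0], swapped[0, 0, 0])
```

A network trained on the rescaled values and then fed the zero-centered BGR values at prediction time sees inputs on a different scale and with a different channel order, which is why the same function should be used in both phases.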