Preprocess a TensorFlow tensor in NumPy

2024/10/2 12:19:56

I have set up a CNN in TensorFlow where I read my data with a TFRecordReader. It works well, but I would like to do more preprocessing and data augmentation than the tf.image functions offer. Specifically, I would like to do some randomized scaling.

Is it possible to process a TensorFlow tensor in NumPy? Or do I need to drop the TFRecordReader and instead do all my preprocessing in NumPy and feed the data using feed_dict? I suspect that the feed_dict method is slow when training on images, but I might be wrong.

Answer

You could create a custom I/O pipeline that fetches intermediate results back from TensorFlow using one or more threads, applies arbitrary Python logic, and then feeds them into a queue for subsequent processing. The resulting program would be somewhat more complicated, but I suggest you look at the threading and queues HOWTO for information on how to get started.
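As a rough illustration of that thread-and-queue structure, here is a minimal sketch in plain Python. It is not TensorFlow code: `fetch_image` is a hypothetical stand-in for fetching an intermediate result with `sess.run()`, `queue.Queue` stands in for a TensorFlow queue, and the flip in `preprocess` is only a placeholder for whatever NumPy logic you need.

```python
import queue
import threading

import numpy as np


def fetch_image(step):
    # Hypothetical stand-in for fetching a decoded image from TensorFlow.
    return np.full((4, 4, 3), step, dtype=np.uint8)


def preprocess(image):
    # Placeholder for arbitrary Python/NumPy logic: a horizontal flip.
    return image[:, ::-1, :].astype(np.float32)


def worker(steps, out_queue):
    # Fetch, preprocess, and enqueue each item assigned to this thread.
    for step in steps:
        out_queue.put(preprocess(fetch_image(step)))


def run_pipeline(num_items=8, num_threads=2):
    # A bounded queue keeps the producers from running far ahead.
    out_queue = queue.Queue(maxsize=4)
    threads = [
        threading.Thread(
            target=worker,
            args=(range(i, num_items, num_threads), out_queue),
        )
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    # Consume while the producer threads are still running.
    results = [out_queue.get() for _ in range(num_items)]
    for t in threads:
        t.join()
    return results
```

In a real pipeline the consumer side would be the training loop, and the items would be enqueued into a TensorFlow queue rather than collected in a list.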


There is an experimental feature that might make this easier, if you install from source.

If you have already built a preprocessing pipeline using TensorFlow ops, the easiest way to add some custom Python code is to use the tf.py_func() operator, which takes a list of Tensor objects, and a Python function that maps one or more NumPy arrays to one or more NumPy arrays.

For example, let's say you have a pipeline like this:

reader = tf.TFRecordReader(...)
image_t = tf.image.decode_png(tf.parse_single_example(reader.read(), ...))

...you could use tf.py_func() to apply some custom NumPy processing as follows:

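from scipy import ndimage
import numpy as np

def preprocess(array):
    # `array` is the image as a NumPy array.
    # Cast so the result matches the tf.float32 output type declared below.
    return ndimage.rotate(array, 45).astype(np.float32)

# With a list for Tout, tf.py_func() returns a list of tensors; take the first.
image_t = tf.py_func(preprocess, [image_t], [tf.float32])[0]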
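For the randomized scaling mentioned in the question, the function passed to tf.py_func() could look something like the sketch below. The name `random_scale`, its parameters, and the choice of nearest-neighbour sampling are all assumptions made for illustration; any NumPy or SciPy resampling routine could be used instead.

```python
import numpy as np


def random_scale(image, low=0.8, high=1.2, rng=None):
    """Rescale an H x W (x C) image by a random factor, nearest-neighbour."""
    rng = np.random.default_rng() if rng is None else rng
    factor = rng.uniform(low, high)
    height, width = image.shape[:2]
    new_h = max(1, int(round(height * factor)))
    new_w = max(1, int(round(width * factor)))
    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(new_h) / factor).astype(np.int64), height - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(np.int64), width - 1)
    scaled = image[rows][:, cols]
    # Return float32 so it can match a tf.float32 output type.
    return scaled.astype(np.float32)


# Possible wiring, assuming `image_t` from the pipeline above:
# image_t = tf.py_func(random_scale, [image_t], [tf.float32])[0]
```

Because the output size changes from call to call, a later crop or pad to a fixed size would likely be needed before batching.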