(I'm using TensorFlow 1.0 and Python 2.7)
I'm having trouble getting an Estimator to work with queues. With the deprecated SKCompat interface, custom data files, and a given batch size, the model trains properly. Now I'm trying to use the new interface with an input_fn that batches features out of TFRecord files (equivalent to my custom data files). The script runs, but the loss value stops changing after 200 or 300 steps. It seems the model is looping over a small input batch, which would explain why the loss converges so fast.
I have a 'run.py' script that looks like the following:
import tensorflow as tf
from tensorflow.contrib import learn, metrics
# [...]

evalMetrics = {'accuracy': learn.MetricSpec(metric_fn=metrics.streaming_accuracy)}
runConfig = learn.RunConfig(save_summary_steps=10)
estimator = learn.Estimator(model_fn=myModel,
                            params=myParams,
                            model_dir='/tmp/myDir',
                            config=runConfig)

session = tf.Session(graph=tf.get_default_graph())
with session.as_default():
    tf.global_variables_initializer()
    coordinator = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=session, coord=coordinator)
    estimator.fit(input_fn=lambda: inputToModel(trainingFileList), steps=10000)
    estimator.evaluate(input_fn=lambda: inputToModel(evalFileList),
                       steps=10000,
                       metrics=evalMetrics)
    coordinator.request_stop()
    coordinator.join(threads)
session.close()
My inputToModel function looks like this:
import tensorflow as tf

def inputToModel(fileList):
    features = {'rawData': tf.FixedLenFeature([100], tf.float32),
                'label': tf.FixedLenFeature([], tf.int64)}
    tensorDict = tf.contrib.learn.read_batch_record_features(
        fileList,
        batch_size=100,
        features=features,
        randomize_input=True,
        reader_num_threads=4,
        num_epochs=1,
        name='inputPipeline')
    tf.local_variables_initializer()
    data = tensorDict['rawData']
    labelTensor = tensorDict['label']
    inputTensor = tf.reshape(data, [-1, 10, 10, 1])
    return inputTensor, labelTensor
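As a sanity check on the numbers involved: with batch_size=100 and num_epochs=1, this pipeline can yield at most about N/100 batches for a dataset of N examples before its queue runs dry, which is far fewer than the 10000 steps passed to fit. A minimal sketch of that arithmetic (the dataset size is hypothetical, just for illustration):

```python
def max_batches(num_examples, batch_size=100, num_epochs=1):
    """Upper bound on the number of batches an epoch-limited
    input pipeline can produce before its queue is exhausted."""
    total = num_examples * num_epochs
    return (total + batch_size - 1) // batch_size  # ceiling division

# A hypothetical 25,000-example dataset with the settings from
# inputToModel (batch_size=100, num_epochs=1) supplies at most
# 250 batches -- nowhere near the 10000 steps requested in run.py.
print(max_batches(25000))  # -> 250
```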
Any help or suggestions are welcome!