How to provide input for a TensorFlow DNNRegressor in Java?

2024/10/12 20:23:52

I managed to write a TensorFlow Python program with a DNNRegressor. I have trained the model and am able to get a prediction from it in Python using manually created input (constant tensors). I have also been able to export the model in binary format.

import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import graph_util

#######################
# Setup
#######################

# Converting Data into Tensors
def input_fn(df, training=True):
    # Creates a dictionary mapping from each continuous feature column name (k) to
    # the values of that column stored in a constant Tensor.
    continuous_cols = {k: tf.constant(df[k].values)
                       for k in continuous_features}
    feature_cols = dict(list(continuous_cols.items()))
    if training:
        # Converts the label column into a constant Tensor.
        label = tf.constant(df[LABEL_COLUMN].values)
        # Returns the feature columns and the label.
        return feature_cols, label
    # Returns the feature columns
    return feature_cols

def train_input_fn():
    return input_fn(train_df)

def eval_input_fn():
    return input_fn(evaluate_df)

#######################
# Data Preparation
#######################
df_train_ori = pd.read_csv('training.csv')
df_test_ori = pd.read_csv('test.csv')
train_df = df_train_ori.head(10000)
evaluate_df = df_train_ori.tail(5)
test_df = df_test_ori.head(1)
MODEL_DIR = "/tmp/model"
BIN_MODEL_DIR = "/tmp/modelBinary"
features = train_df.columns
continuous_features = [feature for feature in features if 'label' not in feature]
LABEL_COLUMN = 'label'

engineered_features = []
for continuous_feature in continuous_features:
    engineered_features.append(
        tf.contrib.layers.real_valued_column(
            column_name=continuous_feature,
            dimension=1,
            default_value=None,
            dtype=tf.int64,
            normalizer=None))

#######################
# Define Our Model
#######################
regressor = tf.contrib.learn.DNNRegressor(
    feature_columns=engineered_features,
    label_dimension=1,
    hidden_units=[128, 256, 512],
    model_dir=MODEL_DIR)

#######################
# Training Our Model
#######################
wrap = regressor.fit(input_fn=train_input_fn, steps=5)

#######################
# Evaluating Our Model
#######################
results = regressor.evaluate(input_fn=eval_input_fn, steps=1)
for key in sorted(results):
    print("%s: %s" % (key, results[key]))

#######################
# Save binary model (to be used in Java)
#######################
tfrecord_serving_input_fn = tf.contrib.learn.build_parsing_serving_input_fn(
    tf.contrib.layers.create_feature_spec_for_parsing(engineered_features))
regressor.export_savedmodel(
    export_dir_base=BIN_MODEL_DIR,
    serving_input_fn=tfrecord_serving_input_fn,
    assets_extra=None,
    as_text=False,
    checkpoint_path=None,
    strip_default_attrs=False)

My next step was to load the model into Java and make some predictions. However, I am having trouble specifying the input for the model in Java.

import org.tensorflow.*;
import org.tensorflow.framework.MetaGraphDef;
import org.tensorflow.framework.SignatureDef;
import org.tensorflow.framework.TensorInfo;
import java.util.List;
import java.util.Map;

public class ModelEvaluator {

    public static void main(String[] args) throws Exception {
        System.out.println("Using TF version: " + TensorFlow.version());
        SavedModelBundle model = SavedModelBundle.load("/tmp/modelBinary/1546510038", "serve");
        Session session = model.session();
        printSignature(model);
        printAllNodes(model);

        float[][] km1 = new float[1][1];
        km1[0][0] = 10;
        Tensor inKm1 = Tensor.create(km1);
        float[][] km2 = new float[1][1];
        km2[0][0] = 10000;
        Tensor inKm2 = Tensor.create(km2);

        List<Tensor<?>> outputs = session.runner()
                .feed("dnn/input_from_feature_columns/input_from_feature_columns/km1/ToFloat", inKm1)
                .feed("dnn/input_from_feature_columns/input_from_feature_columns/km2/ToFloat", inKm2)
                .fetch("dnn/regression_head/predictions/Identity:0")
                .run();

        System.out.println("\n\nOutputs from evaluation:");
        for (Tensor<?> output : outputs) {
            if (output.dataType() == DataType.STRING) {
                System.out.println(new String(output.bytesValue()));
            } else {
                float[] outArray = new float[1];
                output.copyTo(outArray);
                System.out.println(outArray[0]);
            }
        }
    }

    public static void printAllNodes(SavedModelBundle model) {
        model.graph().operations().forEachRemaining(x -> {
            System.out.println(x.name() + "   " + x.numOutputs());
        });
    }

    /**
     * This info can also be obtained from a command prompt via the command:
     * saved_model_cli show  --dir <dir-to-the-model> --tag_set serve --signature_def serving_default
     * <p>
     * See this where they also try to input data to a DNN regressor:
     * https://github.com/tensorflow/tensorflow/issues/12367
     * <p>
     * https://github.com/tensorflow/tensorflow/issues/14683
     * <p>
     * https://github.com/migueldeicaza/TensorFlowSharp/issues/293
     */
    public static void printSignature(SavedModelBundle model) throws Exception {
        MetaGraphDef m = MetaGraphDef.parseFrom(model.metaGraphDef());
        SignatureDef sig = m.getSignatureDefOrThrow("serving_default");
        int numInputs = sig.getInputsCount();
        int i = 1;
        System.out.println("-----------------------------------------------");
        System.out.println("MODEL SIGNATURE");
        System.out.println("Inputs:");
        for (Map.Entry<String, TensorInfo> entry : sig.getInputsMap().entrySet()) {
            TensorInfo t = entry.getValue();
            System.out.printf("%d of %d: %-20s (Node name in graph: %-20s, type: %s)\n",
                    i++, numInputs, entry.getKey(), t.getName(), t.getDtype());
        }
        int numOutputs = sig.getOutputsCount();
        i = 1;
        System.out.println("Outputs:");
        for (Map.Entry<String, TensorInfo> entry : sig.getOutputsMap().entrySet()) {
            TensorInfo t = entry.getValue();
            System.out.printf("%d of %d: %-20s (Node name in graph: %-20s, type: %s)\n",
                    i++, numOutputs, entry.getKey(), t.getName(), t.getDtype());
        }
        System.out.println("-----------------------------------------------");
    }
}

As can be seen from the Java code, I feed two nodes (whose names contain "km1" and "km2"). I suspect that is not the correct way to do it. Do I need to feed the node "input_example_tensor:0" instead?

So the question is: how do I actually create an input for the model once it is loaded into Java? In Python I had to create a dictionary with the keys "km1" and "km2" whose values were two constant tensors.

Answer

In Python, try:

feature_spec = tf.feature_column.make_parse_example_spec(columns)
example_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)

Take a look at build_parsing_serving_input_receiver_fn: it gives the exported model an input named input_example_tensor that expects a serialized tf.Example.

In Java, try creating an Example input (the Example class is packaged in the org.tensorflow:proto artifact) with code like this:

import com.google.protobuf.ByteString;
import java.util.Arrays;
import java.util.Map;
import org.tensorflow.Tensor;
import org.tensorflow.Tensors;
import org.tensorflow.example.*;

public static void main(String[] args) {
    Example example = buildExample(yourFeatureNameAndValueMap);
    // A batch of one serialized Example.
    byte[][] exampleBytes = {example.toByteArray()};
    try (Tensor<String> inputBatch = Tensors.create(exampleBytes);
         Tensor<Float> output = yourSession.runner()
                 .feed(yourInputsName, inputBatch)
                 .fetch(yourOutputsName)
                 .run()
                 .get(0)
                 .expect(Float.class)) {
        long[] shape = output.shape();
        int batchSize = (int) shape[0];
        int labelNum = (int) shape[1];
        float[][] resultValues = output.copyTo(new float[batchSize][labelNum]);
        // Print the values rather than the array reference.
        System.out.println(Arrays.deepToString(resultValues));
    }
}

// Builds an Example from a map of feature names to values.
public static Example buildExample(Map<String, ?> yourFeatureNameAndValueMap) {
    Features.Builder builder = Features.newBuilder();
    for (String attr : yourFeatureNameAndValueMap.keySet()) {
        Object value = yourFeatureNameAndValueMap.get(attr);
        if (value instanceof Float) {
            builder.putFeature(attr, feature((Float) value));
        } else if (value instanceof float[]) {
            builder.putFeature(attr, feature((float[]) value));
        } else if (value instanceof String) {
            builder.putFeature(attr, feature((String) value));
        } else if (value instanceof String[]) {
            builder.putFeature(attr, feature((String[]) value));
        } else if (value instanceof Long) {
            builder.putFeature(attr, feature((Long) value));
        } else if (value instanceof long[]) {
            builder.putFeature(attr, feature((long[]) value));
        } else {
            throw new UnsupportedOperationException("Not supported attribute value data type!");
        }
    }
    Features features = builder.build();
    Example example = Example.newBuilder().setFeatures(features).build();
    return example;
}

private static Feature feature(String... strings) {
    BytesList.Builder b = BytesList.newBuilder();
    for (String s : strings) {
        b.addValue(ByteString.copyFromUtf8(s));
    }
    return Feature.newBuilder().setBytesList(b).build();
}

private static Feature feature(float... values) {
    FloatList.Builder b = FloatList.newBuilder();
    for (float v : values) {
        b.addValue(v);
    }
    return Feature.newBuilder().setFloatList(b).build();
}

private static Feature feature(long... values) {
    Int64List.Builder b = Int64List.newBuilder();
    for (long v : values) {
        b.addValue(v);
    }
    return Feature.newBuilder().setInt64List(b).build();
}
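
To make "a serialized tf.Example" more concrete, here is a rough, dependency-free sketch of the bytes such an Example would contain. It hand-encodes the protobuf wire format purely for illustration; real code should use the generated Example/Features classes as above. The class and method names (ExampleWireSketch, encodeExample, int64Feature) are made up for this sketch. It also assumes that km1 and km2 should be int64 features, since the question declares its columns with dtype=tf.int64; if that holds, the values in the feature map would presumably need to be Long rather than Float.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: hand-encodes a tf.Example holding int64 features.
class ExampleWireSketch {

    // Writes a base-128 varint, the integer encoding used by protobuf.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Writes a length-delimited field: tag (wire type 2), length, payload.
    static void writeBytesField(ByteArrayOutputStream out, int fieldNumber, byte[] payload) {
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
    }

    // Feature { int64_list = 3 } wrapping Int64List { packed value = 1 }.
    static byte[] int64Feature(long... values) {
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        for (long v : values) {
            writeVarint(packed, v);
        }
        ByteArrayOutputStream list = new ByteArrayOutputStream();
        writeBytesField(list, 1, packed.toByteArray());
        ByteArrayOutputStream feature = new ByteArrayOutputStream();
        writeBytesField(feature, 3, list.toByteArray());
        return feature.toByteArray();
    }

    // Example { features = 1 } -> Features { map<string, Feature> feature = 1 }.
    static byte[] encodeExample(Map<String, Long> values) {
        ByteArrayOutputStream features = new ByteArrayOutputStream();
        for (Map.Entry<String, Long> e : values.entrySet()) {
            // Each map entry is a nested message: key = field 1, value = field 2.
            ByteArrayOutputStream entry = new ByteArrayOutputStream();
            writeBytesField(entry, 1, e.getKey().getBytes(StandardCharsets.UTF_8));
            writeBytesField(entry, 2, int64Feature(e.getValue().longValue()));
            writeBytesField(features, 1, entry.toByteArray());
        }
        ByteArrayOutputStream example = new ByteArrayOutputStream();
        writeBytesField(example, 1, features.toByteArray());
        return example.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> row = new LinkedHashMap<>();
        row.put("km1", 10L);
        row.put("km2", 10000L);
        // These bytes are what one element of the string input tensor would hold.
        System.out.println(toHex(encodeExample(row)));
    }
}
```

The point of the sketch is that the model receives one string tensor whose elements are whole serialized Examples, with the feature names and values packed inside, rather than one tensor per feature.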

If you want to obtain yourInputsName and yourOutputsName automatically, you can try:

SignatureDef signatureDef;
try {
    signatureDef = MetaGraphDef.parseFrom(model.metaGraphDef())
            .getSignatureDefOrThrow(SIGNATURE_DEF_KEY);
} catch (InvalidProtocolBufferException e) {
    throw new RuntimeException(e.getMessage(), e);
}
String yourInputsName = signatureDef.getInputsOrThrow(SIGNATURE_DEF_INPUT_KEY).getName();
String yourOutputsName = signatureDef.getOutputsOrThrow(SIGNATURE_DEF_OUTPUT_KEY).getName();
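
The names returned by getName() are tensor names of the form operation_name:output_index, such as the "input_example_tensor:0" mentioned in the question. As far as I know, Session.Runner's feed and fetch accept either that form or a bare operation name (which is treated as output 0). The small sketch below only illustrates how such a name splits into its two parts; TensorNameSketch and its methods are hypothetical helpers, not part of the TensorFlow API.

```java
// Hypothetical helper: splits a tensor name into operation name and output index.
class TensorNameSketch {

    // Returns the part before the last ':'; a name without ':' is already an operation name.
    static String opName(String tensorName) {
        int colon = tensorName.lastIndexOf(':');
        return colon < 0 ? tensorName : tensorName.substring(0, colon);
    }

    // Returns the index after the last ':'; a bare operation name implies output 0.
    static int outputIndex(String tensorName) {
        int colon = tensorName.lastIndexOf(':');
        return colon < 0 ? 0 : Integer.parseInt(tensorName.substring(colon + 1));
    }

    public static void main(String[] args) {
        String name = "dnn/regression_head/predictions/Identity:0";
        System.out.println(opName(name) + " / output " + outputIndex(name));
    }
}
```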

For Java, see DetectObjects.java; for Python, see wide_deep.
