Given a trained Keras model, I am trying to compute the gradient of the output with respect to the input. The example below fits the function y = x^2 with a Keras model composed of 4 layers of ReLU activations, and then computes the gradient of the model output with respect to the input.
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as k
from sklearn.model_selection import train_test_split
import numpy as np
import tensorflow as tf

# random data
x = np.random.random((1000, 1))
y = x**2

# split train/val
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.15)

# model
model = Sequential()
# 1d input
model.add(Dense(10, input_shape=(1, ), activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
# 1d output
model.add(Dense(1))

## compile and fit
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae'])
model.fit(x_train, y_train, batch_size=256, epochs=100,
          validation_data=(x_val, y_val), shuffle=True)

## compute derivative (gradient)
session = tf.Session()
session.run(tf.global_variables_initializer())
y_val_d_evaluated = session.run(tf.gradients(model.output, model.input),
                                feed_dict={model.input: x_val})
print(y_val_d_evaluated)
x_val is a vector of 150 random numbers between 0 and 1.
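(The 150 comes from the split above, i.e. test_size=0.15 of the 1000 generated samples; a quick shape check:)

print(x_val.shape)  # (150, 1): 0.15 * 1000 = 150 rows, 1 feature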
My expectation is that y_val_d_evaluated (the gradient) should be:

A. an array of 150 different numbers (because x_val contains 150 different numbers);

B. values close to 2*x_val (the derivative of x^2).
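As a sanity check of expectation B, I would compare the evaluated gradient against the analytic derivative, along these lines (reusing y_val_d_evaluated from the snippet above; tf.gradients returns a list of arrays, hence the [0]):

expected = 2 * x_val                       # analytic derivative of y = x**2
computed = y_val_d_evaluated[0]            # first (and only) gradient array, shape (150, 1)
print(np.abs(computed - expected).max())   # should be small if the model learned x**2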
Instead, every time I run this example, y_val_d_evaluated contains 150 equal values (e.g. [0.0150494], [-0.0150494], [0.0150494], [-0.0150494], ...); moreover, the value is very different from 2x, and it changes every time I run the example.

Does anyone have suggestions to help me understand why this code does not give the expected gradient results?