Problem
I need to compute the Pearson and Spearman correlations and use them as metrics in TensorFlow.
For Pearson, it's trivial:
tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)
But for Spearman, I am clueless!
What I tried:
From this answer:
samples = 1
predictions_rank = tf.nn.top_k(y_pred, k=samples, sorted=True, name='prediction_rank').indices
real_rank = tf.nn.top_k(y_true, k=samples, sorted=True, name='real_rank').indices
rank_diffs = predictions_rank - real_rank
rank_diffs_squared_sum = tf.reduce_sum(rank_diffs * rank_diffs)
six = tf.constant(6)
one = tf.constant(1.0)
numerator = tf.cast(six * rank_diffs_squared_sum, dtype=tf.float32)
divider = tf.cast(samples * samples * samples - samples, dtype=tf.float32)
spearman_batch = one - numerator / divider
But this returns NaN.
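For reference, the snippet above follows the rank-difference formula r_s = 1 - 6 * sum(d_i^2) / (n^3 - n), where n is the number of samples and d_i is the difference between the two ranks of sample i. Here is a minimal plain-Python sketch of that formula (not TensorFlow; the helper names `ranks` and `spearman_rank_difference` are made up for illustration, and ties are assumed absent). It suggests one plausible reason for the NaN: with samples = 1 the denominator n^3 - n is 0.

```python
def ranks(values):
    """Return 1-based ranks (1 = smallest value); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank_difference(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n^3 - n); only valid without ties."""
    n = len(x)
    denominator = n ** 3 - n
    if denominator == 0:
        # n = 1 (or n = 0): the formula divides by zero, so the result is undefined.
        return float("nan")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / denominator


perfect = spearman_rank_difference([1, 2, 3, 4], [10, 20, 30, 40])
inverse = spearman_rank_difference([1, 2, 3, 4], [40, 30, 20, 10])
single = spearman_rank_difference([0.3], [0.7])
```

If this reading is right, `samples` would need to be the number of elements being ranked rather than 1, and the ranks would need to be actual ranks rather than the top-k indices.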
...
Following the definition on Wikipedia:
I tried:
size = tf.size(y_pred)
indice_of_ranks_pred = tf.nn.top_k(y_pred, k=size)[1]
indice_of_ranks_label = tf.nn.top_k(y_true, k=size)[1]
rank_pred = tf.nn.top_k(-indice_of_ranks_pred, k=size)[1]
rank_label = tf.nn.top_k(-indice_of_ranks_label, k=size)[1]
rank_pred = tf.to_float(rank_pred)
rank_label = tf.to_float(rank_label)
spearman = tf.contrib.metrics.streaming_pearson_correlation(rank_pred, rank_label)
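The idea behind the code above, Spearman as the Pearson correlation of the rank variables, can be checked outside TensorFlow. This is a minimal plain-Python sketch under stated assumptions: the helper names are hypothetical, ranks come from a double argsort (the same trick as the two `top_k` calls), and ties are broken by position rather than averaged. It ranks in ascending order while `top_k` sorts in descending order; the correlation should be the same as long as both inputs use the same convention.

```python
import math


def ranks(values):
    """0-based ranks via a double argsort (0 = smallest); ties broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])  # argsort
    return sorted(range(len(order)), key=lambda i: order[i])     # argsort of the argsort


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


monotonic = spearman([0.1, 0.4, 0.2, 0.9], [1.0, 3.0, 2.0, 10.0])
partial = spearman([1, 2, 3], [1, 3, 2])
```

A reference like this can be handy for comparing against the value the TensorFlow metric reports on a small batch.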
But running this, I got the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: input must have at least k columns. Had 1, needed 32
[[{{node metrics/spearman/TopKV2}} = TopKV2[T=DT_FLOAT, sorted=true,_device="/job:localhost/replica:0/task:0/device:CPU:0"](lambda_1/add, metrics/pearson/pearson_r/variance_predictions/Size)]]
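One possible reading of this error: `tf.nn.top_k` searches along the last dimension, so if `y_pred` arrives with shape (32, 1), that dimension has only 1 entry while `k = tf.size(y_pred)` is 32. The sketch below models those shapes with nested Python lists instead of tensors; the batch of 4 and the helper names are assumptions for illustration, and the (batch, 1) shape is only guessed from the error message.

```python
def last_axis_size(batch):
    """Length of the innermost lists, i.e. the axis a last-axis top-k would search."""
    return len(batch[0])


def flatten(batch):
    """Collapse a (batch, 1) nested list into a flat list of length batch."""
    return [value for row in batch for value in row]


# Hypothetical batch of 4 predictions with shape (4, 1).
y_pred = [[0.2], [0.9], [0.1], [0.5]]

total = sum(len(row) for row in y_pred)  # plays the role of tf.size(y_pred): 4
columns = last_axis_size(y_pred)         # 1, fewer than the 4 entries requested as k
flat = flatten(y_pred)                   # shape (4,): k = 4 now fits the last axis
```

If that is the cause, flattening both tensors before ranking, for example with `tf.reshape(y_pred, [-1])`, might be enough to get past the error; I have not verified this against the model above.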