Keras custom loss function per tensor group

2024/4/15 1:43:15

I am writing a custom loss function that requires calculating ratios of predicted values per group. As a simplified example, here is what my data and model code look like:

def main():
    df = pd.DataFrame(columns=["feature_1", "feature_2", "condition_1", "condition_2", "label"],
                      data=[[5, 10, "a", "1", 0],
                            [30, 20, "a", "1", 1],
                            [50, 40, "a", "1", 0],
                            [15, 20, "a", "2", 0],
                            [25, 30, "b", "2", 1],
                            [35, 40, "b", "1", 0],
                            [10, 80, "b", "1", 1]])
    features = ["feature_1", "feature_2"]
    conds_and_label = ["condition_1", "condition_2", "label"]
    X = df[features]
    Y = df[conds_and_label]
    model = my_model(input_shape=len(features))
    model.fit(X, Y, epochs=10, batch_size=128)
    model.evaluate(X, Y)


def custom_loss(conditions, y_pred):  # this is what I need help with
    conds = ["condition_1", "condition_2"]
    conditions["label_pred"] = y_pred
    g = conditions.groupby(by=conds, as_index=False).apply(
        lambda x: x["label_pred"].sum() / len(x)).reset_index(name="pred_ratio")
    # true_ratios will be a constant, external DataFrame. Simplified example here:
    true_ratios = pd.DataFrame(columns=["condition_1", "condition_2", "true_ratio"],
                               data=[["a", "1", 0.1],
                                     ["a", "2", 0.2],
                                     ["b", "1", 0.8],
                                     ["b", "2", 0.9]])
    merged = pd.merge(g, true_ratios, on=conds)
    merged["diff"] = merged["pred_ratio"] - merged["true_ratio"]
    return K.mean(K.abs(merged["diff"]))


def joint_loss(conds_and_label, y_pred):
    y_true = conds_and_label[:, 2]
    conditions = tf.gather(conds_and_label, [0, 1], axis=1)
    loss_1 = standard_loss(y_true=y_true, y_pred=y_pred)  # not shown
    loss_2 = custom_loss(conditions=conditions, y_pred=y_pred)
    return 0.5 * loss_1 + 0.5 * loss_2


def my_model(input_shape=None):
    model = Sequential()
    model.add(Dense(units=2, activation="relu", input_shape=(input_shape,)))
    model.add(Dense(units=1, activation='sigmoid'))
    model.add(Flatten())
    model.compile(loss=joint_loss, optimizer="Adam",
                  metrics=[joint_loss, custom_loss, "accuracy"])
    return model

What I need help with is the custom_loss function. As you can see, it is currently written as if the inputs were Pandas DataFrames. However, the inputs will be Keras tensors (with the TensorFlow backend), so I am trying to figure out how to convert the current code in custom_loss to use Keras/TF backend functions. For example, I searched online and couldn't find a way to do a groupby in Keras/TF to get the ratios I need.

Some context/explanation that might be helpful to you:

  1. My main loss function is joint_loss, which consists of standard_loss (not shown) and custom_loss. But I only need help converting custom_loss.
  2. What custom_loss does is:
    1. Groupby on two condition columns (these two columns represent the groups of the data).
    2. Get the ratio of predicted 1s to the total number of batch samples in each group.
    3. Compare each "pred_ratio" to the corresponding "true_ratio" and get the difference.
    4. Calculate mean absolute error from the differences.
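
As a reference point for the four steps above, here is a minimal NumPy sketch of the target computation on the example rows. The y_pred values are made up for illustration, and the integer encoding of the conditions (a -> 1, b -> 2, "1" -> 1, "2" -> 2) is an assumption. It only shows what a tensor version would need to reproduce, not how to write it in Keras.

```python
import numpy as np

# Group keys from the example rows, mapped to integers (assumed encoding:
# condition_1 a -> 1, b -> 2; condition_2 "1" -> 1, "2" -> 2).
conditions = np.array([[1, 1], [1, 1], [1, 1], [1, 2], [2, 2], [2, 1], [2, 1]])
# Hypothetical hard 0/1 predictions, one per row (illustrative only).
y_pred = np.array([0, 1, 0, 0, 1, 0, 1], dtype=float)
# True ratio per (condition_1, condition_2) group, from the example table.
true_ratio = {(1, 1): 0.1, (1, 2): 0.2, (2, 1): 0.8, (2, 2): 0.9}

# Step 1: group rows by the pair of condition columns.
groups, group_idx = np.unique(conditions, axis=0, return_inverse=True)
group_idx = group_idx.reshape(-1)  # keep the index 1-D across NumPy versions

# Step 2: predicted 1s divided by the number of rows, per group.
sums = np.bincount(group_idx, weights=y_pred, minlength=len(groups))
counts = np.bincount(group_idx, minlength=len(groups))
pred_ratio = sums / counts

# Step 3: difference to the true ratio of the same group.
true = np.array([true_ratio[tuple(g)] for g in groups])
diff = pred_ratio - true

# Step 4: mean absolute error over the groups.
loss = np.abs(diff).mean()
print(groups.tolist(), pred_ratio, loss)
```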

I ended up figuring out a solution to this, though I would like some feedback on it (specifically some parts). Here is the solution:

import pandas as pd
import tensorflow as tf
import keras.backend as K
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout
from tensorflow.python.ops import gen_array_ops


def main():
    df = pd.DataFrame(columns=["feature_1", "feature_2", "condition_1", "condition_2", "label"],
                      data=[[5, 10, "a", "1", 0],
                            [30, 20, "a", "1", 1],
                            [50, 40, "a", "1", 0],
                            [15, 20, "a", "2", 0],
                            [25, 30, "b", "2", 1],
                            [35, 40, "b", "1", 0],
                            [10, 80, "b", "1", 1]])
    df = pd.concat([df] * 500)  # making data artificially larger
    true_ratios = pd.DataFrame(columns=["condition_1", "condition_2", "true_ratio"],
                               data=[["a", "1", 0.1],
                                     ["a", "2", 0.2],
                                     ["b", "1", 0.8],
                                     ["b", "2", 0.9]])
    features = ["feature_1", "feature_2"]
    conditions = ["condition_1", "condition_2"]
    conds_ratios_label = conditions + ["true_ratio", "label"]
    df = pd.merge(df, true_ratios, on=conditions, how="left")
    X = df[features]
    Y = df[conds_ratios_label].copy()  # copy so the in-place replace below does not act on a slice of df
    # need to convert strings to ints because tensors can't mix strings with floats/ints
    mapping_1 = {"a": 1, "b": 2}
    mapping_2 = {"1": 1, "2": 2}
    Y.replace({"condition_1": mapping_1}, inplace=True)
    Y.replace({"condition_2": mapping_2}, inplace=True)
    X = tf.convert_to_tensor(X)
    Y = tf.convert_to_tensor(Y)
    model = my_model(input_shape=len(features))
    model.fit(X, Y, epochs=1, batch_size=64)
    print()
    print(model.evaluate(X, Y))


def custom_loss(conditions, true_ratios, y_pred):
    # push the probabilities toward 0/1 in a differentiable way
    y_pred = tf.sigmoid((y_pred - 0.5) * 1000)
    # unique rows of the two condition columns, plus group index and group size
    uniques, idx, count = gen_array_ops.unique_with_counts_v2(conditions, [0])
    num_unique = tf.size(count)
    sums = tf.math.unsorted_segment_sum(data=y_pred, segment_ids=idx, num_segments=num_unique)
    lengths = tf.cast(count, tf.float32)
    pred_ratios = tf.divide(sums, lengths)
    mean_pred_ratios = tf.math.reduce_mean(pred_ratios)
    mean_true_ratios = tf.math.reduce_mean(true_ratios)
    diff = mean_pred_ratios - mean_true_ratios
    return K.mean(K.abs(diff))


def standard_loss(y_true, y_pred):
    return tf.losses.binary_crossentropy(y_true=y_true, y_pred=y_pred)


def joint_loss(conds_ratios_label, y_pred):
    y_true = conds_ratios_label[:, 3]
    true_ratios = conds_ratios_label[:, 2]
    conditions = tf.gather(conds_ratios_label, [0, 1], axis=1)
    loss_1 = standard_loss(y_true=y_true, y_pred=y_pred)
    loss_2 = custom_loss(conditions=conditions, true_ratios=true_ratios, y_pred=y_pred)
    return 0.5 * loss_1 + 0.5 * loss_2


def my_model(input_shape=None):
    model = Sequential()
    model.add(Dropout(0, input_shape=(input_shape,)))
    model.add(Dense(units=2, activation="relu"))
    model.add(Dense(units=1, activation='sigmoid'))
    model.add(Flatten())
    model.compile(loss=joint_loss, optimizer="Adam",
                  metrics=[joint_loss, "accuracy"],  # had to remove custom_loss because it takes 3 args now
                  run_eagerly=True)
    return model


if __name__ == '__main__':
    main()

The main updates are to custom_loss. I removed the creation of the true_ratios DataFrame from custom_loss and instead merged it into Y in main, so custom_loss now takes 3 arguments, one of which is the true_ratios tensor. I used gen_array_ops.unique_with_counts_v2 and unsorted_segment_sum to get the sum of predictions per group of conditions, then divided by the size of each group to get pred_ratios (the ratios per group based on y_pred). Finally, I take the mean of the predicted ratios and the mean of the true ratios, and use the absolute difference between them as my custom loss.
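
For comparison, here is a hedged sketch of a variant that stays within the public API and compares ratios group by group (steps 3 and 4 of the original description) instead of comparing the two overall means. It is not the code I ran above. The helper name group_ratio_loss and the key_base parameter are hypothetical, and the sketch assumes both condition columns hold non-negative integer codes smaller than key_base, so that the pair can be folded into one integer key for tf.unique_with_counts.

```python
import tensorflow as tf


def group_ratio_loss(conditions, true_ratios, y_pred, key_base=1000):
    """Mean absolute difference between predicted and true ratio per group.

    Hypothetical helper. Assumes integer condition codes in [0, key_base).
    """
    y_pred = tf.reshape(y_pred, [-1])  # (batch, 1) -> (batch,)
    true_ratios = tf.reshape(tf.cast(true_ratios, y_pred.dtype), [-1])
    conditions = tf.cast(conditions, tf.int32)
    # Fold the two condition columns into a single integer key per row.
    keys = conditions[:, 0] * key_base + conditions[:, 1]
    _, idx, count = tf.unique_with_counts(keys)
    num_groups = tf.size(count)
    # Sum of predictions per group, divided by the group size.
    sums = tf.math.unsorted_segment_sum(y_pred, idx, num_groups)
    pred_ratios = sums / tf.cast(count, y_pred.dtype)
    # Every row of a group carries the same true ratio, so the
    # per-group mean recovers that group's true ratio.
    group_true = tf.math.unsorted_segment_mean(true_ratios, idx, num_groups)
    return tf.reduce_mean(tf.abs(pred_ratios - group_true))


# Usage on the seven example rows, with made-up hard predictions.
conditions = tf.constant([[1, 1], [1, 1], [1, 1], [1, 2], [2, 2], [2, 1], [2, 1]],
                         dtype=tf.float32)
true_ratios = tf.constant([0.1, 0.1, 0.1, 0.2, 0.9, 0.8, 0.8])
y_pred = tf.constant([[0.0], [1.0], [0.0], [0.0], [1.0], [0.0], [1.0]])
loss = group_ratio_loss(conditions, true_ratios, y_pred)
print(float(loss))
```

The sketch flattens y_pred to one dimension before the segment sum so that the sums and the group sizes have the same shape when they are divided.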

Some things of note:

  1. Because the last layer of my model is a sigmoid, my y_pred values are probabilities between 0 and 1. So I needed to convert them to 0s and 1s in order to calculate the ratios I need in my custom loss. At first I tried using tf.round, but I realized that it is not differentiable. So instead I replaced it with y_pred = tf.sigmoid((y_pred - 0.5) * 1000) inside of custom_loss. This essentially pushes all the y_pred values to 0 or 1, but in a differentiable way. It seems like a bit of a "hack" though, so please let me know if you have any feedback on this.
  2. I noticed that my model only works if I use run_eagerly=True in model.compile(). Otherwise I get this error: "ValueError: Dimensions must be equal, but are 1 and 2 for ...". I'm not sure why this is the case, but the error originates from the line where I use tf.unsorted_segment_sum.
  3. unique_with_counts_v2 is not exposed in the public TensorFlow API yet, but it exists in the source code. I needed it to be able to group by multiple conditions (not just a single one).
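
On the first point, a small numeric check of how large the gradient of the sharpened sigmoid is may help when judging the "hack". The sketch below uses plain NumPy in float64 rather than TensorFlow, and the comparison factor of 10 is only an illustrative assumption, not a recommendation.

```python
import numpy as np


def sharp_sigmoid_grad(p, k):
    """Derivative of sigmoid((p - 0.5) * k) with respect to p."""
    s = 1.0 / (1.0 + np.exp(-(p - 0.5) * k))
    return k * s * (1.0 - s)


# A few example probabilities, some far from 0.5 and one very close to it.
p = np.array([0.3, 0.45, 0.499, 0.9])
grad_k1000 = sharp_sigmoid_grad(p, k=1000.0)  # factor used in custom_loss
grad_k10 = sharp_sigmoid_grad(p, k=10.0)      # hypothetical softer factor

print(grad_k1000)
print(grad_k10)
```

With the factor of 1000, the derivative is only noticeable for predictions within a few thousandths of 0.5 and is vanishingly small elsewhere, so most samples would contribute almost nothing to the gradient of the custom loss. A smaller factor gives softer 0/1 values but keeps a usable gradient over a wider range.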

Feel free to comment if you have any feedback on this, in general, or in response to the bullets above.
