Where is `_softmax_cross_entropy_with_logits` defined in tensorflow?

2024/9/20 17:31:05

I am trying to see how softmax_cross_entropy_with_logits_v2() is implemented. It calls _softmax_cross_entropy_with_logits(). But I don't see where the latter is defined. Does anybody know how to locate its definition?

$ ack '\b_softmax_cross_entropy_with_logits\b'
176:          gen_nn_ops._softmax_cross_entropy_with_logits,

tensorflow/python/kernel_tests/xent_op_test.py
52:      loss, backprop = gen_nn_ops._softmax_cross_entropy_with_logits(
75:        loss, backprop = gen_nn_ops._softmax_cross_entropy_with_logits(
93:                              gen_nn_ops._softmax_cross_entropy_with_logits,
135:        gen_nn_ops._softmax_cross_entropy_with_logits(
141:        gen_nn_ops._softmax_cross_entropy_with_logits([0., 1., 2., 3.],

tensorflow/python/ops/nn_ops.py
1803:    cost, unused_backprop = gen_nn_ops._softmax_cross_entropy_with_logits(

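The answer by kmario23 is correct: when you see a reference to a gen_* module, it means automatically generated Python code. The file is produced at build time and is not checked into the source tree, which is why ack finds only call sites and no definition.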
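One way to find where such a generated function lives in an installed package is to ask Python itself. The sketch below is only an illustration: find_definition is a hypothetical helper, not part of TensorFlow, and it is demonstrated on a standard-library function so that it runs without TensorFlow installed.

```python
import importlib
import inspect


def find_definition(module_name, attr_name):
    """Return (source_file, first_line) for module_name.attr_name."""
    module = importlib.import_module(module_name)
    obj = getattr(module, attr_name)
    source_file = inspect.getsourcefile(obj)
    _, first_line = inspect.getsourcelines(obj)
    return source_file, first_line


# Demonstrated on a standard-library function so it runs anywhere.
path, line = find_definition("json", "dumps")
print(path, line)

# Assumption: with TensorFlow installed, the same helper could be pointed at
# the generated module. The private name below is the one from the question
# and may differ between TensorFlow versions.
# find_definition("tensorflow.python.ops.gen_nn_ops",
#                 "_softmax_cross_entropy_with_logits")
```

The printed path points into the installed package rather than a source checkout, which is where a generated file would be expected to show up.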

In this case, it's gen_nn_ops.py:

def _softmax_cross_entropy_with_logits(features, labels, name=None):
  r"""Computes softmax cross entropy cost and gradients to backpropagate.

  Inputs are the logits, not probabilities.

  Args:
    features: A `Tensor`. Must be one of the following types: `half`, `float32`, `float64`.
      batch_size x num_classes matrix
    labels: A `Tensor`. Must have the same type as `features`.
      batch_size x num_classes matrix
      The caller must ensure that each batch of labels represents a valid
      probability distribution.
    name: A name for the operation (optional).

  Returns:
    A tuple of `Tensor` objects (loss, backprop).

    loss: A `Tensor`. Has the same type as `features`. Per example loss (batch_size vector).
    backprop: A `Tensor`. Has the same type as `features`. backpropagated gradients (batch_size x num_classes matrix).
  """
  _ctx = _context.context()
  if _ctx.in_graph_mode():
    _, _, _op = _op_def_lib._apply_op_helper(
        "SoftmaxCrossEntropyWithLogits", features=features, labels=labels,
        name=name)
    _result = _op.outputs[:]
    _inputs_flat = _op.inputs
    _attrs = ("T", _op.get_attr("T"))
  else:
    _attr_T, _inputs_T = _execute.args_to_matching_eager([features, labels], _ctx)
    (features, labels) = _inputs_T
    _attr_T = _attr_T.as_datatype_enum
    _inputs_flat = [features, labels]
    _attrs = ("T", _attr_T)
    _result = _execute.execute(b"SoftmaxCrossEntropyWithLogits", 2,
                               inputs=_inputs_flat, attrs=_attrs, ctx=_ctx,
                               name=name)
  _execute.record_gradient(
      "SoftmaxCrossEntropyWithLogits", _inputs_flat, _attrs, _result, name)
  _result = _SoftmaxCrossEntropyWithLogitsOutput._make(_result)
  return _result

But since this function is only a wrapper around a native C++ implementation, you might want to see the actual C++ code. It's in tensorflow/core/kernels/xent_op.cc, for both CPU and GPU:

template <typename Device, typename T>
class SoftmaxXentWithLogitsOp : public OpKernel {
 public:
  explicit SoftmaxXentWithLogitsOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    const Tensor& logits_in = context->input(0);
    const Tensor& labels_in = context->input(1);
    OP_REQUIRES(context, logits_in.IsSameSize(labels_in),
                errors::InvalidArgument(
                    "logits and labels must be same size: logits_size=",
                    logits_in.shape().DebugString(), " labels_size=",
                    labels_in.shape().DebugString()));
    OP_REQUIRES(context, TensorShapeUtils::IsMatrix(logits_in.shape()),
                errors::InvalidArgument("logits must be 2-dimensional"));
    // As we already tested that both inputs have the same shape no need to
    // check that "labels" is a matrix too.

    // loss is 1-D (one per example), and size is batch_size.

    Tensor scratch;
    OP_REQUIRES_OK(
        context, context->allocate_temp(DataTypeToEnum<T>::value,
                                        TensorShape({logits_in.dim_size(0), 1}),
                                        &scratch));

    Tensor* loss_out = nullptr;
    OP_REQUIRES_OK(context,
                   context->allocate_output(
                       0, TensorShape({logits_in.dim_size(0)}), &loss_out));
    Tensor* back_out = nullptr;
    // Try to reuse the logits_in buffer for the backprop output.
    OP_REQUIRES_OK(context, context->forward_input_or_allocate_output(
                                {0}, 1, logits_in.shape(), &back_out));
    functor::XentFunctor<Device, T> functor;
    functor(context->eigen_device<Device>(), logits_in.matrix<T>(),
            labels_in.matrix<T>(), scratch.matrix<T>(), loss_out->vec<T>(),
            back_out->matrix<T>());
  }
};

If you want to dive deeper, the main call is the last statement of Compute: functor(...), where functor is an XentFunctor<Device, T>. The actual logic is dispatched to the third-party Eigen library. See this very similar question, which shows how deep it all goes in the end.
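As a rough picture of what the op returns, here is a minimal pure-Python sketch of the two outputs described in the docstring: a per-example loss vector and a batch_size x num_classes gradient matrix. It is not the Eigen code. softmax_xent_with_logits is a hypothetical name, and the sketch assumes the standard formulas: loss = -sum(labels * log_softmax(logits)) and backprop = softmax(logits) - labels.

```python
import math


def softmax_xent_with_logits(logits, labels):
    """Return (loss, backprop) for batch_size x num_classes nested lists."""
    loss, backprop = [], []
    for row_logits, row_labels in zip(logits, labels):
        # Subtract the row maximum before exponentiating for numerical stability.
        row_max = max(row_logits)
        shifted = [x - row_max for x in row_logits]
        log_sum_exp = math.log(sum(math.exp(x) for x in shifted))
        log_probs = [x - log_sum_exp for x in shifted]
        # Per-example loss: -sum(labels * log_softmax(logits)).
        loss.append(-sum(l * lp for l, lp in zip(row_labels, log_probs)))
        # Gradient with respect to the logits: softmax(logits) - labels.
        backprop.append([math.exp(lp) - l for l, lp in zip(row_labels, log_probs)])
    return loss, backprop


loss, backprop = softmax_xent_with_logits(
    logits=[[0.0, 0.0], [1.0, 2.0]],
    labels=[[1.0, 0.0], [0.0, 1.0]],
)
print(loss)
print(backprop)
```

Because each row of labels is a valid probability distribution, every row of backprop sums to zero, which makes a quick sanity check when comparing against the real op.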


