Here is the link to the official docs: https://www.tensorflow.org/versions/r1.3/api_docs/python/tf/colocate_with
It's a context manager that makes sure the operation or tensor you're about to create is placed on the same device as the reference operation. Consider this piece of code (tested):
import tensorflow as tf

with tf.device("/cpu:0"):
    a = tf.constant(0.0, name="a")

with tf.device("/gpu:0"):
    b = tf.constant(0.0, name="b")
    with tf.colocate_with(a):
        c = tf.constant(0.0, name="c")
    d = tf.constant(0.0, name="d")

for operation in tf.get_default_graph().get_operations():
    print(operation.name, operation.device)
Outputs:
(u'a', u'/device:CPU:0')
(u'b', u'/device:GPU:0')
(u'c', u'/device:CPU:0')
(u'd', u'/device:GPU:0')
So c is placed on the same device as a, even though the active device context is the GPU when c is created. This can be very important for multi-GPU training: if you're not careful, you can end up with a graph whose mutually dependent tensors are scattered across eight devices, which is a complete disaster efficiency-wise. tf.colocate_with() makes sure this doesn't happen.
It is not explained in the docs because it's meant to be used by internal libraries only, so there is no guarantee it will stay. (Very likely it will, though.) If you want to know more, you can look it up in the source code as of May 2018; it might move around as the code changes.
You're not likely to need it unless you're working on some low-level stuff. Most people use only one GPU, and even if you use several, you generally build your graph one GPU at a time, that is, within one tf.device() context manager at a time, as in the sketch below.
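A minimal sketch of that one-GPU-at-a-time pattern (build_tower() is a hypothetical stand-in for your model-building code, not anything from the TensorFlow API):

import tensorflow as tf

def build_tower(inputs):
    # Hypothetical model-building code; returns a loss tensor.
    return tf.reduce_mean(tf.square(inputs))

tower_losses = []
for gpu_id in range(2):
    # Everything created inside this context lands on the given GPU,
    # so no explicit colocation is needed.
    with tf.device("/gpu:%d" % gpu_id):
        inputs = tf.random_normal([32, 10], name="inputs_%d" % gpu_id)
        tower_losses.append(build_tower(inputs))

total_loss = tf.add_n(tower_losses, name="total_loss")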
One example of where it's used is the tf.train.ExponentialMovingAverage class. It clearly seems like a good idea to colocate the decay and moving-average variables with the value tensors they are tracking.
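For illustration, a rough usage sketch of that class (not its internals): the shadow variable it creates for w ends up on the same device as w; allow_soft_placement is set only so the snippet also runs on a machine without a GPU.

import tensorflow as tf

with tf.device("/gpu:0"):
    w = tf.Variable(1.0, name="w")

ema = tf.train.ExponentialMovingAverage(decay=0.99)
# apply() creates a shadow variable for w (colocated with w) and returns
# an op that updates the moving average.
maintain_averages_op = ema.apply([w])

config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(maintain_averages_op)
    print(sess.run(ema.average(w)))  # current value of the shadow variable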