View on TensorFlow.org | Run in Google Colab | View source on GitHub | Download notebook |
TensorFlow provides a set of pseudo-random number generators (RNG), in the tf.random
module. This document describes how you can control the random number generators, and how these generators interact with other tensorflow sub-systems.
TensorFlow provides two approaches for controlling the random number generation process:
Through the explicit use of
tf.random.Generator
objects. Each such object maintains a state (intf.Variable
) that will be changed after each number generation.Through the purely-functional stateless random functions like
tf.random.stateless_uniform
. Calling these functions with the same arguments (which include the seed) and on the same device will always produce the same results.
Setup
import tensorflow as tf
# Creates some virtual devices (cpu:0, cpu:1, etc.) for using distribution strategy
physical_devices = tf.config.list_physical_devices("CPU")
tf.config.experimental.set_virtual_device_configuration(
physical_devices[0], [
tf.config.experimental.VirtualDeviceConfiguration(),
tf.config.experimental.VirtualDeviceConfiguration(),
tf.config.experimental.VirtualDeviceConfiguration()
])
2024-07-19 02:25:29.589538: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered 2024-07-19 02:25:29.610729: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered 2024-07-19 02:25:29.617242: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered WARNING: All log messages before absl::InitializeLog() is called are written to STDERR I0000 00:00:1721355932.210930 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.214715 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.218489 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.221706 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.233102 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.236656 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.240113 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.243027 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.245971 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.249466 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.253004 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355932.255948 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
The tf.random.Generator
class
The tf.random.Generator
class is used in cases where you want each RNG call to produce different results. It maintains an internal state (managed by a tf.Variable
object) which will be updated every time random numbers are generated. Because the state is managed by tf.Variable
, it enjoys all facilities provided by tf.Variable
such as easy checkpointing, automatic control-dependency and thread safety.
You can get a tf.random.Generator
by manually creating an object of the class or call tf.random.get_global_generator()
to get the default global generator:
g1 = tf.random.Generator.from_seed(1)
print(g1.normal(shape=[2, 3]))
g2 = tf.random.get_global_generator()
print(g2.normal(shape=[2, 3]))
I0000 00:00:1721355933.512912 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.515568 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.517691 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.519747 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.521854 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.523773 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.525762 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.527750 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.529753 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.531688 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.533796 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.535779 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.574518 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.576504 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.578553 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.580736 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.582863 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.584761 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.586778 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.588766 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.590782 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.593080 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.595552 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 I0000 00:00:1721355933.597919 95393 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355 tf.Tensor( [[ 0.43842277 -0.53439844 -0.07710262] [ 1.5658045 -0.1012345 -0.2744976 ]], shape=(2, 3), dtype=float32) tf.Tensor( [[-0.619722 0.17175268 -1.7519977 ] [-0.37428895 -0.3936975 0.94025034]], shape=(2, 3), dtype=float32)
There are multiple ways to create a generator object. The easiest is Generator.from_seed
, as shown above, that creates a generator from a seed. A seed is any non-negative integer. from_seed
also takes an optional argument alg
which is the RNG algorithm that will be used by this generator:
g1 = tf.random.Generator.from_seed(1, alg='philox')
print(g1.normal(shape=[2, 3]))
tf.Tensor( [[ 0.43842277 -0.53439844 -0.07710262] [ 1.5658045 -0.1012345 -0.2744976 ]], shape=(2, 3), dtype=float32)
See the Algorithms section below for more information about it.
Another way to create a generator is with Generator.from_non_deterministic_state
. A generator created this way will start from a non-deterministic state, depending on e.g., time and OS.
g = tf.random.Generator.from_non_deterministic_state()
print(g.normal(shape=[2, 3]))
tf.Tensor( [[ 1.3122826 -1.5829424 -0.60620594] [-0.40802944 -0.41581756 -0.63317794]], shape=(2, 3), dtype=float32)
There are yet other ways to create generators, such as from explicit states, which are not covered by this guide.
When using tf.random.get_global_generator
to get the global generator, you need to be careful about device placement. The global generator is created (from a non-deterministic state) at the first time tf.random.get_global_generator
is called, and placed on the default device at that call. So, for example, if the first site you call tf.random.get_global_generator
is within a tf.device("gpu")
scope, the global generator will be placed on the GPU, and using the global generator later on from the CPU will incur a GPU-to-CPU copy.
There is also a function tf.random.set_global_generator
for replacing the global generator with another generator object. This function should be used with caution though, because the old global generator may have been captured by a tf.function
(as a weak reference), and replacing it will cause it to be garbage collected, breaking the tf.function
. A better way to reset the global generator is to use one of the "reset" functions such as Generator.reset_from_seed
, which won't create new generator objects.
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
print(g.normal([]))
g.reset_from_seed(1)
print(g.normal([]))
tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(0.43842277, shape=(), dtype=float32)
Creating independent random-number streams
In many applications one needs multiple independent random-number streams, independent in the sense that they won't overlap and won't have any statistically detectable correlations. This is achieved by using Generator.split
to create multiple generators that are guaranteed to be independent of each other (i.e. generating independent streams).
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
new_gs = g.split(3)
for new_g in new_gs:
print(new_g.normal([]))
print(g.normal([]))
tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(2.536413, shape=(), dtype=float32) tf.Tensor(0.33186463, shape=(), dtype=float32) tf.Tensor(-0.07144657, shape=(), dtype=float32) tf.Tensor(-0.79253083, shape=(), dtype=float32)
split
will change the state of the generator on which it is called (g
in the above example), similar to an RNG method such as normal
. In addition to being independent of each other, the new generators (new_gs
) are also guaranteed to be independent of the old one (g
).
Spawning new generators is also useful when you want to make sure the generator you use is on the same device as other computations, to avoid the overhead of cross-device copy. For example:
with tf.device("cpu"): # change "cpu" to the device you want
g = tf.random.get_global_generator().split(1)[0]
print(g.normal([])) # use of g won't cause cross-device copy, unlike the global generator
tf.Tensor(0.33917606, shape=(), dtype=float32)
You can do splitting recursively, calling split
on split generators. There are no limits (barring integer overflow) on the depth of recursions.
Interaction with tf.function
tf.random.Generator
obeys the same rules as tf.Variable
when used with tf.function
. This includes three aspects.
Creating generators outside tf.function
tf.function
can use a generator created outside of it.
g = tf.random.Generator.from_seed(1)
@tf.function
def foo():
return g.normal([])
print(foo())
tf.Tensor(0.43842277, shape=(), dtype=float32)
The user needs to make sure that the generator object is still alive (not garbage-collected) when the function is called.
Creating generators inside tf.function
Creation of generators inside a tf.function
can only happened during the first run of the function.
g = None
@tf.function
def foo():
global g
if g is None:
g = tf.random.Generator.from_seed(1)
return g.normal([])
print(foo())
print(foo())
tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32)
Passing generators as arguments to tf.function
When used as an argument to a tf.function
, different generator objects will cause retracing of the tf.function
.
num_traces = 0
@tf.function
def foo(g):
global num_traces
num_traces += 1
return g.normal([])
foo(tf.random.Generator.from_seed(1))
foo(tf.random.Generator.from_seed(2))
print(num_traces)
2
Note that this retracing behavior is consistent with tf.Variable
:
num_traces = 0
@tf.function
def foo(v):
global num_traces
num_traces += 1
return v.read_value()
foo(tf.Variable(1))
foo(tf.Variable(2))
print(num_traces)
1
Interaction with distribution strategies
There are two ways in which Generator
interacts with distribution strategies.
Creating generators outside distribution strategies
If a generator is created outside strategy scopes, all replicas’ access to the generator will be serialized, and hence the replicas will get different random numbers.
g = tf.random.Generator.from_seed(1)
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
def f():
print(g.normal([]))
results = strat.run(f)
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. tf.Tensor(0.43842274, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32)
Note that this usage may have performance issues because the generator's device is different from the replicas.
Creating generators inside distribution strategies
If a generator is created inside a strategy scope, each replica will get a different and independent stream of random numbers.
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
g = tf.random.Generator.from_seed(1)
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) } WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) }
If the generator is seeded (e.g. created by Generator.from_seed
), the random numbers are determined by the seed, even though different replicas get different and uncorrelated numbers. One can think of a random number generated on a replica as a hash of the replica ID and a "primary" random number that is common to all replicas. Hence, the whole system is still deterministic.
tf.random.Generator
can also be created inside Strategy.run
:
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
def f():
g = tf.random.Generator.from_seed(1)
a = g.normal([])
b = g.normal([])
return tf.stack([a, b])
print(strat.run(f))
print(strat.run(f))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor([-0.87930447 -1.5822568 ], shape=(2,), dtype=float32), 1: tf.Tensor([0.02066157 0.77539235], shape=(2,), dtype=float32) } WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor([-0.87930447 -1.5822568 ], shape=(2,), dtype=float32), 1: tf.Tensor([0.02066157 0.77539235], shape=(2,), dtype=float32) }
We no longer recommend passing tf.random.Generator
as arguments to Strategy.run
, because Strategy.run
generally expects the arguments to be tensors, not generators.
Saving generators
Generally for saving or serializing you can handle a tf.random.Generator
the same way you would handle a tf.Variable
or a tf.Module
(or its subclasses). In TF there are two mechanisms for serialization: Checkpoint and SavedModel.
Checkpoint
Generators can be freely saved and restored using tf.train.Checkpoint
. The random-number stream from the restoring point will be the same as that from the saving point.
filename = "./checkpoint"
g = tf.random.Generator.from_seed(1)
cp = tf.train.Checkpoint(generator=g)
print(g.normal([]))
tf.Tensor(0.43842277, shape=(), dtype=float32)
cp.write(filename)
print("RNG stream from saving point:")
print(g.normal([]))
print(g.normal([]))
RNG stream from saving point: tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(1.6307176, shape=(), dtype=float32)
cp.restore(filename)
print("RNG stream from restoring point:")
print(g.normal([]))
print(g.normal([]))
RNG stream from restoring point: tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(1.6307176, shape=(), dtype=float32)
You can also save and restore within a distribution strategy:
filename = "./checkpoint"
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
g = tf.random.Generator.from_seed(1)
cp = tf.train.Checkpoint(my_generator=g)
print(strat.run(lambda: g.normal([])))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) }
with strat.scope():
cp.write(filename)
print("RNG stream from saving point:")
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }
with strat.scope():
cp.restore(filename)
print("RNG stream from restoring point:")
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
RNG stream from restoring point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }
You should make sure that the replicas don't diverge in their RNG call history (e.g. one replica makes one RNG call while another makes two RNG calls) before saving. Otherwise, their internal RNG states will diverge and tf.train.Checkpoint
(which only saves the first replica's state) won't properly restore all the replicas.
You can also restore a saved checkpoint to a different distribution strategy with a different number of replicas. Because a tf.random.Generator
object created in a strategy can only be used in the same strategy, to restore to a different strategy, you have to create a new tf.random.Generator
in the target strategy and a new tf.train.Checkpoint
for it, as shown in this example:
filename = "./checkpoint"
strat1 = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat1.scope():
g1 = tf.random.Generator.from_seed(1)
cp1 = tf.train.Checkpoint(my_generator=g1)
print(strat1.run(lambda: g1.normal([])))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) }
with strat1.scope():
cp1.write(filename)
print("RNG stream from saving point:")
print(strat1.run(lambda: g1.normal([])))
print(strat1.run(lambda: g1.normal([])))
RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }
strat2 = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1", "cpu:2"])
with strat2.scope():
g2 = tf.random.Generator.from_seed(1)
cp2 = tf.train.Checkpoint(my_generator=g2)
cp2.restore(filename)
print("RNG stream from restoring point:")
print(strat2.run(lambda: g2.normal([])))
print(strat2.run(lambda: g2.normal([])))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1', '/job:localhost/replica:0/task:0/device:CPU:2') INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1', '/job:localhost/replica:0/task:0/device:CPU:2') RNG stream from restoring point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32), 2: tf.Tensor(0.6851049, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32), 2: tf.Tensor(-0.58519536, shape=(), dtype=float32) }
Although g1
and cp1
are different objects from g2
and cp2
, they are linked via the common checkpoint file filename
and object name my_generator
. Overlapping replicas between strategies (e.g. cpu:0
and cpu:1
above) will have their RNG streams properly restored like in previous examples. This guarantee doesn't cover the case when a generator is saved in a strategy scope and restored outside of any strategy scope or vice versa, because a device outside strategies is treated as different from any replica in a strategy.
SavedModel
tf.random.Generator
can be saved to a SavedModel. The generator can be created within a strategy scope. The saving can also happen within a strategy scope.
filename = "./saved_model"
class MyModule(tf.Module):
def __init__(self):
super(MyModule, self).__init__()
self.g = tf.random.Generator.from_seed(0)
@tf.function
def __call__(self):
return self.g.normal([])
@tf.function
def state(self):
return self.g.state
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
m = MyModule()
print(strat.run(m))
print("state:", m.state())
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-1.4154755, shape=(), dtype=float32), 1: tf.Tensor(-0.11388441, shape=(), dtype=float32) } state: tf.Tensor([256 0 0], shape=(3,), dtype=int64)
with strat.scope():
tf.saved_model.save(m, filename)
print("RNG stream from saving point:")
print(strat.run(m))
print("state:", m.state())
print(strat.run(m))
print("state:", m.state())
INFO:tensorflow:Assets written to: ./saved_model/assets INFO:tensorflow:Assets written to: ./saved_model/assets RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-0.68758255, shape=(), dtype=float32), 1: tf.Tensor(0.8084062, shape=(), dtype=float32) } state: tf.Tensor([512 0 0], shape=(3,), dtype=int64) PerReplica:{ 0: tf.Tensor(-0.27342677, shape=(), dtype=float32), 1: tf.Tensor(-0.53093255, shape=(), dtype=float32) } state: tf.Tensor([768 0 0], shape=(3,), dtype=int64)
imported = tf.saved_model.load(filename)
print("RNG stream from loading point:")
print("state:", imported.state())
print(imported())
print("state:", imported.state())
print(imported())
print("state:", imported.state())
RNG stream from loading point: state: tf.Tensor([256 0 0], shape=(3,), dtype=int64) tf.Tensor(-1.0359411, shape=(), dtype=float32) state: tf.Tensor([512 0 0], shape=(3,), dtype=int64) tf.Tensor(-0.06425078, shape=(), dtype=float32) state: tf.Tensor([768 0 0], shape=(3,), dtype=int64)
Loading a SavedModel containing tf.random.Generator
into a distribution strategy is not recommended because the replicas will all generate the same random-number stream (which is because replica ID is frozen in SavedModel's graph).
Loading a distributed tf.random.Generator
(a generator created within a distribution strategy) into a non-strategy environment, like the above example, also has a caveat. The RNG state will be properly restored, but the random numbers generated will be different from the original generator in its strategy (again because a device outside strategies is treated as different from any replica in a strategy).
Stateless RNGs
Usage of stateless RNGs is simple. Since they are just pure functions, there is no state or side effect involved.
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
tf.Tensor( [[ 0.5441101 0.20738031 0.07356433] [ 0.04643455 -1.30159 -0.95385665]], shape=(2, 3), dtype=float32) tf.Tensor( [[ 0.5441101 0.20738031 0.07356433] [ 0.04643455 -1.30159 -0.95385665]], shape=(2, 3), dtype=float32)
Every stateless RNG requires a seed
argument, which needs to be an integer Tensor of shape [2]
. The results of the op are fully determined by this seed.
The RNG algorithm used by stateless RNGs is device-dependent, meaning the same op running on a different device may produce different outputs.
Algorithms
General
Both the tf.random.Generator
class and the stateless
functions support the Philox algorithm (written as "philox"
or tf.random.Algorithm.PHILOX
) on all devices.
Different devices will generate the same integer numbers, if using the same algorithm and starting from the same state. They will also generate "almost the same" float-point numbers, though there may be small numerical discrepancies caused by the different ways the devices carry out the float-point computation (e.g. reduction order).
XLA devices
On XLA-driven devices (such as TPU, and also CPU/GPU when XLA is enabled) the ThreeFry algorithm (written as "threefry"
or tf.random.Algorithm.THREEFRY
) is also supported. This algorithm is fast on TPU but slow on CPU/GPU compared to Philox.
See paper 'Parallel Random Numbers: As Easy as 1, 2, 3' for more details about these algorithms.