In TensorFlow, what is tf.identity used for?
Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same license, link to the original, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/34877523/
Asked by rd11
I've seen tf.identity used in a few places, such as the official CIFAR-10 tutorial and the batch-normalization implementation on StackOverflow, but I don't see why it's necessary.
What's it used for? Can anyone give a use case or two?
One proposed answer is that it can be used for transfer between the CPU and GPU. This is not clear to me. An extension to the question, based on this: loss = tower_loss(scope) is under the GPU block, which suggests to me that all operators defined in tower_loss are mapped to the GPU. Then, at the end of tower_loss, we see total_loss = tf.identity(total_loss) before it's returned. Why? What would be the flaw in not using tf.identity here?
Accepted answer by rd11
After some stumbling around, I think I've noticed a single use case that fits all the examples I've seen. If there are other use cases, please elaborate with an example.
Use case:
Suppose you'd like to run an operator every time a particular variable is evaluated. For example, say you'd like to add one to x every time the variable y is evaluated. It might seem like this will work:
x = tf.Variable(0.0)
x_plus_1 = tf.assign_add(x, 1)
with tf.control_dependencies([x_plus_1]):
    y = x
init = tf.initialize_all_variables()
with tf.Session() as session:
    init.run()
    for i in xrange(5):
        print(y.eval())
It doesn't: it'll print 0, 0, 0, 0, 0. Instead, it seems that we need to add a new node to the graph within the control_dependencies block. So we use this trick:
x = tf.Variable(0.0)
x_plus_1 = tf.assign_add(x, 1)
with tf.control_dependencies([x_plus_1]):
    y = tf.identity(x)
init = tf.initialize_all_variables()
with tf.Session() as session:
    init.run()
    for i in xrange(5):
        print(y.eval())
This works: it prints 1, 2, 3, 4, 5.
If in the CIFAR-10 tutorial we dropped tf.identity, then loss_averages_op would never run.
Answered by Rafał Józefowicz
tf.identity is useful when you want to explicitly transport a tensor between devices (like, from a GPU to a CPU). The op adds send/recv nodes to the graph, which make a copy when the devices of the input and the output are different.
The default behavior is that the send/recv nodes are added implicitly when the operation happens on a different device, but you can imagine situations (especially in multi-threaded/distributed settings) where it might be useful to fetch the value of the variable multiple times within a single execution of session.run. tf.identity allows more control over when the value should be read from the source device. Possibly a more appropriate name for this op would be read.
Also, please note that in the implementation of tf.Variable, the identity op is added in the constructor, which makes sure that all accesses to the variable copy the data from the source only once. Multiple copies can be expensive in cases when the variable lives on a GPU but is read by multiple CPU ops (or the other way around). Users can change this behavior with multiple calls to tf.identity when desired.
EDIT: Updated answer after the question was edited.
In addition, tf.identity can be used as a dummy node to update a reference to a tensor. This is useful with various control-flow ops. In the CIFAR case we want to enforce that the ExponentialMovingAverageOp will update the relevant variables before retrieving the value of the loss. This can be implemented as:
with tf.control_dependencies([loss_averages_op]):
    total_loss = tf.identity(total_loss)
Here, tf.identity doesn't do anything useful aside from marking the total_loss tensor to be run after evaluating loss_averages_op.
Answered by Arthelais
I came across another use case that is not completely covered by the other answers.
def conv_layer(input_tensor, kernel_shape, output_dim, layer_name, decay=None, act=tf.nn.relu):
    """Reusable code for making a simple convolutional layer.
    """
    # Adding a name scope ensures logical grouping of the layers in the graph.
    with tf.name_scope(layer_name):
        # This Variable will hold the state of the weights for the layer
        with tf.name_scope('weights'):
            weights = weight_variable(kernel_shape, decay)
            variable_summaries(weights, layer_name + '/weights')
        with tf.name_scope('biases'):
            biases = bias_variable([output_dim])
            variable_summaries(biases, layer_name + '/biases')
        with tf.name_scope('convolution'):
            preactivate = tf.nn.conv2d(input_tensor, weights, strides=[1, 1, 1, 1], padding='SAME')
            biased = tf.nn.bias_add(preactivate, biases)
            tf.histogram_summary(layer_name + '/pre_activations', biased)
        activations = act(biased, 'activation')
        tf.histogram_summary(layer_name + '/activations', activations)
        return activations
Most of the time when constructing a convolutional layer, you just want the activations returned so you can feed those into the next layer. Sometimes, however - for example when building an auto-encoder - you want the pre-activation values.
In this situation an elegant solution is to pass tf.identity as the activation function, effectively not activating the layer.
Answered by grihabor
I found another application of tf.identity in TensorBoard. If you use tf.train.shuffle_batch, it returns multiple tensors at once, so you get a messy picture when visualizing the graph: you can't separate the tensor-creation pipeline from the actual input tensors (see the "messy" graph screenshot in the original answer).
But with tf.identity you can create duplicate nodes, which don't affect the computation flow (see the "nice" graph screenshot in the original answer).
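A minimal sketch of the idea (the input pipeline below is made up for illustration; in TF 1.x the batching op is tf.train.shuffle_batch):
import tensorflow as tf

# stand-ins for a single decoded example coming out of a reader pipeline
image = tf.random_uniform([28, 28])
label = tf.constant(1, dtype=tf.int32)

image_batch, label_batch = tf.train.shuffle_batch(
    [image, label], batch_size=32, capacity=1000, min_after_dequeue=500)

# tf.identity only renames the outputs; the computation is unchanged, but TensorBoard
# now shows two clearly named nodes feeding the model instead of the raw batching plumbing
image_batch = tf.identity(image_batch, name="input_images")
label_batch = tf.identity(label_batch, name="input_labels")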
Answered by ahmedhosny
In addition to the above, I simply use it when I need to assign a name to ops that do not have a name argument, for example when initializing a state in RNNs:
rnn_cell = tf.contrib.rnn.MultiRNNCell([cells])
# no name arg
initial_state = rnn_cell.zero_state(batch_size, tf.float32)
# give it a name with tf.identity()
initial_state = tf.identity(input=initial_state, name="initial_state")
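A self-contained toy example of the same idea (all names here are made up): wrapping a result in tf.identity pins a stable name that can later be looked up in the graph:
import tensorflow as tf

a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
# `a + b` is created without an explicit name; wrap it to assign one
total = tf.identity(a + b, name="total")

with tf.Session() as sess:
    fetched = tf.get_default_graph().get_tensor_by_name("total:0")
    print(sess.run(fetched))  # [4. 6.]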
Answered by Ju Xuan
In distributed training, we should use tf.identity, or the workers will hang waiting for initialization of the chief worker:
vec = tf.identity(tf.nn.embedding_lookup(embedding_tbl, id)) * mask
with tf.variable_scope("BiRNN", reuse=None):
    out, _ = tf.nn.bidirectional_dynamic_rnn(fw, bw, vec, sequence_length=id_sz, dtype=tf.float32)
In more detail: without the identity op, the chief worker would inappropriately treat some variables as local variables, and the other workers would wait for an initialization operation that can never finish.
Answered by Shyam Swaroop
When our input data is serialized in bytes and we want to extract features from it, we can do so in a key-value format and then get a placeholder for it. The benefits are more apparent when there are multiple features and each feature has to be read in a different format.
# read the entire serialized example into this placeholder
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
# describe the pattern in which data is to be extracted from the input files
feature_configs = {'image': tf.FixedLenFeature(shape=[256], dtype=tf.float32),
                   'text': tf.FixedLenFeature(shape=[128], dtype=tf.string),
                   'label': tf.FixedLenFeature(shape=[128], dtype=tf.string)}
# parse the example into a {key: tensor} dictionary
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
# create a separate, named op/tensor for each feature
image = tf.identity(tf_example['image'], name='image')
text = tf.identity(tf_example['text'], name='text')
label = tf.identity(tf_example['label'], name='label')
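A hypothetical usage sketch, where serialized_examples is assumed to be a list of tf.train.Example protos serialized to strings (e.g. read from a TFRecord file); the named identity tensors above are the ones fetched:
with tf.Session() as sess:
    image_val, text_val, label_val = sess.run(
        [image, text, label],
        feed_dict={serialized_tf_example: serialized_examples})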
Answered by mrgloom
I've seen this kind of hack used to check an assertion:
assertion = tf.assert_equal(tf.shape(image)[-1], 3, message="image must have 3 color channels")
with tf.control_dependencies([assertion]):
    image = tf.identity(image)
It's also used just to give a tensor a name:
image = tf.identity(image, name='my_image')