How to do Xavier initialization on TensorFlow

Disclaimer: this page is a translation of a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow.
Original question: http://stackoverflow.com/questions/33640581/
Asked by Alejandro
I'm porting my Caffe network over to TensorFlow, but it doesn't seem to have Xavier initialization. I'm using truncated_normal, but this seems to be making it a lot harder to train.
Accepted answer by Sung Kim
Since version 0.8 there is a Xavier initializer; see here for the docs.
You can use something like this:
W = tf.get_variable("W", shape=[784, 256],
                    initializer=tf.contrib.layers.xavier_initializer())
Answered by Vince Gatto
I looked and I couldn't find anything built in. However, according to this:
http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
Xavier initialization is just sampling from a (usually Gaussian) distribution whose variance is a function of the number of neurons. tf.random_normal can do that for you; you just need to compute the stddev from the number of neurons represented by the weight matrix you're trying to initialize.
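A minimal sketch of that idea (not from the original answer; the layer sizes are hypothetical, and the scaling follows the Glorot normal formula stddev = sqrt(2 / (fan_in + fan_out))):

import numpy as np
import tensorflow as tf

# Hypothetical layer sizes, for illustration only.
fan_in, fan_out = 784, 256

# Glorot/Xavier normal scaling: variance = 2 / (fan_in + fan_out).
stddev = np.sqrt(2.0 / (fan_in + fan_out))
W = tf.Variable(tf.random_normal([fan_in, fan_out], stddev=stddev))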
Answered by Delip
@Aleph7, Xavier/Glorot initialization depends on the number of incoming connections (fan_in), the number of outgoing connections (fan_out), and the kind of activation function (sigmoid or tanh) of the neuron. See this: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
So now, to your question. This is how I would do it in TensorFlow:
(fan_in, fan_out) = ...
low = -4*np.sqrt(6.0/(fan_in + fan_out)) # use 4 for sigmoid, 1 for tanh activation
high = 4*np.sqrt(6.0/(fan_in + fan_out))
return tf.Variable(tf.random_uniform(shape, minval=low, maxval=high, dtype=tf.float32))
Note that we should be sampling from a uniform distribution, and not the normal distribution as suggested in the other answer.
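To make that fragment self-contained, here is one way it could be wrapped as a helper (a sketch, not from the original answer; the function name and the activation argument are made up for illustration):

import numpy as np
import tensorflow as tf

def xavier_uniform_variable(shape, activation='tanh'):
    """Variable initialized per Glorot & Bengio (2010), uniform variant.

    The factor of 4 is the sigmoid correction mentioned above; tanh uses 1.
    """
    fan_in, fan_out = shape
    factor = 4.0 if activation == 'sigmoid' else 1.0
    limit = factor * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform(shape, minval=-limit, maxval=limit,
                                         dtype=tf.float32))

W = xavier_uniform_variable((784, 256), activation='sigmoid')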
Incidentally, I wrote a post yesterday for something different using TensorFlow that happens to also use Xavier initialization. If you're interested, there's also a python notebook with an end-to-end example: https://github.com/delip/blog-stuff/blob/master/tensorflow_ufp.ipynb
Answered by Hooked
A nice wrapper around tensorflow called prettytensor gives an implementation in the source code (copied directly from here):
def xavier_init(n_inputs, n_outputs, uniform=True):
    """Set the parameter initialization using the method described.
    This method is designed to keep the scale of the gradients roughly the same
    in all layers.
    Xavier Glorot and Yoshua Bengio (2010):
        Understanding the difficulty of training deep feedforward neural
        networks. International conference on artificial intelligence and
        statistics.
    Args:
        n_inputs: The number of input nodes into each output.
        n_outputs: The number of output nodes for each input.
        uniform: If true use a uniform distribution, otherwise use a normal.
    Returns:
        An initializer.
    """
    if uniform:
        # 6 was used in the paper.
        init_range = math.sqrt(6.0 / (n_inputs + n_outputs))
        return tf.random_uniform_initializer(-init_range, init_range)
    else:
        # 3 gives us approximately the same limits as above since this repicks
        # values greater than 2 standard deviations from the mean.
        stddev = math.sqrt(3.0 / (n_inputs + n_outputs))
        return tf.truncated_normal_initializer(stddev=stddev)
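Under those definitions, the returned object is a standard TF 1.x initializer, so hypothetical usage (the variable name and shape below are made up for illustration) would look like:

import math
import tensorflow as tf

# Uses the xavier_init helper defined above.
W = tf.get_variable("W_xavier", shape=[784, 256],
                    initializer=xavier_init(784, 256))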
Answered by Salvador Dali
TF-contrib has xavier_initializer. Here is an example of how to use it:
import tensorflow as tf

a = tf.get_variable("a", shape=[4, 4], initializer=tf.contrib.layers.xavier_initializer())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))
In addition to this, tensorflow has other initializers as well.
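For instance, a few of the others available in the TF 1.x API (an illustrative, non-exhaustive sketch; the variable names are made up):

import tensorflow as tf

b = tf.get_variable("b", shape=[4], initializer=tf.zeros_initializer())
c = tf.get_variable("c", shape=[4, 4],
                    initializer=tf.truncated_normal_initializer(stddev=0.1))
d = tf.get_variable("d", shape=[4, 4],
                    initializer=tf.variance_scaling_initializer())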
Answered by Saullo G. P. Castro
Just to add another example of how to define a tf.Variable initialized using Xavier Glorot and Yoshua Bengio's method:
graph = tf.Graph()
with graph.as_default():
    ...
    initializer = tf.contrib.layers.xavier_initializer()
    w1 = tf.Variable(initializer(w1_shape))
    b1 = tf.Variable(initializer(b1_shape))
    ...
This prevented me from having nan values in my loss function due to numerical instabilities when using multiple layers with ReLUs.
Answered by xilef
Via the kernel_initializer parameter to tf.layers.conv2d, tf.layers.conv2d_transpose, tf.layers.Dense, etc.
e.g.
layer = tf.layers.conv2d(
    input, 128, 5, strides=2, padding='SAME',
    kernel_initializer=tf.contrib.layers.xavier_initializer())
https://www.tensorflow.org/api_docs/python/tf/layers/conv2d
https://www.tensorflow.org/api_docs/python/tf/layers/conv2d_transpose
Answered by Tony Power
Just in case you want a one-liner, as you do with:
W = tf.Variable(tf.truncated_normal((n_prev, n), stddev=0.1))
You can do:
W = tf.Variable(tf.contrib.layers.xavier_initializer()((n_prev, n)))
Answered by y.selivonchyk
In TensorFlow 2.0 and later, both tf.contrib.* and tf.get_variable() are deprecated. In order to do Xavier initialization you now have to switch to:
init = tf.initializers.GlorotUniform()
var = tf.Variable(init(shape=shape))
# or as a one-liner, with slightly confusing brackets:
var = tf.Variable(tf.initializers.GlorotUniform()(shape=shape))
Glorot uniform and Xavier uniform are two different names for the same initialization type. If you want to know more about how to use initializations in TF 2.0, with or without Keras, refer to the documentation.
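For the Keras route mentioned above, a brief sketch (not from the original answer; the layer size 256 is arbitrary). Note that 'glorot_uniform' is already the default kernel initializer for Dense layers, and it can be requested by name or as an instance:

import tensorflow as tf

# By name (string alias).
layer = tf.keras.layers.Dense(256, kernel_initializer='glorot_uniform')

# Or as an explicit instance.
layer = tf.keras.layers.Dense(
    256, kernel_initializer=tf.keras.initializers.GlorotUniform())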