Python: How to deal with batches with variable-length sequences in TensorFlow?

Question

Asked by Seja Nair

I was trying to use an RNN (specifically, LSTM) for sequence prediction. However, I ran into an issue with variable sequence lengths. For example,

sent_1 = "I am flying to Dubai"
sent_2 = "I was traveling from US to Dubai"

I am trying to predict the next word after the current one with a simple RNN, based on this benchmark for building a PTB LSTM model.

However, the num_steps parameter (used for unrolling to the previous hidden states) should remain the same in each TensorFlow epoch. Basically, batching sentences is not possible because the sentences vary in length.

 # inputs = [tf.squeeze(input_, [1])
 #           for input_ in tf.split(1, num_steps, inputs)]
 # outputs, states = rnn.rnn(cell, inputs, initial_state=self._initial_state)

Here, num_steps needs to change for every sentence in my case. I have tried several hacks, but nothing seems to work.

Answer 1

Accepted answer by Taras Sereda

You can use the ideas of bucketing and padding which are described in:

Sequence-to-Sequence Models

Also, the rnn function that builds the RNN network accepts a sequence_length parameter.

For example, you can group sentences into buckets of the same size, pad them with the necessary number of zeros (or with placeholder tokens that stand for a zero word), and then feed them together with seq_length = len(zero_words).

seq_length = tf.placeholder(tf.int32)
outputs, states = rnn.rnn(cell, inputs, initial_state=initial_state, sequence_length=seq_length)

sess = tf.Session()
feed = {
    seq_length: 20,
    #other feeds
}
sess.run(outputs, feed_dict=feed)
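
The bucketing-and-padding idea described above can be sketched in plain Python, independent of any TensorFlow version. This is only an illustration under stated assumptions: the function name bucket_and_pad, the bucket sizes, the token ids, and the choice of id 0 as the padding word are all hypothetical and do not come from the answer or from the Sequence-to-Sequence tutorial.

```python
from collections import defaultdict

PAD_ID = 0  # assumption: id 0 is reserved for the padding ("zero") word


def bucket_and_pad(sentences, bucket_sizes, pad_id=PAD_ID):
    """Group token-id lists into buckets and pad each one to its bucket size.

    Returns {bucket_size: (padded_sentences, true_lengths)}.
    """
    buckets = defaultdict(lambda: ([], []))
    for tokens in sentences:
        # Pick the smallest bucket that can hold the sentence.
        fitting = [size for size in sorted(bucket_sizes) if size >= len(tokens)]
        if not fitting:
            raise ValueError("sentence is longer than the largest bucket")
        size = fitting[0]
        padded, lengths = buckets[size]
        padded.append(tokens + [pad_id] * (size - len(tokens)))
        lengths.append(len(tokens))  # the unpadded length of this sentence
    return dict(buckets)


# Hypothetical token ids for the two sentences from the question.
sent_1 = [4, 7, 9, 3, 12]          # "I am flying to Dubai"
sent_2 = [4, 8, 10, 5, 11, 3, 12]  # "I was traveling from US to Dubai"

batches = bucket_and_pad([sent_1, sent_2], bucket_sizes=[5, 10])
```

Each bucket then holds sentences of one fixed size, so it can be fed as a regular batch, while the recorded lengths are the kind of per-sentence value one would pass as the sequence length.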

Take a look at this reddit thread as well:

Tensorflow basic RNN example with 'variable length' sequences

Answer 2

Answered by Seja Nair

You can use the ideas of bucketing and padding, which are described in:

Sequence-to-Sequence Models

Also, the rnn function that builds the RNN network accepts a sequence_length parameter.

For example, you can group sentences into buckets of the same size, pad them with the necessary number of zeros (or with placeholder tokens that stand for a zero word), and then feed them together with seq_length = len(zero_words).

seq_length = tf.placeholder(tf.int32)
outputs, states = rnn.rnn(cell, inputs, initial_state=initial_state, sequence_length=seq_length)

sess = tf.Session()
feed = {
    seq_length: 20,
    # other feeds
}
sess.run(outputs, feed_dict=feed)

The most important point here: suppose you want to use the state obtained from one sentence as the initial state for the next sentence, and you pass a sequence_length (say 20) that is shorter than the padded sentence (say 50). In that case you want the state obtained at the 20th time step. To get it, first stack the states:

tf.pack(states)

After that, call:

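for i in range(len(sentences)):
    # `states` is the stacked tensor from tf.pack(states); fetching it directly
    # returns one array instead of a one-element list.
    state_mat = session.run(states, {
        m.input_data: x, m.targets: y,
        m.initial_state: state, m.early_stop: early_stop})
    # Keep the state at the last real (unpadded) time step.
    state = state_mat[early_stop - 1, :, :]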

Answer 3

Answered by tnq177

You can limit the maximum length of your input sequences, pad the shorter ones to that length, record the length of each sequence, and use tf.nn.dynamic_rnn. It processes input sequences as usual, but after the last element of a sequence, as indicated by seq_length, it simply copies the cell state through and outputs a zero tensor.
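
The behaviour described above (state copied through, zero output after the last real element) can be imitated with a toy recurrence in plain Python. This is a sketch of the semantics only, not TensorFlow's implementation; run_masked_rnn, decaying_cell, and the scalar "cell" are hypothetical names and simplifications chosen for illustration.

```python
def run_masked_rnn(inputs, seq_length, step_fn, initial_state, zero_output):
    """Run step_fn over a padded sequence, skipping steps at or past seq_length."""
    state = initial_state
    outputs = []
    for t, x in enumerate(inputs):
        if t < seq_length:
            output, state = step_fn(x, state)
        else:
            # Past the real sequence: emit zeros and leave the state untouched.
            output = zero_output
        outputs.append(output)
    return outputs, state


def decaying_cell(x, state):
    """Toy scalar cell: the state decays by half and absorbs the input."""
    new_state = 0.5 * state + x
    return new_state, new_state


padded = [1.0, 2.0, 4.0, 0.0, 0.0]  # three real steps followed by two padding steps
outputs, final_state = run_masked_rnn(padded, 3, decaying_cell, 0.0, 0.0)
```

Without the length check, the two padding steps would keep halving the state of this toy cell, which is why the final state of a padded batch is only meaningful when the true lengths are supplied.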

Answer 4

Answered by Datalker

You can use dynamic_rnn instead and specify the length of every sequence, even within one batch, by passing an array to the sequence_length parameter. An example is below:

import tensorflow as tf  # TensorFlow 1.x API (placeholders, tf.nn.dynamic_rnn)

def length(sequence):
    # A frame is "used" if any of its features is non-zero (padding is all zeros).
    used = tf.sign(tf.reduce_max(tf.abs(sequence), axis=2))
    # Count the used frames of every sequence in the batch.
    length = tf.reduce_sum(used, axis=1)
    length = tf.cast(length, tf.int32)
    return length

max_length = 100
frame_size = 64
num_hidden = 200

# Batch of padded sequences: [batch_size, max_length, frame_size].
sequence = tf.placeholder(tf.float32, [None, max_length, frame_size])
output, state = tf.nn.dynamic_rnn(
    tf.nn.rnn_cell.GRUCell(num_hidden),
    sequence,
    dtype=tf.float32,
    sequence_length=length(sequence),
)

The code is taken from an excellent article on the topic; please check it out as well.

Update: you can also find another great post on dynamic_rnn vs. rnn.
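
To make the length helper in the example above easier to follow, here is what it computes, written with nested Python lists instead of tensors. This is only an illustrative equivalent; sequence_lengths and the sample batch are hypothetical and not part of the referenced article.

```python
def sequence_lengths(batch):
    """Count the non-padding frames of every sequence in a batch.

    batch is a list of sequences; each sequence is a list of frames,
    and each frame is a list of floats.
    """
    lengths = []
    for sequence in batch:
        # A frame counts as "used" if any of its entries is non-zero.
        used = [1 if max(abs(v) for v in frame) > 0 else 0 for frame in sequence]
        lengths.append(sum(used))
    return lengths


batch = [
    [[0.5, -1.0], [0.2, 0.0], [0.0, 0.0]],  # two real frames, one padding frame
    [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]],   # one real frame, two padding frames
]
lengths = sequence_lengths(batch)
```

One caveat of this trick is that a real frame whose features are all exactly zero is indistinguishable from padding, so it only works when such frames cannot occur in the data.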

Answer 5

Answered by Benjamin Striner

Sorry to post on a dead issue, but I just submitted a PR for a better solution. dynamic_rnn is extremely flexible but abysmally slow. It works if it is your only option, but CuDNN is much faster. This PR adds support for variable lengths to CuDNNLSTM, so you will hopefully be able to use that soon.

You need to sort sequences by descending length. Then you can pack_sequence, run your RNNs, then unpack_sequence.

https://github.com/tensorflow/tensorflow/pull/22308
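
The sorting step mentioned in this answer can be sketched in plain Python. Only the sort-by-descending-length step and the restoration of the original order are shown; the pack_sequence and unpack_sequence calls are left out because their exact signatures are defined by the PR. The helper names sort_by_length_desc and restore_order are hypothetical.

```python
def sort_by_length_desc(sequences):
    """Return the sequences sorted by descending length, plus the permutation used."""
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]), reverse=True)
    sorted_seqs = [sequences[i] for i in order]
    return sorted_seqs, order


def restore_order(sorted_items, order):
    """Undo the permutation so results line up with the original batch order."""
    restored = [None] * len(order)
    for sorted_pos, original_pos in enumerate(order):
        restored[original_pos] = sorted_items[sorted_pos]
    return restored


sequences = [[1, 2], [3, 4, 5, 6], [7, 8, 9]]
sorted_seqs, order = sort_by_length_desc(sequences)
# ... pack, run the RNN, and unpack on sorted_seqs here ...
restored = restore_order(sorted_seqs, order)
```

Keeping the permutation around matters because the RNN outputs come back in sorted order and usually need to be matched with targets that are still in the original order.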
