Python LSTM Autoencoder
Original question: http://stackoverflow.com/questions/44647258/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
LSTM Autoencoder
Asked by ScientiaEtVeritas
I'm trying to build an LSTM autoencoder with the goal of getting a fixed-size vector from a sequence that represents the sequence as well as possible. This autoencoder consists of two parts:
LSTM Encoder: takes a sequence and returns an output vector (return_sequences = False)
LSTM Decoder: takes an output vector and returns a sequence (return_sequences = True)
So, in the end, the encoder is a many-to-one LSTM and the decoder is a one-to-many LSTM.
Image source: Andrej Karpathy
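To make the many-to-one and one-to-many idea concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: a plain tanh RNN stands in for the LSTM, the weights are random and untrained, and every name in it (rnn_forward, enc_in, dec_rec, and so on) is hypothetical rather than part of Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_forward(x, w_in, w_rec, return_sequences):
    """Run a plain tanh RNN (not an LSTM) over x of shape (timesteps, features)."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(x_t @ w_in + h @ w_rec)
        states.append(h)
    # return_sequences=True keeps every hidden state; False keeps only the last one
    return np.stack(states) if return_sequences else h

timesteps, features, code_size = 10, 5, 20
x = rng.normal(size=(timesteps, features))

# many-to-one encoder: a sequence goes in, a single vector comes out
enc_in = rng.normal(size=(features, code_size))
enc_rec = rng.normal(size=(code_size, code_size))
code = rnn_forward(x, enc_in, enc_rec, return_sequences=False)

# one-to-many decoder: the same vector is fed at every step, one output per step
dec_in = rng.normal(size=(code_size, features))
dec_rec = rng.normal(size=(features, features))
repeated = np.tile(code, (timesteps, 1))
reconstruction = rnn_forward(repeated, dec_in, dec_rec, return_sequences=True)

print(code.shape)            # (20,)
print(reconstruction.shape)  # (10, 5)
```

Feeding the repeated vector at every step is only one way to drive a one-to-many decoder; it is chosen here because it keeps the sketch short.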
At a high level, the code looks like this (similar to what is described here):
encoder = Model(...)
decoder = Model(...)
autoencoder = Model(encoder.inputs, decoder(encoder(encoder.inputs)))
autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])
autoencoder.fit(data, data,
                batch_size=100,
                epochs=1500)
The shape (number of training examples, sequence length, input dimension) of the data array is (1200, 10, 5) and looks like this:
array([[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
...,
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
... ]
Problem: I am not sure how to proceed, especially how to integrate LSTM into Model and how to get the decoder to generate a sequence from a vector.
I am using keras with the tensorflow backend.
EDIT: If someone wants to try it out, here is my procedure to generate random sequences with moving ones (including padding):
import random

def getNotSoRandomList(x):
    # one-hot vector of length rlen with a 1 at position x (all zeros if x is out of range)
    rlen = 8
    rlist = [0] * rlen
    if x < rlen:
        rlist[x] = 1
    return rlist

# 5000 sequences of random length (0 to 10 steps); the 1 moves one position per step
sequence = [[getNotSoRandomList(x) for x in range(round(random.uniform(0, 10)))] for y in range(5000)]
### Padding afterwards
from keras.preprocessing import sequence as seq
data = seq.pad_sequences(
    sequences=sequence,
    padding='post',
    maxlen=None,
    truncating='post',
    value=0.
)
Accepted answer by Daniel Möller
Models can be built any way you want. If I understood correctly, you just want to know how to create models with LSTM?
Using LSTMs
Well, first, you have to define what your encoded vector looks like. Suppose you want it to be an array of 20 elements, a one-dimensional vector. So, shape (None,20). Its size is up to you, and there is no clear rule for finding the ideal one.
And your input must be three-dimensional, such as your (1200,10,5). In keras summaries and error messages, it will be shown as (None,10,5), as "None" represents the batch size, which can vary each time you train/predict.
There are many ways to do this, but suppose you want only one LSTM layer:
from keras.layers import *
from keras.models import Model
inpE = Input((10,5)) #here, you don't define the batch size
outE = LSTM(units = 20, return_sequences=False, ...optional parameters...)(inpE)
This is enough for a very very simple encoder resulting in an array with 20 elements (but you can stack more layers if you want). Let's create the model:
encoder = Model(inpE,outE)
Now, for the decoder, it gets less obvious. You don't have an actual sequence anymore, but a static meaningful vector. You may still want to use LSTMs; they will treat the vector as a sequence.
But here, since the input has shape (None,20), you must first reshape it to some 3-dimensional array in order to attach an LSTM layer next.
The way you will reshape it is entirely up to you. 20 steps of 1 element? 1 step of 20 elements? 10 steps of 2 elements? Who knows?
inpD = Input((20,))
outD = Reshape((10,2))(inpD) #supposing 10 steps of 2 elements
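The reshape options mentioned above can be compared on plain arrays. This is a minimal NumPy sketch, assuming that the Keras Reshape layer keeps the batch axis and lays values out in the same row-major order as numpy.reshape; the array names and the batch size of 3 are made up for illustration.

```python
import numpy as np

batch = 3
# stand-in for the encoder output, shape (batch, 20)
encoded = np.arange(batch * 20, dtype="float32").reshape(batch, 20)

# three of the possible (timesteps, features) layouts for the same 20 values
as_20_steps_of_1 = encoded.reshape(batch, 20, 1)
as_1_step_of_20 = encoded.reshape(batch, 1, 20)
as_10_steps_of_2 = encoded.reshape(batch, 10, 2)

print(as_20_steps_of_1.shape)  # (3, 20, 1)
print(as_1_step_of_20.shape)   # (3, 1, 20)
print(as_10_steps_of_2.shape)  # (3, 10, 2)

# with 10 steps of 2, timestep t of a sample holds elements 2*t and 2*t + 1 of its vector
print(as_10_steps_of_2[0, 0])  # [0. 1.]
print(as_10_steps_of_2[0, 9])  # [18. 19.]
```

Whichever layout is chosen, the number of timesteps after the reshape is what a following LSTM with return_sequences=True will output.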
It's important to notice that if you don't have 10 steps anymore, you won't be able to just enable "return_sequences" and get the output you want. You'll have to work a little. Actually, it's not necessary to use "return_sequences" or even to use LSTMs, but you may do that.
Since my reshape has 10 timesteps (intentionally), it is fine to use "return_sequences", because the result will have 10 timesteps (like the initial input):
outD1 = LSTM(5,return_sequences=True,...optional parameters...)(outD)
#5 cells because we want a (None,10,5) vector.
You could work in many other ways, such as simply creating a 50-cell LSTM without returning sequences and then reshaping the result:
alternativeOut = LSTM(50,return_sequences=False,...)(outD)
alternativeOut = Reshape((10,5))(alternativeOut)
And our model goes:
decoder = Model(inpD,outD1)
alternativeDecoder = Model(inpD,alternativeOut)
After that, you unite the models with your code and train the autoencoder. All three models share the same weights, so you can get the encoder's results just by using its predict method.
encoderPredictions = encoder.predict(data)
What I often see about LSTMs for generating sequences is something like predicting the next element.
You take just a few elements of the sequence and try to find the next element. And you take another segment one step forward and so on. This may be helpful in generating sequences.
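As a rough illustration of that next-element setup, the sketch below cuts a series into (window, next element) training pairs. The helper name make_windows, the window size of 3 and the toy series are assumptions made for this example; they do not come from the answer.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into (input window, next element) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    # add a trailing feature axis so inputs have shape (samples, timesteps, 1)
    return np.array(inputs)[..., np.newaxis], np.array(targets)

series = np.arange(10, dtype="float32")  # toy series 0, 1, ..., 9
X, y = make_windows(series, window=3)

print(X.shape, y.shape)  # (7, 3, 1) (7,)
print(X[0, :, 0], y[0])  # [0. 1. 2.] 3.0
print(X[1, :, 0], y[1])  # [1. 2. 3.] 4.0
```

Each row of X already has the (timesteps, features) layout an LSTM layer expects, and y holds the element that follows the window.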
Answer by user6903745
You can find a simple example of a sequence-to-sequence autoencoder here: https://blog.keras.io/building-autoencoders-in-keras.html
Answer by user702846
Here is an example
Let's create synthetic data consisting of a few sequences. The idea is to look at these sequences through the lens of an autoencoder, in other words, to lower their dimension or summarize them into a fixed length.
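import numpy as np
import pandas as pd
from sklearn import preprocessing
from keras.preprocessing.sequence import pad_sequences

# define input sequences (variable length, so kept as a list of lists rather than a NumPy array)
sequence = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
            [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
            [0.2, 0.4, 0.6, 0.8],
            [0.3, 0.6, 0.9, 1.2]]
# prepare to normalize: one column per sequence, NaN where a sequence is shorter
x = pd.DataFrame(sequence).T.values
scaler = preprocessing.StandardScaler()
x_scaled = scaler.fit_transform(x)
# note: sequence_normalized is computed here, but the raw sequences are what get padded below
sequence_normalized = [col[~np.isnan(col)] for col in x_scaled.T]
# use dtype='float32' when padding; the default integer dtype would truncate the floating-point values
sequence = pad_sequences(sequence, padding='post', dtype='float32')
# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))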
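To show what the padding and reshape steps do to the data, here is a small NumPy sketch. The helper pad_post is a hypothetical stand-in written for this example, meant to mimic post-padding with zeros; it is not the Keras pad_sequences implementation.

```python
import numpy as np

def pad_post(sequences, value=0.0):
    """Right-pad variable-length sequences with `value` up to the longest length."""
    max_len = max(len(s) for s in sequences)
    padded = np.full((len(sequences), max_len), value, dtype="float32")
    for row, s in enumerate(sequences):
        padded[row, :len(s)] = s
    return padded

raw = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
       [0.2, 0.4, 0.6, 0.8]]
padded = pad_post(raw)
print(padded.shape)  # (2, 9)
print(padded[1])     # the four original values followed by five zeros

# an integer dtype would truncate every value below 1 to 0, which is why float32 matters here
print(np.array(raw[1]).astype("int32"))  # [0 0 0 0]

# add the feature axis: [samples, timesteps, features]
batch = padded.reshape(len(raw), padded.shape[1], 1)
print(batch.shape)  # (2, 9, 1)
```

Note that after padding, the model cannot tell a padded zero from a real zero unless masking is used, which neither this sketch nor the answer's code does.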
Let's devise a simple LSTM
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from keras.models import Model
from keras.utils import plot_model

# define encoder: compresses each sequence into a vector of 2 values
visible = Input(shape=(n_in, 1))
encoder = LSTM(2, activation='relu')(visible)
# define reconstruct decoder: repeat the encoded vector once per timestep, then decode
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)
# tie it together
myModel = Model(inputs=visible, outputs=decoder1)
# summarize layers (summary() prints by itself)
myModel.summary()
myModel.compile(optimizer='adam', loss='mse')
history = myModel.fit(sequence, sequence,
                      epochs=400,
                      verbose=0,
                      validation_split=0.1,
                      shuffle=True)
# plot_model needs pydot and graphviz to be installed
plot_model(myModel, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = myModel.predict(sequence, verbose=0)
# yhat
import matplotlib.pyplot as plt
#plot our loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
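The RepeatVector step in the decoder above is what turns the single encoded vector back into something with a time axis. As a sketch of the shape change only (plain NumPy, made-up values, not the Keras layer itself):

```python
import numpy as np

n_in = 9
# stand-in for the encoder output: 2 samples, 2 encoded features each
encoded = np.array([[0.5, -1.0],
                    [2.0, 3.0]], dtype="float32")

# copy each encoded vector n_in times along a new time axis: (samples, 2) -> (samples, n_in, 2)
repeated = np.repeat(encoded[:, np.newaxis, :], n_in, axis=1)

print(repeated.shape)  # (2, 9, 2)
# every timestep of a sample carries the same encoded vector
print(np.array_equal(repeated[0, 0], repeated[0, 8]))  # True
```

Because every timestep receives the same input, the decoder LSTM has to rely on its own internal state to produce different outputs at different positions.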
Let's build the encoder from the trained autoencoder
# reuse the trained encoder LSTM (layer 1 of myModel) to encode the training input
encoder_layer = myModel.layers[1]
encoder_input = Input(shape=(9, 1))
encoder_model = Model(encoder_input, encoder_layer(encoder_input))
# we are interested in what the encoded sequences of length 2 (the dimension of the encoder) look like
out = encoder_model.predict(sequence)
f = plt.figure()
myx = out[:,0]
myy = out[:,1]
s = plt.scatter(myx, myy)
# label each point with the 1-based index of its sequence
for i in range(len(out)):
    plt.annotate(str(i+1), (myx[i], myy[i]))
And here is the representation of the sequences