Python Pandas DataFrame and Keras

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/43876770/

Time: 2020-08-19 23:28:45  Source: igfitidea

Pandas DataFrame and Keras

python pandas keras

Asked by Gonzalo Donoso

I'm trying to perform a sentiment analysis in Python using Keras. To do so, I need to do a word embedding of my texts. The problem appears when I try to fit the data to my model:

from keras.models import Sequential
from keras.layers import Dense, Embedding, Flatten

model_1 = Sequential()
model_1.add(Embedding(1000, 32, input_length=X_train.shape[0]))
model_1.add(Flatten())
model_1.add(Dense(250, activation='relu'))
model_1.add(Dense(1, activation='sigmoid'))
model_1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The shape of my training data is

(4834,)

It is a pandas Series object. When I try to fit my model and validate it with some other data, I get this error:

model_1.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=64, verbose=2)

ValueError: Error when checking model input: expected embedding_1_input to have shape (None, 4834) but got array with shape (4834, 1)

How can I reshape my data to make it suitable for Keras? I've been trying np.reshape, but I cannot place None elements with that function.

Thanks in advance

Answered by Dat Tran

None stands for the number of rows that go into training, so you can't define it yourself. Also, Keras needs a numpy array as input, not a pandas DataFrame. First convert the df to a numpy array with df.values, then call .reshape((-1, 4834)) on the result. Note that you should use np.float32. This is important if you train on a GPU.
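
The conversion described above can be sketched as follows. This is only an illustration under stated assumptions: X_train here is a made-up stand-in Series of 4834 numbers, not the asker's actual data, and the variable names arr and reshaped are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's X_train: a Series with 4834 numeric entries.
X_train = pd.Series(np.arange(4834))

# pandas -> numpy, cast to float32 as the answer recommends.
arr = X_train.values.astype(np.float32)
print(arr.shape)       # (4834,)

# -1 asks numpy to infer the number of rows from the total element count.
reshaped = arr.reshape((-1, 4834))
print(reshaped.shape)  # (1, 4834)
```

Because the Series holds 4834 values in total, the inferred row count is 1, so the result is a single row of 4834 columns. Whether that layout is what the model should see depends on what each entry of the Series represents.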

Answered by Pardhu

https://pypi.org/project/keras-pandas/

The easiest way is to use the keras_pandas package to fit a pandas DataFrame to Keras. The code shown below is a general example from the package docs.

from keras import Model
from keras.layers import Dense

from keras_pandas.Automater import Automater
from keras_pandas.lib import load_titanic

observations = load_titanic()

# Transform the data set, using keras_pandas
categorical_vars = ['pclass', 'sex', 'survived']
numerical_vars = ['age', 'siblings_spouses_aboard', 'parents_children_aboard', 'fare']
text_vars = ['name']

auto = Automater(categorical_vars=categorical_vars, numerical_vars=numerical_vars, text_vars=text_vars,
                 response_var='survived')
X, y = auto.fit_transform(observations)

# Start model with provided input nub
x = auto.input_nub

# Fill in your own hidden layers
x = Dense(32)(x)
x = Dense(32, activation='relu')(x)
x = Dense(32)(x)

# End model with provided output nub
x = auto.output_nub(x)

model = Model(inputs=auto.input_layers, outputs=x)
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
model.fit(X, y, epochs=4, validation_split=.2)

Answered by Tim Seed

You need a specific version of Pandas for this to work. If you use the current version (as of 20th Aug 2018), this will fail.

Roll back your Pandas and Keras (pip uninstall ....) and then install a specific version like this:

python -m pip install pandas==0.19.2
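
To see which versions are actually installed before and after a rollback, a check along these lines may help. It is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages discussed in this thread.
for name in ("pandas", "keras"):
    print(name, installed_version(name))
```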