pandas: How can I cleanly normalize data and then "unnormalize" it later?

Question

Asked by maxbfuer

I am using Anaconda with a TensorFlow neural network. Most of my data is stored with pandas.
I am attempting to predict cryptocurrency markets. I am aware that lots of people are probably doing this and that it is most likely not going to be very effective; I'm mostly doing it to familiarize myself with TensorFlow and the Anaconda tools.
I am fairly new to this, so if I am doing something wrong or suboptimally, please let me know.

Here is how I acquire and handle the data:

Download datasets from quandl.com into pandas DataFrames
Select the desired columns from each downloaded dataset
Concatenate the DataFrames
Drop all NaNs from the new, merged DataFrame
Normalize each column (independently) to the range 0.0-1.0 in the new DataFrame using the code
df = (df - df.min()) / (df.max() - df.min())
Feed the normalized data into my neural network
Unnormalize the data (This is the part that I haven't implemented)

Now, my question is: how can I cleanly normalize and then unnormalize this data? I realize that if I want to unnormalize the data, I'm going to need to store the initial df.min() and df.max() values, but this looks ugly and feels cumbersome.
I am aware that I can normalize data with sklearn.preprocessing.MinMaxScaler, but as far as I know I can't unnormalize data using it.

It might be that I'm doing something fundamentally wrong here, but I'll be very surprised if there isn't a clean way to normalize and unnormalize data with Anaconda or other libraries.
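
For reference, the manual approach described above (storing df.min() and df.max() and reusing them later) can be sketched as follows. This is only an illustration: the column names and values are made up and stand in for the merged quandl data.

```python
import pandas as pd

# Hypothetical sample data standing in for the merged DataFrame.
df = pd.DataFrame({"price": [100.0, 150.0, 200.0],
                   "volume": [10.0, 30.0, 20.0]})

# Store the per-column statistics before normalizing.
col_min = df.min()
col_max = df.max()

# Normalize each column independently to the range 0.0-1.0.
normalized = (df - col_min) / (col_max - col_min)

# Reverse the operation using the stored statistics.
restored = normalized * (col_max - col_min) + col_min
```

The stored col_min and col_max have to be kept for as long as values need to be mapped back to the original scale.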

Answer 1

Answered by tmrlvi

All the scalers in sklearn.preprocessing have an inverse_transform method designed just for that.

For example, to scale and un-scale your DataFrame with MinMaxScaler you could do:

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaled = scaler.fit_transform(df)
unscaled = scaler.inverse_transform(scaled)

Just bear in mind that the transform function (and fit_transform as well) returns a numpy.array, not a pandas.DataFrame.
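
If the index and column labels are needed after scaling, one option is to wrap the returned arrays back into DataFrames. The following is a minimal sketch with made-up sample data; the column names and dates are assumptions, not part of the original question.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sample data; names and values are illustrative only.
df = pd.DataFrame(
    {"price": [100.0, 150.0, 200.0], "volume": [10.0, 30.0, 20.0]},
    index=pd.date_range("2018-01-01", periods=3),
)

scaler = MinMaxScaler()

# fit_transform returns a NumPy array, so restore the labels by hand.
scaled_df = pd.DataFrame(scaler.fit_transform(df),
                         index=df.index, columns=df.columns)

# The same fitted scaler object is needed to undo the scaling.
unscaled_df = pd.DataFrame(scaler.inverse_transform(scaled_df.to_numpy()),
                           index=df.index, columns=df.columns)
```

Note that inverse_transform expects the same number of columns the scaler was fitted on, so the fitted scaler object should be kept around for as long as results need to be mapped back.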
