Python: Converting from a Pandas dataframe to a TensorFlow tensor object

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/42286972/

Time: 2020-08-19 21:30:47  Source: igfitidea

Tags: python, pandas, tensorflow

Asked by jlt199

I'm still new to Python, Machine Learning and TensorFlow, but doing my best to jump right in head-first. I could use some help though.

My data is currently in a Pandas dataframe. How can I convert it to a TensorFlow object? I've tried:

dataVar_tensor = tf.constant(dataVar)
depth_tensor = tf.constant(depth)

But I get this error: [15780 rows x 9 columns] - got shape [15780, 9], but wanted [].

I'm sure this is probably a straightforward question, but I could really use the help.

Many thanks

P.S. I'm running TensorFlow 0.12 with Anaconda Python 3.5 on Windows 10.

Answered by jlt199

I've converted my Pandas dataframe to a NumPy array using df.values.

Now, using

dataVar_tensor = tf.constant(dataVar, dtype=tf.float32, shape=[15780, 9])
depth_tensor = tf.constant(depth, dtype=tf.float32, shape=[15780, 1])

seems to work. I can't say definitively that it does, because I have other hurdles to overcome before my code works, but it's hopefully a step in the right direction. Thanks for all your help.
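
A minimal sketch of the same idea, assuming TensorFlow 2.x and a recent pandas. The small DataFrame and its column names are made up for illustration. Because the NumPy array already carries its shape, tf.constant should be able to infer it, so the hard-coded shape=[15780, 9] is probably not needed:

```python
import pandas as pd
import tensorflow as tf

# Hypothetical stand-in for the real 15780 x 9 data.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# DataFrame -> NumPy array with an explicit dtype.
values = df.to_numpy(dtype="float32")

# The shape is taken from the array, so it is not spelled out here.
data_tensor = tf.constant(values)

print(data_tensor.shape, data_tensor.dtype)
```

df.to_numpy() is the newer spelling of df.values; either should work for an all-numeric frame.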

As an aside, my attempts to get the tutorial working on my own data continue in my next question, "Converting TensorFlow tutorial to work with my own data".

Answered by Thedarknight

Here is one solution I found that works on Google Colab; it should probably work on a local machine too:

import pandas as pd
import tensorflow as tf

# Read the file into a pandas DataFrame
data = pd.read_csv('filedir')
# Convert the DataFrame to a tensor
data = tf.convert_to_tensor(data)
type(data)

This should print something like:

tensorflow.python.framework.ops.Tensor

Hope this helps :)
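
One caveat, as far as I can tell: a tensor holds a single dtype, so a DataFrame that mixes numeric and string columns is unlikely to convert directly. A hedged sketch, assuming TensorFlow 2.x, that keeps only the numeric columns first (the frame and its column names are hypothetical):

```python
import pandas as pd
import tensorflow as tf

# Hypothetical frame with one non-numeric column.
df = pd.DataFrame({"depth": [1.5, 2.5], "count": [3, 4], "label": ["x", "y"]})

# Keep only the numeric columns and give them one common dtype.
numeric = df.select_dtypes(include="number")
tensor = tf.convert_to_tensor(numeric.to_numpy(dtype="float32"))

print(tensor.shape)
```

Whether dropping the non-numeric columns is acceptable depends on the data; encoding them is another option that this sketch does not cover.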

Answered by VS_FF

The following works easily with NumPy array input data:

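import numpy as np
import tensorflow as tf

a = np.array([1, 2, 3])

# TensorFlow 1.x style: tensors are evaluated inside a session.
with tf.Session() as sess:
    # Only needed when the graph contains variables; harmless here.
    tf.global_variables_initializer().run()

    dataVar = tf.constant(a)
    print(dataVar.eval())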

-> [1 2 3]

Don't forget to start the session and run() or eval() your tensor object to see its contents; otherwise you will just get its generic description.

I suspect that, since your data is in a DataFrame rather than a simple array, you need to experiment with the shape parameter of tf.constant(), which you are currently not specifying, to help it understand the dimensionality of the DataFrame and deal with its indices, etc.
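
For comparison, a minimal sketch of the same example assuming TensorFlow 2.x, where eager execution is the default and no session is involved. This is an assumption about the reader's setup, not what the original poster was running (TensorFlow 0.12):

```python
import numpy as np
import tensorflow as tf

a = np.array([1, 2, 3])
data_var = tf.constant(a)

# Under eager execution the values can be read back directly as NumPy.
print(data_var.numpy())
```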

Answered by StefanQ

hottbox.pdtools.utils (the Pandas integration tools of the HOTTBOX API) provides the functions

   pd_to_tensor(df[, keep_index])
   tensor_to_pd(tensor[, col_name])

for conversion in both directions.

Answered by Heather Walker

You can use tf.estimator.inputs.pandas_input_fn in your make_input_fn(X, y, num_epochs) function. I've not managed to get it to work with a multi-index, however. I fixed this by turning the index into a standard integer index, using df.reset_index(drop=True).
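
A small pandas-only sketch of the index fix described above. The multi-index frame and its level names are invented for illustration:

```python
import pandas as pd

# Hypothetical frame with a two-level index.
index = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=["group", "step"])
df = pd.DataFrame({"x": [0.1, 0.2, 0.3, 0.4]}, index=index)

# Replace the multi-index with a plain 0..n-1 integer index.
flat = df.reset_index(drop=True)

print(flat.index)
```

Note that drop=True discards the index levels; calling reset_index() without it keeps them as ordinary columns instead.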

Answered by praveen kumar

You can convert a dataframe column to a tensor object like so:

tf.constant(df['column_name'])

This should return a tensor that looks something like this:

<tf.Tensor: id=275634, shape=(48895,), dtype=float64, numpy=
array([1, 2, ...])>

Also, you can add as many dataframe columns as you want, like so:

tf.constant([df['column1'], df['column2']])

Hope this helps.
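
If the goal is a 2-D tensor with one row per DataFrame row, one option is to select the columns first and convert them together. A hedged sketch, assuming TensorFlow 2.x and a made-up frame that reuses the placeholder column names from above:

```python
import pandas as pd
import tensorflow as tf

# Hypothetical frame; "other" is a column we do not want in the tensor.
df = pd.DataFrame({
    "column1": [1.0, 2.0, 3.0],
    "column2": [4.0, 5.0, 6.0],
    "other": [7, 8, 9],
})

# Select the two columns, then convert the resulting 2-D array.
pair = tf.constant(df[["column1", "column2"]].to_numpy())

print(pair.shape)
```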
