Python: converting a NumPy array to a pandas DataFrame
Original source: http://stackoverflow.com/questions/50624046/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Convert numpy array to pandas dataframe
Asked by konstantin
I have a NumPy array of size 31x36 and I want to convert it into a pandas DataFrame in order to process it. I am trying to convert it using the following code:
pd.DataFrame(data=matrix,
index=np.array(range(1, 31)),
columns=np.array(range(1, 36)))
However, I am receiving the following error:
ValueError: Shape of passed values is (36, 31), indices imply (35, 30)
How can I solve the issue and transform it properly?
Accepted answer by EdChum
What you tried failed because both ranges are off by 1. The corrected call is:
pd.DataFrame(data=matrix,
index=np.array(range(1, 32)),
columns=np.array(range(1, 37)))
The stop values must be 32 and 37 because the last value isn't included in the range.
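The off-by-one behaviour can be reproduced with a short sketch. The all-zeros matrix is only a stand-in for the asker's data (an assumption, since the original array is not shown), and the failed flag is a hypothetical name used to record whether construction raised an error.

```python
import numpy as np
import pandas as pd

# Stand-in for the asker's data: any 31x36 array reproduces the problem.
matrix = np.zeros((31, 36))

# range(1, 31) yields only 30 labels for 31 rows, so construction fails.
try:
    pd.DataFrame(data=matrix, index=range(1, 31), columns=range(1, 36))
    failed = False
except ValueError:
    failed = True

# With each stop value raised by one, the labels match the array's shape.
df = pd.DataFrame(data=matrix,
                  index=np.arange(1, 32),
                  columns=np.arange(1, 37))
```

After this runs, failed should be True and df should have 31 rows labelled 1 to 31 and 36 columns labelled 1 to 36.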
Looking at what you're doing, you could simply have used np.arange:
pd.DataFrame(data=matrix,
             index=np.arange(1, 32),
             columns=np.arange(1, 37))
Or in pure pandas:
pd.DataFrame(data=matrix,
index=pd.RangeIndex(range(1, 32)),
columns=pd.RangeIndex(range(1, 37)))
Also, if you don't specify the index and columns parameters, an auto-generated index and columns are created, both starting from 0. It is unclear why you need them to start from 1.
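A minimal sketch of the default labelling, assuming a placeholder 31x36 array in place of the real data. The variable names holding the labels are illustrative only.

```python
import numpy as np
import pandas as pd

matrix = np.zeros((31, 36))  # hypothetical placeholder data

# No index or columns passed: pandas generates 0-based labels itself.
df = pd.DataFrame(matrix)

first_row_label = df.index[0]    # first auto-generated row label
last_row_label = df.index[-1]    # last auto-generated row label
last_col_label = df.columns[-1]  # last auto-generated column label
```

Here the row labels run from 0 to 30 and the column labels from 0 to 35, one less than the corresponding dimension in each case.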
You could also skip the index and columns parameters entirely and simply shift the labels after construction:
In[9]:
df = pd.DataFrame(adaption)
df.columns = df.columns+1
df.index = df.index + 1
df
Out[9]:
1 2 3 4 5 6
1 -2.219072 -1.637188 0.497752 -1.486244 1.702908 0.331697
2 -0.586996 0.040052 1.021568 0.783492 -1.263685 -0.192921
3 -0.605922 0.856685 -0.592779 -0.584826 1.196066 0.724332
4 -0.226160 -0.734373 -0.849138 0.776883 -0.160852 0.403073
5 -0.081573 -1.805827 -0.755215 -0.324553 -0.150827 -0.102148
Answer by ACascarino
In addition to the above answer: range(1, X) describes the set of numbers from 1 up to X-1 inclusive. You need to use range(1, 32) and range(1, 37) to do what you describe.
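The range semantics can be checked directly in plain Python. The value chosen for X below is arbitrary and used only for illustration.

```python
X = 5  # arbitrary example value

# range(1, X) stops before X, so it covers 1 through X-1.
labels = list(range(1, X))

# The stop values needed for a 31x36 array are therefore 32 and 37.
n_rows = len(range(1, 32))
n_cols = len(range(1, 37))
```

With X set to 5, labels holds 1 through 4, and the two lengths come out as 31 and 36, matching the array's dimensions.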
Answer by jpp
You get an error because the end argument in range(start, end) is non-inclusive. You have a couple of options to account for this:
Don't pass index and columns
Just use df = pd.DataFrame(matrix). The pd.DataFrame constructor adds integer indices implicitly.
Pass in the shape of the array
matrix.shape gives a tuple of row and column counts, so you need not specify them manually. For example:
df = pd.DataFrame(matrix, index=range(matrix.shape[0]),
columns=range(matrix.shape[1]))
If you need to start at 1, remember to add 1:
df = pd.DataFrame(matrix, index=range(1, matrix.shape[0] + 1),
columns=range(1, matrix.shape[1] + 1))
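If this is needed in more than one place, the shape-based approach could be wrapped in a small function. The helper name frame_with_one_based_labels is hypothetical (it is not part of pandas), and the zeros array again stands in for real data.

```python
import numpy as np
import pandas as pd

def frame_with_one_based_labels(matrix):
    """Hypothetical helper: wrap a 2-D array in a DataFrame labelled from 1."""
    n_rows, n_cols = matrix.shape
    # Add 1 to each stop value because range() excludes its end argument.
    return pd.DataFrame(matrix,
                        index=range(1, n_rows + 1),
                        columns=range(1, n_cols + 1))

df = frame_with_one_based_labels(np.zeros((31, 36)))
```

Because the labels are derived from matrix.shape, the same call keeps working if the array's dimensions change.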