Pandas DataFrame ValueError: Shape of passed values is (X,), indices imply (X, Y)

Question

Asked by user1367204

I am getting an error and I'm not sure how to fix it.

The following seems to work:

import numpy as np
import pandas

def random(row):
   return [1,2,3,4]

df = pandas.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

df.apply(func = random, axis = 1)

and my output is:

[1,2,3,4]
[1,2,3,4]
[1,2,3,4]
[1,2,3,4]

However, when I set one of the columns to a constant value such as 1 or None:

def random(row):
   return [1,2,3,4]

df = pandas.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))
df['E'] = 1

df.apply(func = random, axis = 1)

I get the error:

ValueError: Shape of passed values is (5,), indices imply (5, 5)

I've been wrestling with this for a few days now and nothing seems to work. What is interesting is that when I change

def random(row):
   return [1,2,3,4]

to

到

def random(row):
   print([1,2,3,4])

everything seems to work normally.

一切似乎都正常工作。

This question is a clearer way of asking an earlier question, which I feel may have been confusing.

My goal is to compute a list for each row and then create a column out of that.

EDIT: I originally start with a DataFrame that has one column. I add 4 columns in 4 different apply steps, and then when I try to add another column I get this error.

Answer 1

Accepted answer by Roman Pekar

If your goal is to add a new column to the DataFrame, just write your function so that it returns a scalar value (not a list), something like this:

>>> def random(row):
...     return row.mean()

and then use apply:

>>> df['new'] = df.apply(func = random, axis = 1)
>>> df
          A         B         C         D       new
0  0.201143 -2.345828 -2.186106 -0.784721 -1.278878
1 -0.198460  0.544879  0.554407 -0.161357  0.184867
2  0.269807  1.132344  0.120303 -0.116843  0.351403
3 -1.131396  1.278477  1.567599  0.483912  0.549648
4  0.288147  0.382764 -0.840972  0.838950  0.167222

I don't know whether your new column can contain lists, but it can definitely contain tuples ((...) instead of [...]):

>>> def random(row):
...    return (1,2,3,4,5)
...
>>> df['new'] = df.apply(func = random, axis = 1)
>>> df
          A         B         C         D              new
0  0.201143 -2.345828 -2.186106 -0.784721  (1, 2, 3, 4, 5)
1 -0.198460  0.544879  0.554407 -0.161357  (1, 2, 3, 4, 5)
2  0.269807  1.132344  0.120303 -0.116843  (1, 2, 3, 4, 5)
3 -1.131396  1.278477  1.567599  0.483912  (1, 2, 3, 4, 5)
4  0.288147  0.382764 -0.840972  0.838950  (1, 2, 3, 4, 5)
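
On the open point about lists: the following is a minimal sketch, assuming a reasonably recent pandas version, of one way to end up with a list in every row of a new column. The helper name row_to_list is hypothetical. The lists are built with an ordinary loop over the rows and wrapped in a Series, so the result does not depend on how apply interprets list return values.

```python
import numpy as np
import pandas as pd

def row_to_list(row):
    # Hypothetical helper: returns a fixed list for every row.
    return [1, 2, 3, 4]

df = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))
df['E'] = 1  # the extra constant column from the question

# Compute one list per row, then wrap the lists in a Series
# aligned with the DataFrame's index before assigning.
lists = [row_to_list(row) for _, row in df.iterrows()]
df['new'] = pd.Series(lists, index=df.index)
```

Each cell of df['new'] then holds a Python list, and the column has object dtype.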

Answer 2

Answer by KeepLearning

I use the code below and it works fine:

import numpy as np
import pandas as pd
# your_data and columns stand for your own data and column names
df = pd.DataFrame(np.array(your_data), columns=columns)
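
The error message in the question compares the shape of the passed data with the shape implied by the index and columns. As a sketch with made-up values (your_data and columns below are hypothetical stand-ins), checking the array's shape before constructing the DataFrame makes such a mismatch visible:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 2 rows, 3 values per row.
your_data = [[1, 2, 3], [4, 5, 6]]
columns = ['a', 'b', 'c']

arr = np.array(your_data)
# The second dimension must match the number of column labels.
assert arr.ndim == 2 and arr.shape[1] == len(columns)
df = pd.DataFrame(arr, columns=columns)

# A one-dimensional array cannot fill three columns, so pandas
# raises a ValueError describing the shape mismatch.
mismatch = None
try:
    pd.DataFrame(np.array([1, 2, 3]), columns=columns)
except ValueError as exc:
    mismatch = str(exc)
```

The exact wording of the message stored in mismatch may differ between pandas versions.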
