pandas: Iterating through a two dimensional array in Python?
Original URL: http://stackoverflow.com/questions/43510710/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Iterating through a two dimensional array in Python?
Asked by Christina de L
I'm trying to iterate through a two dimensional array in Python and compare items in the array to ints, but I run into a variety of errors whenever I attempt to do so. I'm using numpy and pandas.
My dataset is created as follows:
import pandas

filename = "C:/Users/User/My Documents/JoeTest.csv"
datas = pandas.read_csv(filename)
dataset = datas.values
Then, I attempt to go through the data, grabbing certain elements of it.
def model_building(data):
    global blackKings
    flag = 0;
    blackKings.append(data[0][1])
    for i in data:
        if data[i][39] == 1:
            if data[i][40] == 1:
                values.append(1)
            else:
                values.append(-1)
        else:
            if data[i][40] == 1:
                values.append(-1)
            else:
                values.append(1)
        for j in blackKings:
            if blackKings[j] != data[i][1]:
                flag = 1
        if flag == 1:
            blackKings.append(data[i][1])
            flag = 0;
However, doing so leaves me with a ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). I don't want to use either of these, as I'm looking to compare the actual value of that one specific element. Is there another way around this problem?
Answered by hpaulj
You need to tell us something about this: dataset = datas.values
It's probably a 2d array, since it derives from a load of a csv. But what shape and dtype? Maybe even a sample of the array.
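As a rough sketch of how that information could be reported, the snippet below loads a small made-up CSV from memory (the csv_text contents are an assumption, since JoeTest.csv is not shown) and prints the type, shape, dtype, and a sample of the resulting array.

```python
import io
import pandas

# Made-up CSV text standing in for JoeTest.csv, whose contents are not shown.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"
datas = pandas.read_csv(io.StringIO(csv_text))
dataset = datas.values

print(type(dataset).__name__)  # ndarray
print(dataset.shape)           # (2, 3): rows, columns
print(dataset.dtype)           # element type of the array
print(dataset[:2])             # a small sample of the rows
```

Posting this kind of output alongside the code usually makes an indexing problem much easier to diagnose.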
Is that the data argument in the function?
What are blackKings and values? You treat them like lists (with append).
for i in data:
    if data[i][39] == 1:
This doesn't make sense. With for i in data, if data is 2d, i is the first row, then the second row, etc. If you want i to be an index, use something like
for i in range(data.shape[0]):
2d array indexing is normally done with data[i,39].
But in your case data[i][39] is probably an array.
Any time you use a multi-element array in an if statement, you'll get this ValueError, because there are multiple truth values and Python cannot pick one.
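A minimal sketch of that failure, using a small made-up array (the names arr and message are only illustrative): comparing a whole array to a number gives an array of booleans, which if cannot reduce to a single True or False, while a single element compares cleanly.

```python
import numpy as np

arr = np.array([1, 0, 1])  # hypothetical data

try:
    if arr == 1:  # arr == 1 is array([ True, False,  True])
        pass
    message = "no error"
except ValueError as exc:
    message = str(exc)

print(message)

# A single element compares to one boolean, so this works.
if arr[0] == 1:
    print("first element is 1")
```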
If i were a proper index, then data[i,39] would be a single value.
To illustrate:
In [41]: data=np.random.randint(0,4,(4,4))
In [42]: data
Out[42]:
array([[0, 3, 3, 2],
       [2, 1, 0, 2],
       [3, 2, 3, 1],
       [1, 3, 3, 3]])
In [43]: for i in data:
    ...:     print('i',i)
    ...:     print('data[i]',data[i].shape)
    ...:
i [0 3 3 2]    # 1st row
data[i] (4, 4)
i [2 1 0 2]    # 2nd row
data[i] (4, 4)
...
Here i is a 4 element array; using that as an index, data[i] picks four whole rows and produces a (4, 4) array. It isn't selecting one value, but rather many values.
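The same effect can be reproduced outside the loop. This sketch reuses the array printed above; the names row, picked, and cell are illustrative. Indexing with a row of integers selects whole rows, whereas a pair of integer indices selects a single element.

```python
import numpy as np

data = np.array([[0, 3, 3, 2],
                 [2, 1, 0, 2],
                 [3, 2, 3, 1],
                 [1, 3, 3, 3]])

row = data[0]        # what `for i in data` yields first: [0 3 3 2]
picked = data[row]   # rows 0, 3, 3 and 2 of data, stacked
print(picked.shape)  # (4, 4)
print(picked.ndim)   # 2

cell = data[2, 3]    # two integer indices select one element
print(cell)          # 1
```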
Instead you need to iterate in one of these ways:
In [46]: for row in data:
    ...:     if row[3]==1:
    ...:         print(row)
[3 2 3 1]
In [47]: for i in range(data.shape[0]):
    ...:     if data[i,3]==1:
    ...:         print(data[i])
[3 2 3 1]
To debug a problem like this you need to look at intermediate values, and especially their shapes. Don't just assume. Check!
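One way to do that checking, sketched with a stand-in array (np.arange data here is an assumption, not the real dataset), is to print the type, shape, and dtype of the loop variable before indexing with it.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)  # stand-in for the real dataset

for i in data:
    # Inspect what the loop variable really is before using it as an index.
    print(type(i).__name__, i.shape, i.dtype)

first = next(iter(data))  # the first thing the loop yields
```

If the shape printed is something like (4,) rather than (), the loop variable is a row, not an index.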
Answered by piRSquared
I'm going to attempt to rewrite your function
import numpy as np

def model_building(data):
    global blackKings
    # Keep blackKings as a NumPy array so it can be broadcast below
    blackKings = np.append(blackKings, data[0, 1])
    # Your nested if statements were performing an xor
    # This is a vectorized version of the same thing
    values = np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1
    # not sure where `values` is defined. If you really wanted to
    # append to it, you can do
    # values = np.append(values, np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1)
    # Your blackKings / flag logic can be reduced:
    # mask marks the rows of data whose column 1 matches nothing in blackKings
    mask = (blackKings[:, None] != data[:, 1]).all(0)
    blackKings = np.append(blackKings, data[:, 1][mask])
This may not be perfect because it is difficult to parse your logic considering you are missing some pieces. But hopefully you can adopt some of what I've included here and improve your code.
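To see that the xor expression matches the nested if statements from the question, here is a small check on made-up inputs. The arrays col_a and col_b are hypothetical stand-ins for columns 39 and 40, which are not shown in the question.

```python
import numpy as np

# Hypothetical stand-ins for columns 39 and 40 of the dataset.
col_a = np.array([1, 1, 0, 0])
col_b = np.array([1, 0, 1, 0])

# Loop version, following the nested if statements in the question.
looped = []
for a, b in zip(col_a, col_b):
    if a == 1:
        looped.append(1 if b == 1 else -1)
    else:
        looped.append(-1 if b == 1 else 1)

# Vectorized version: True (exactly one column equals 1) maps to -1, False to +1.
vectorized = np.logical_xor(col_a == 1, col_b == 1) * -2 + 1

print(looped)               # [1, -1, -1, 1]
print(vectorized.tolist())  # [1, -1, -1, 1]
```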