Pandas indexing and KeyError

Question

Asked by Yash

Consider the following:

import pandas as pd

d = {'a': 0.0, 'b': 1.0, 'c': 2.0}

e = pd.Series(d, index=['a', 'b', 'c'])

df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})

When I try to access the first row of column B in the following way:

>>> df.B[0]
0.0

I get the correct result.

However, after reading "KeyError: 0 when accessing value in pandas series", I assumed that, since I have specified the index as 'a', 'b' and 'c', the correct way to access the first row of column B by position is df.B.iloc[0], and that df.B[0] should raise a KeyError. I don't know what I am missing. Can someone clarify in which cases I get a KeyError?

Answer 1

Answered by Justinas Marozas

The problem in the question you reference is that the index of the given DataFrame is made of integers but does not start from 0.

The behaviour of pandas when you ask for df.B[0] is ambiguous: it depends on the data type of the index and on the data type of the value passed inside the square brackets. It can behave like df.B.loc[0] (index label based), like df.B.iloc[0] (position based), or possibly like something else I'm not aware of. For predictable behaviour I recommend using loc and iloc.

To illustrate this with your example:

import pandas as pd

d = [0.0, 1.0, 2.0]
e = pd.Series(d, index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})

df.B[0] # 0.0 - falls back to position based (this fallback is deprecated in recent pandas releases)
df.B['0'] # KeyError - no label '0' in index
df.B['a'] # 0.0 - found label 'a' in index
df.B.loc[0] # TypeError - string index queried by integer value (recent pandas releases raise KeyError instead)
df.B.loc['0'] # KeyError - no label '0' in index
df.B.loc['a'] # 0.0 - found label 'a' in index
df.B.iloc[0] # 0.0 - position based query for row 0
df.B.iloc['0'] # TypeError - string can't be used for position
df.B.iloc['a'] # TypeError - string can't be used for position

With the example from the referenced question:

d = [0.0, 1.0, 2.0]
e = pd.Series(d, index = [4, 5, 6])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})

df.B[0] # KeyError - label 0 not in index
df.B['0'] # KeyError - label '0' not in index
df.B.loc[0] # KeyError - label 0 not in index
df.B.loc['0'] # KeyError - label '0' not in index
df.B.iloc[0] # 0.0 - position based query for row 0
df.B.iloc['0'] # TypeError - string can't be used for position

Answer 2

Answered by xyzjayne

df.B returns a pandas Series, which is why you can do positional indexing on it. If you select column B as a DataFrame instead, [] looks up column labels, so this raises a KeyError:

df[['B']][0]
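
A short sketch of why that fails and what works instead, assuming the same data as in the question. The variable names sub, outcome and first_row are only illustrative.

```python
import pandas as pd

e = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1.0, 'B': e, 'C': pd.Timestamp('20130102')})

sub = df[['B']]            # a one-column DataFrame, not a Series
try:
    sub[0]                 # [] on a DataFrame selects columns; there is no column 0
    outcome = 'no error'
except KeyError:
    outcome = 'KeyError'
print(outcome)             # KeyError

first_row = sub.iloc[0]    # first row by position, returned as a Series
print(first_row['B'])      # 0.0
print(sub.iloc[0, 0])      # 0.0, scalar at row 0, column 0
```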

Answer 3

Answered by NiGiord

df.B is actually a pandas.Series object (a shortcut for df['B']), which can be iterated. df.B[0] is no longer a "row" but just the first element of df.B, since writing df.B gives you a 1-D object.

More information is in the data structure documentation:

You can treat a DataFrame semantically like a dict of like-indexed Series objects.
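
The dict analogy can be checked directly. A small sketch, assuming the DataFrame from the question; the name col is only illustrative.

```python
import pandas as pd

e = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1.0, 'B': e, 'C': pd.Timestamp('20130102')})

col = df['B']                       # dict-style lookup by column label
print(type(col).__name__)           # Series
print(df.B.equals(col))             # True, attribute access returns the same column
print(list(df.keys()))              # ['A', 'B', 'C']
print(list(col.index))              # ['a', 'b', 'c']
print(col.index.equals(df.index))   # True, every column shares the row index
```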
