The loc function in Python pandas
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/31571217/
loc function in pandas
Asked by kenway
Can anybody explain why loc is used in Python pandas, with examples like the one shown below?
for i in range(0, 2):
    for j in range(0, 3):
        df.loc[(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1),
               'AgeFill'] = median_ages[i,j]
Accepted answer by KirstieJane
The use of .loc is recommended here because indexing with the conditions df.Age.isnull(), df.Gender == i and df.Pclass == j+1 one after another may return a view of a slice of the data frame or may return a copy. This can confuse pandas.
If you don't use .loc, you end up applying all 3 conditions in series, which leads to a problem called chained indexing. When you use .loc, however, you access all your conditions in one step and pandas is no longer confused.
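A minimal sketch of the difference between the two-step and one-step forms. The toy data below is hypothetical (only the column names come from the question), and the explicit .copy() is my addition so that the "intermediate object is a copy" case is reproducible rather than left to pandas internals.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame({
    "Age":    [22.0, np.nan, 30.0, np.nan],
    "Gender": [0, 0, 1, 1],
    "Pclass": [1, 1, 2, 2],
})
df["AgeFill"] = df["Age"]

# Combine the three conditions into a single boolean row mask.
mask = df["Age"].isnull() & (df["Gender"] == 0) & (df["Pclass"] == 1)

# Two steps: select rows first, then assign on the intermediate object.
# The .copy() makes it explicit that this object is separate from df.
intermediate = df[mask].copy()
intermediate["AgeFill"] = 25.0
lost_write = df.loc[1, "AgeFill"]  # df itself was not modified

# One step: rows and column are given to .loc together,
# so the assignment is applied to df directly.
df.loc[mask, "AgeFill"] = 25.0
```

After the two-step version the value in df is still missing; only the single .loc call writes into the original data frame.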
You can read more about this, along with some examples of when not using .loc will cause the operation to fail, in the pandas documentation.
The simple answer is that while you can often get away with not using .loc and simply typing (for example)
df['Age_fill'][(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1)] \
= median_ages[i,j]
you'll always get the SettingWithCopy warning and your code will be a little messier for it.
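For comparison, here is a sketch of the question's loop made runnable with .loc. The data frame is hypothetical toy data, and the way median_ages is built (a 2x3 NumPy array of per-group medians) is an assumption, since the question does not show how it was computed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the question's df.
df = pd.DataFrame({
    "Age":    [20.0, 30.0, np.nan, 40.0, np.nan, 50.0],
    "Gender": [0, 0, 0, 1, 1, 1],
    "Pclass": [1, 1, 1, 2, 2, 2],
})
df["AgeFill"] = df["Age"]

# Assumed layout: median_ages[i, j] is the median age
# for Gender == i and Pclass == j + 1 (NaN if the group is empty).
median_ages = np.zeros((2, 3))
for i in range(0, 2):
    for j in range(0, 3):
        group = df[(df.Gender == i) & (df.Pclass == j + 1)]
        median_ages[i, j] = group["Age"].median()

# Fill the missing ages with one .loc assignment per group.
for i in range(0, 2):
    for j in range(0, 3):
        df.loc[(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j + 1),
               "AgeFill"] = median_ages[i, j]
```

Each assignment names the row mask and the target column in the same .loc call, so there is no intermediate selection to assign into.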
In my experience, .loc took me a while to get my head around, and it's been a bit annoying updating my code. But it's really super simple and very intuitive: df.loc[row_index,col_indexer].
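A few common shapes of df.loc[row_index,col_indexer], as a rough sketch. The data frame and its index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"Age": [22, 35, 58], "Pclass": [3, 1, 2]},
    index=["a", "b", "c"],
)

# A single row label and a single column label give one value.
one_value = df.loc["b", "Age"]

# A boolean mask selects rows; the result here is a Series of Pclass values.
some_rows = df.loc[df["Age"] > 30, "Pclass"]

# Label slices include both endpoints; a list of columns keeps a DataFrame.
label_slice = df.loc["a":"b", ["Age"]]

# The same syntax on the left-hand side assigns into df.
df.loc["c", "Pclass"] = 1
```

The row part always comes first and the column part second, whether you are reading or assigning.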
For more information see the pandas documentation on Indexing and Selecting Data.