Pandas DataFrame apply() method provides a row object, but how do you access the index value?
Original question: http://stackoverflow.com/questions/24698283/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Paul H
I am new to pandas and DataFrames and have run into an issue. The DataFrame.apply() method passes a row parameter to the provided function. However, I can't work out from this row parameter what the index value corresponding to that row is.
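An example:
    import numpy as np
    from pandas import DataFrame

    df = DataFrame({'a': np.random.randn(6),
                    'b': ['foo', 'bar'] * 3,
                    'c': np.random.randn(6)})
    df = df.set_index('a')  # 'a' is now the index, no longer a column

    def my_test2(row):
        # row['a'] is the lookup that fails: 'a' is not among the row's labels
        return "{}.{}".format(row['a'], row['b'])

    df['Value'] = df.apply(my_test2, axis=1)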
This yields a KeyError:
    KeyError: ('a', u'occurred at index -1.16119852166')
The problem is that row['a'] in the my_test2 function fails. If I skip df.set_index('a') it works fine, but I do want to have an index on a.
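To make the failure concrete, here is a minimal sketch of what the row object contains after set_index. The fixed values are an assumption, swapped in for np.random.randn so the result is reproducible; the variable names row_labels, has_a, and row_label are illustrative only.

```python
import pandas as pd

# Fixed values replace the random data from the question (assumption).
df = pd.DataFrame({'a': [0.1, 0.2, 0.3, 0.4],
                   'b': ['foo', 'bar'] * 2,
                   'c': [1.0, 2.0, 3.0, 4.0]})
df = df.set_index('a')

row = df.iloc[0]               # the same kind of Series that apply(axis=1) passes in
row_labels = list(row.index)   # labels you can look up with row[...]
has_a = 'a' in row.index       # 'a' became the index, so it is not a row label
row_label = row.name           # the index value of this row lives here instead
```

With this data, the row only carries the labels b and c, which is why row['a'] raises a KeyError.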
I tried duplicating column a (once as the index, and once as a column) and this works, but it seems ugly and problematic.
Any ideas on how to get the corresponding index value given the row object?
Many thanks in advance.
Answered by BKay
I believe what you want is this:
    def my_test(row):
        return "{}.{}".format(row.name, row['b'])
This works because:
    "{}.{}".format("ham", "cheese")
returns
    'ham.cheese'
and if you reference a single row, its name attribute returns that row's index value. For the example above,
    df.iloc[0]
returns
    b                           foo
    c                      1.417726
    Value    0.7842562355491481.foo
    Name: 0.784256235549, dtype: object
The Name shown on the last line is the row's index value, which is what df.iloc[0].name returns.
Therefore, for the ith row, this function is equivalent to looking up the row's index value and executing:
    "{}.{}".format(df.iloc[i].name, df.iloc[i]['b'])
The apply function then does this for every row.
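Putting the pieces together, a self-contained sketch of the answer might look like the following. The fixed index values are an assumption used in place of the random data so the result is predictable, and the values variable is only there to inspect the outcome.

```python
import pandas as pd

# Fixed values replace np.random.randn from the question (assumption).
df = pd.DataFrame({'a': [1.5, 2.5, 3.5],
                   'b': ['foo', 'bar', 'foo'],
                   'c': [10, 20, 30]}).set_index('a')

def my_test(row):
    # row.name holds the index label of the row currently being processed
    return "{}.{}".format(row.name, row['b'])

# axis=1 passes each row to my_test as a Series
df['Value'] = df.apply(my_test, axis=1)
values = df['Value'].tolist()
```

Here each entry of Value combines the row's index value with its b entry, without keeping a duplicate copy of a as a column.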

