Pandas: Use index value in apply function over a column
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/39151978/
Asked by Fizi
I was wondering if there's a way to use the index value while using apply over a column of a dataframe. Suppose I have a df like:
  col1     col2
0    a  [0,1,2]
1    b    [0,2]
2    c  [0,1,2]
I want to write an apply function on df.col2 that removes each row's index value from the list in col2, leaving a df like:
  col1   col2
0    a  [1,2]
1    b  [0,2]
2    c  [0,1]
The index value may or may not be in the list, but if it is there it should be removed. Note that this isn't my actual use case, just similar to what I need. I have
df.col2.apply(lambda x: f(x))
and inside f(x) I want to be able to access the index value of x, if that's possible, or else use a workaround. I know that df.apply() can work on the column values and df.index.map() can work on the index. Is there a method in Pandas that combines the two use cases in a single elegant solution? Thanks for the help.
UPDATE: the index is an integer, constrained so that the labels are consecutive whole numbers. col2 holds a list for each index. I want to check whether the index is in that list and remove it from the list if it is. For example, if row index 3 has the list [27,36,3,9,7], I want to drop 3 from the list. The list is unordered.
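To make the question easier to reproduce, here is a minimal sketch that builds the example frame and shows the limitation described above. It assumes the default integer index (0, 1, 2); the variable names `lengths` and `doubled` are only illustrative.

```python
import pandas as pd

# Example frame from the question; the default RangeIndex gives labels 0, 1, 2
df = pd.DataFrame({"col1": ["a", "b", "c"],
                   "col2": [[0, 1, 2], [0, 2], [0, 1, 2]]})

# Series.apply passes only the cell value to the function, not the index label
lengths = df["col2"].apply(len)

# Index.map passes only the index label, not the cell value
doubled = df.index.map(lambda i: i * 2)

print(lengths.tolist())  # [3, 2, 3]
print(doubled.tolist())  # [0, 2, 4]
```

Neither call sees both the label and the value at once, which is the gap the question is about.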
Answered by fuglede
If I understand your question correctly, this should do the job:
df.apply(lambda x: x.name in x.col2 and x.col2.remove(x.name), axis=1)
With the example from the original post:
In [226]: df
Out[226]:
  col1       col2
0    a  [0, 1, 2]
1    b     [0, 2]
2    c  [0, 1, 2]
In [227]: df.apply(lambda x: x.name in x.col2 and x.col2.remove(x.name), axis=1);
In [228]: df
Out[228]:
  col1    col2
0    a  [1, 2]
1    b  [0, 2]
2    c  [0, 1]
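The one-liner above works because, with axis=1, each row is passed as a Series whose .name is the row's index label, and the `and` short-circuits so that remove() is only called when the label is present. It mutates the lists in place and the value returned by apply is discarded. As a hedged alternative, the sketch below builds new lists instead. The helper name `drop_own_label` is hypothetical, and it assumes a pandas version in which a row-wise apply that returns lists yields a Series of lists.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", "c"],
                   "col2": [[0, 1, 2], [0, 2], [0, 1, 2]]})

def drop_own_label(row):
    # With axis=1 each row arrives as a Series whose .name is its index label
    return [v for v in row["col2"] if v != row.name]

# Build new lists rather than calling list.remove, so the original lists stay untouched
result = df.assign(col2=df.apply(drop_own_label, axis=1))

print(result["col2"].tolist())  # [[1, 2], [0, 2], [0, 1]]
print(df["col2"].tolist())      # [[0, 1, 2], [0, 2], [0, 1, 2]]
```

One difference worth knowing: list.remove drops only the first occurrence of the label, while the comprehension drops every occurrence.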
Answered by piRSquared
Answered by Siraj S.
Maybe you can try this. It will not delete the index values from the list but will replace them with NaN:
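import numpy as np
import pandas as pd
df = pd.DataFrame({'a':list('mno'),'b':[[1,2,3],[1,3,4],[5,6,2]]})
df1 = df.b.apply(pd.Series)
# keep values that differ from the row's index label; matching values become NaN
df['b'] = np.array(df1[df1.apply(lambda x: x!=df.index.values)]).tolist()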
Out[111]:
   a                b
0  m  [1.0, 2.0, 3.0]
1  n  [nan, 3.0, 4.0]
2  o  [5.0, 6.0, nan]
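If the NaN placeholders and float values are not wanted, the same masking idea can be followed by a step that drops the NaNs and converts back to integers. This is only a sketch under the assumption that the lists hold integers; the names `wide`, `mask`, `masked` and `cleaned` are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"a": list("mno"), "b": [[1, 2, 3], [1, 3, 4], [5, 6, 2]]})

# Expand each list into its own columns, one row per original row
wide = pd.DataFrame(df["b"].tolist(), index=df.index)

# True where a value differs from its row's index label (compared row-wise)
mask = wide.ne(wide.index.to_series(), axis=0)

# Values equal to the row label become NaN
masked = wide.where(mask)

# Drop the NaNs per row and restore integer values
cleaned = [row.dropna().astype(int).tolist() for _, row in masked.iterrows()]

print(cleaned)  # [[1, 2, 3], [3, 4], [5, 6]]
```

The result could then be assigned back with df['b'] = cleaned.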