Source: Stack Overflow question, http://stackoverflow.com/questions/37967070/. The content is licensed under CC BY-SA 4.0; if you reuse it, keep the same license and attribute it to the original authors, citing the original URL.

Using a shift() function within an apply function to compare rows in a Pandas Dataframe

Tags: python, python-2.7, pandas

Asked by user2242044

I would like to use shift() to pull in the value from the previous row, provided the value in one of the columns, Letter, is the same in both rows.

import pandas as pd
df = pd.DataFrame(data=[['A', 'one'],
                        ['A', 'two'],
                        ['B', 'three'],
                        ['B', 'four'],
                        ['C', 'five']],
                  columns=['Letter', 'value'])

df['Previous Value'] = df.apply(lambda x : x['value'] if x['Letter'].shift(1) == x['Letter'] else "", axis=1)
print(df)

I am getting the error:

AttributeError: ("'str' object has no attribute 'shift'", u'occurred at index 0')

Desired Output:

  Letter  value Previous Value
0      A    one               
1      A    two            one
2      B  three               
3      B   four          three
4      C   five               

Answered by EdChum

Use where with a condition that tests whether the current row's Letter matches the previous row's Letter, obtained with shift:

In [11]:
df = pd.DataFrame(data=[['A', 'one'],
                        ['A', 'two'],
                        ['B', 'three'],
                        ['B', 'four'],
                        ['C', 'five']],
                  columns=['Letter', 'value'])
df['Previous Value'] = df['value'].shift().where(df['Letter'].shift() == df['Letter'], '')
df
Out[11]:
  Letter  value Previous Value
0      A    one               
1      A    two            one
2      B  three               
3      B   four          three
4      C   five               
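
To make the one-liner easier to follow, here is a minimal sketch that splits it into its intermediate steps. The variable names same_as_previous and previous_value are introduced here only for illustration and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['A', 'one'], ['A', 'two'], ['B', 'three'], ['B', 'four'], ['C', 'five']],
    columns=['Letter', 'value'],
)

# True where a row's Letter equals the Letter one row above.
# Row 0 is compared against a missing value, which never compares equal.
same_as_previous = df['Letter'].shift() == df['Letter']

# Candidate result for each row: the value from the row above.
previous_value = df['value'].shift()

# Keep the candidate where the mask is True, otherwise use an empty string.
df['Previous Value'] = previous_value.where(same_as_previous, '')

print(df)
```

Both shift() calls operate on whole columns (Series), which is why they work here while calling shift() on a single cell inside apply does not.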

Answered by Dmitry Andreev

You are trying to apply .shift() to a single value (the string in one column of one row) instead of a Series. With axis=1, apply passes each row to the lambda, so x['Letter'] is a plain str. I would do this using groupby:

In [6]: df['Previous letter'] = df.groupby('Letter').value.shift()

In [7]: df
Out[7]:
  Letter  value Previous letter
0      A    one             NaN
1      A    two             one
2      B  three             NaN
3      B   four           three
4      C   five             NaN
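
The two answers agree on the sample data, but they may differ when rows with the same letter are not adjacent. The sketch below uses hypothetical data (not from the question) in which A reappears after a B row; the column names by_group and by_adjacent_row are made up for the comparison, and fillna('') is added only to match the empty strings in the desired output.

```python
import pandas as pd

# Hypothetical data: the letter A reappears after a B row.
df = pd.DataFrame({
    'Letter': ['A', 'B', 'A'],
    'value': ['one', 'two', 'three'],
})

# groupby + shift: previous value within the same Letter group,
# whether or not the rows are next to each other.
df['by_group'] = df.groupby('Letter')['value'].shift().fillna('')

# shift + where: the previous row's value, kept only if that
# immediately preceding row has the same Letter.
df['by_adjacent_row'] = df['value'].shift().where(
    df['Letter'].shift() == df['Letter'], ''
)

print(df)
```

Which behavior is wanted depends on whether "previous" means the previous row in the frame or the previous row of the same letter; the question's sample data does not distinguish the two.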