Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/15939811/

Time: 2020-09-13 20:45:41  Source: igfitidea

Pandas list comprehension in a dataframe

Tags: python-2.7, pandas, dataframe

Asked by Michele Reilly

I would like to pull out the price at the next day's open, which is currently stored in (row + 1), and store it in a new column if some condition is met.

df['b']=''

df['shift']=''

df['shift']=df['open'].shift(-1)

df['b']=df[x for x in df['shift'] if df["MA10"]>df["MA100"]]

Answered by DSM

There are a few approaches. Using apply:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.read_csv("bondstack.csv")
>>> df["shift"] = df["open"].shift(-1)
>>> df["b"] = df.apply(lambda row: row["shift"] if row["MA10"] > row["MA100"] else np.nan, axis=1)

which produces

>>> df[["MA10", "MA100", "shift", "b"]][:10]
        MA10      MA100      shift          b
0  16.915625  17.405625  16.734375        NaN
1  16.871875  17.358750  17.171875        NaN
2  16.893750  17.317187  17.359375        NaN
3  16.950000  17.279062  17.359375        NaN
4  17.137500  17.254062  18.640625        NaN
5  17.365625  17.229063  18.921875  18.921875
6  17.550000  17.200312  18.296875  18.296875
7  17.681250  17.177500  18.640625  18.640625
8  17.812500  17.159375  18.609375  18.609375
9  17.943750  17.142813  18.234375  18.234375
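Since bondstack.csv is not included here, the following is a minimal sketch of the same apply approach on a small made-up DataFrame (the values are hypothetical, chosen only so the condition is true on some rows). It also shows that shift(-1) leaves NaN in the last row, because there is no next day to pull from.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values, not taken from bondstack.csv).
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0],
    "MA10":  [1.0, 3.0, 2.0, 5.0],
    "MA100": [2.0, 2.0, 3.0, 4.0],
})

# shift(-1) moves every value up one row, so row i holds row i+1's open.
df["shift"] = df["open"].shift(-1)

# Row-wise: keep the shifted value only where MA10 > MA100, else NaN.
df["b"] = df.apply(
    lambda row: row["shift"] if row["MA10"] > row["MA100"] else np.nan,
    axis=1,
)
print(df)
```

Note that the last row satisfies the condition but still ends up as NaN in "b", since its shifted value is already NaN.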

For a more vectorized approach, you could use

>>> df = pd.read_csv("bondstack.csv")
>>> df["b"] = np.nan
>>> df.loc[df["MA10"] > df["MA100"], "b"] = df["open"].shift(-1)

or my preferred approach:

>>> df = pd.read_csv("bondstack.csv")
>>> df["b"] = df["open"].shift(-1).where(df["MA10"] > df["MA100"])
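As a rough check that the three approaches agree, here is a sketch on a small made-up DataFrame (the values are hypothetical). The mask assignment below goes through .loc, which is a substitution on my part to avoid chained indexing; the names cond, next_open, by_apply, by_mask, and by_where are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values standing in for bondstack.csv).
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0],
    "MA10":  [1.0, 3.0, 2.0, 5.0],
    "MA100": [2.0, 2.0, 3.0, 4.0],
})
cond = df["MA10"] > df["MA100"]
next_open = df["open"].shift(-1)

# Row-wise apply.
by_apply = df.assign(shift=next_open).apply(
    lambda row: row["shift"] if row["MA10"] > row["MA100"] else np.nan,
    axis=1,
)

# Boolean-mask assignment through .loc, done on a copy of the frame.
masked = df.copy()
masked["b"] = np.nan
masked.loc[cond, "b"] = next_open
by_mask = masked["b"]

# Series.where keeps values where cond is True and puts NaN elsewhere.
by_where = next_open.where(cond)

# All three should match (NaN positions included).
pd.testing.assert_series_equal(by_apply, by_where, check_names=False)
pd.testing.assert_series_equal(by_mask, by_where, check_names=False)
```

The where form needs no placeholder column and no intermediate "shift" column, which may be part of why it reads more cleanly.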

Answered by user3226167

Modifying DSM's approach 3 by stating the True/False values explicitly in np.where:

#numpy.where(condition, x, y)
df["b"] = np.where(df["MA10"] > df["MA100"], df["open"].shift(-1), np.nan)

Using list comprehension explicitly:

# From the np.where documentation: [xv if c else yv for (c, xv, yv) in zip(condition, x, y)]
df["b"] = [xv if c else np.nan for (c, xv) in zip(df["MA10"] > df["MA100"], df["open"].shift(-1))]
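To see that np.where and the list comprehension produce the same values, here is a small sketch on made-up data (the values and the names cond, next_open, via_where, and via_listcomp are hypothetical). One difference worth knowing: np.where returns a NumPy array, while the comprehension returns a plain Python list; either can be assigned to a column.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values, not taken from the original dataset).
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0],
    "MA10":  [1.0, 3.0, 2.0, 5.0],
    "MA100": [2.0, 2.0, 3.0, 4.0],
})
cond = df["MA10"] > df["MA100"]
next_open = df["open"].shift(-1)

# Vectorized: picks next_open where cond is True, NaN otherwise.
via_where = np.where(cond, next_open, np.nan)

# Element by element: the same selection written as a list comprehension.
via_listcomp = [xv if c else np.nan for c, xv in zip(cond, next_open)]

# assert_array_equal treats NaNs in matching positions as equal.
np.testing.assert_array_equal(via_where, np.array(via_listcomp))

df["b"] = via_where
print(df)
```

The comprehension loops in Python, so on large frames the np.where form is likely to be faster, though I have not measured it here.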