Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/31458794/

Time: 2020-08-19 10:01:56  Source: igfitidea

python: using .iterrows() to create columns

python pandas

Asked by citydreams

I am trying to use a loop function to create a matrix of whether a product was seen in a particular week.

Each row in the df (representing a product) has a Close_date_wk (the week in which the product closed) and a week_diff (the number of weeks the product was listed).

import pandas
mydata = [{'subid' : 'A', 'Close_date_wk': 25, 'week_diff':3},
          {'subid' : 'B', 'Close_date_wk': 26, 'week_diff':2},
          {'subid' : 'C', 'Close_date_wk': 27, 'week_diff':2},]
df = pandas.DataFrame(mydata)

My goal is to see how many alternative products were listed for each product in each date_range.

I have set up the following loop:

for index, row in df.iterrows():
    i = 0
    max_range = row['Close_date_wk']    
    min_range = int(row['Close_date_wk'] - row['week_diff'])
    for i in range(min_range,max_range):
        col_head = 'job_week_'  +  str(i)
        row[col_head] = 1

Can you please help explain why the "row[col_head] = 1" line is neither adding a column nor adding a value to that column for that row?

For example, if:

row A has date range 1,2,3 
row B has date range 2,3  
row C has date range 3,4,5

then ideally I would like to end up with:

row A has 0 alternative products in week 1
          1 alternative product in week 2
          2 alternative products in week 3
row B has 1 alternative product in week 2
          2 alternative products in week 3
etc.

Accepted answer by EdChum

You can't mutate the df using row here to add a new column; you'd either refer to the original df or use .loc, .iloc, or .ix (note that .ix was later deprecated and has been removed as of pandas 1.0). For example:

In [29]:

import numpy as np
import pandas as pd
df = pd.DataFrame(columns=list('abc'), data=np.random.randn(5, 3))
df
Out[29]:
          a         b         c
0 -1.525011  0.778190 -1.010391
1  0.619824  0.790439 -0.692568
2  1.272323  1.620728  0.192169
3  0.193523  0.070921  1.067544
4  0.057110 -1.007442  1.706704
In [30]:

for index,row in df.iterrows():
    df.loc[index,'d'] = np.random.randint(0, 10)
df
Out[30]:
          a         b         c  d
0 -1.525011  0.778190 -1.010391  9
1  0.619824  0.790439 -0.692568  9
2  1.272323  1.620728  0.192169  1
3  0.193523  0.070921  1.067544  0
4  0.057110 -1.007442  1.706704  9
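
Applied to the data from the question, the same idea might look like the sketch below. It keeps the asker's loop and only changes where the value is written (df.loc instead of row). The final fillna step is an assumption on my part that weeks in which a product was not listed should read 0 rather than NaN.

```python
import pandas as pd

# Sample data from the question.
mydata = [
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
]
df = pd.DataFrame(mydata)

for index, row in df.iterrows():
    max_range = int(row['Close_date_wk'])
    min_range = int(row['Close_date_wk'] - row['week_diff'])
    for week in range(min_range, max_range):
        # Write through df.loc, not through row, so the DataFrame itself changes.
        df.loc[index, 'job_week_' + str(week)] = 1

# Assumption: weeks in which a product was not listed should be 0, not NaN.
week_cols = [c for c in df.columns if c.startswith('job_week_')]
df[week_cols] = df[week_cols].fillna(0).astype(int)
```

Each new job_week_N column is created the first time it is assigned; rows that never get a 1 in that column stay NaN until the fillna step.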

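You can modify existing values through row in this example, because all columns share one dtype and row can be a view of the data. This is not guaranteed: the pandas documentation warns that iterrows may return a copy, in which case writing to row has no effect: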

In [31]:
# reset the df by slicing
df = df[list('abc')]
for index,row in df.iterrows():
    row['b'] = np.random.randint(0, 10)
df
Out[31]:
          a  b         c
0 -1.525011  8 -1.010391
1  0.619824  2 -0.692568
2  1.272323  8  0.192169
3  0.193523  2  1.067544
4  0.057110  3  1.706704
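
Whether a write through row reaches df depends on the column dtypes and the pandas version, so it seems safer not to rely on it. A minimal sketch, using a hypothetical two-column frame with mixed dtypes (not from the question), where the write through row is lost and the write through df.loc is kept:

```python
import pandas as pd

# Hypothetical frame with mixed dtypes (strings and integers).
df = pd.DataFrame({'name': ['x', 'y', 'z'], 'b': [1, 2, 3]})

for index, row in df.iterrows():
    # With mixed dtypes, row is a freshly built Series, so this write never reaches df.
    row['b'] = 100

unchanged = df['b'].tolist()  # still the original values

for index, row in df.iterrows():
    # Writing through df.loc updates the DataFrame itself.
    df.loc[index, 'b'] = row['b'] * 10

updated = df['b'].tolist()
```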

But adding a new column using row won't work, because row is a separate Series and assigning a new label only extends that Series, not df:

In [35]:

df = df[list('abc')]
for index,row in df.iterrows():
    row['d'] = np.random.randint(0,10)
df
Out[35]:
          a  b         c
0 -1.525011  8 -1.010391
1  0.619824  2 -0.692568
2  1.272323  8  0.192169
3  0.193523  2  1.067544
4  0.057110  3  1.706704
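
For the goal stated in the question (counting the alternative products listed in each week), a loop over iterrows may not be needed at all. The following is only a sketch of one possible approach, not part of the original answer. It assumes that a product is listed in the weeks from Close_date_wk - week_diff up to, but not including, Close_date_wk, as in the asker's range() call. The names listed_weeks, listed and alternatives are made up for illustration.

```python
import pandas as pd

# Sample data from the question.
mydata = [
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
]
df = pd.DataFrame(mydata)

# One (product, week) pair for every week in which a product was listed.
pairs = [
    (row.subid, week)
    for row in df.itertuples(index=False)
    for week in range(row.Close_date_wk - row.week_diff, row.Close_date_wk)
]
listed_weeks = pd.DataFrame(pairs, columns=['subid', 'week'])

# 0/1 matrix: one row per product, one column per week.
listed = pd.crosstab(listed_weeks['subid'], listed_weeks['week'])

# Other products listed in the same week, kept only for weeks in which
# the product itself was listed (all other cells stay 0).
alternatives = listed.mul(listed.sum(axis=0) - 1, axis='columns')
```

Here listed plays the role of the job_week_N columns, and alternatives holds the per-week counts. Note that a 0 in alternatives can mean either "listed with no alternatives" or "not listed that week"; compare against listed if the distinction matters.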