Iterating over columns and rows in a Pandas DataFrame

Question

Asked by Notna

Say I have a dataframe that looks like:

import pandas as pd

d = {'option1': ['1', '0', '1', '1'], 'option2': ['0', '0', '1', '0'], 'option3': ['1', '1', '0', '0'], 'views': ['6', '10', '5', '2']}
df = pd.DataFrame(data=d)

print(df)

  option1 option2 option3 views
0       1       0       1     6
1       0       0       1    10
2       1       1       0     5
3       1       0       0     2

I'm trying to build a for loop that iterates over each column (except the column "views") and each row. If the value of a cell is not 0, I want to replace it with the corresponding value of the column "views" from the same row.

The following output is required (should be easier to understand):

  option1 option2 option3 views
0       6       0       6     6
1       0       0      10    10
2       5       5       0     5
3       2       0       0     2

I tried something like:

df_range = len(df)

for column in df:
    for i in range(df_range):
        if column != 0:
            column = df.views[i]

But I know I'm missing something; it does not work.

Also please note that in my real dataframe, I have dozens of columns, so I need something that iterates over each column automatically. Thanks!!

I saw the thread "Update a dataframe in pandas while iterating row by row", but it doesn't exactly apply to my problem, because I'm not only going row by row, I also need to go column by column.

Answer 1

Accepted answer by Keith Dowd

You can also achieve the result you want this way:

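# Assumes the values are integers, e.g. after df = df.astype(int)
for col in df:
    if col == 'views':
        continue
    # .items() replaces .iteritems(), which was removed in pandas 2.0
    for i, row_value in df[col].items():
        # .loc avoids chained assignment (df[col][i] = ...)
        df.loc[i, col] = row_value * df.loc[i, 'views']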

Notice the following about this solution:

1) This solution operates on each value in the dataframe individually and so is less efficient than broadcasting, because it's performing two loops (one outer, one inner).

2) This solution assumes that option1...option N are binary because essentially this solution is multiplying each binary value in option1...option N with the values in views.

3) This solution will work for any number of option columns. The option columns may have any labels you desire.

4) This solution assumes there is a column labeled views.
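
For comparison with note 1, here is a minimal sketch of the broadcasting alternative, assuming the same binary option columns and a column labeled views. The name `option_cols` is introduced only for this illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data from the question, converted to integers up front.
df = pd.DataFrame({
    'option1': ['1', '0', '1', '1'],
    'option2': ['0', '0', '1', '0'],
    'option3': ['1', '1', '0', '0'],
    'views': ['6', '10', '5', '2'],
}).astype(int)

# Every column label except 'views' (option_cols is a name chosen for this sketch).
option_cols = df.columns.drop('views')

# Multiply each option column by 'views', aligning on the row index (axis=0).
df[option_cols] = df[option_cols].mul(df['views'], axis=0)

print(df)
```

This replaces both Python-level loops with a single column-wise multiplication, and like the loop it relies on the option values being 0 or 1.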

Answer 2

Answer by YOLO

You don't need to iterate through the rows; working column by column should be faster. First make sure the column values are integers.

## convert column type to integer
for i in df:
    df[i] = df[i].astype(int)

## update columns
for col in df:
    if col != 'views':
        df[col] = df[col] * df['views']

df

   option1  option2  option3  views
0        6        0        6      6
1        0        0       10     10
2        5        5        0      5
3        2        0        0      2
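
The per-column conversion loop can probably be collapsed into a single call. The sketch below assumes every column holds strings that parse cleanly as integers, as in the question's sample data; it uses a two-column subset of that data to keep the example short.

```python
import pandas as pd

# A two-column subset of the question's data, stored as strings.
df = pd.DataFrame({
    'option1': ['1', '0', '1', '1'],
    'views': ['6', '10', '5', '2'],
})

# Convert every column to integers in one call instead of looping.
df = df.astype(int)
print(df.dtypes)

# With integer columns, the column-wise multiplication is plain arithmetic.
df['option1'] = df['option1'] * df['views']
print(df)
```

If some column contained non-numeric text, `astype(int)` would raise an error, so real data may need cleaning first.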

Answer 3

Answer by luqman ahmad

dataSet = pd.read_excel("dataset.xlsx")
for column in dataSet:
    # .items() replaces .iteritems(), which was removed in pandas 2.0
    for i in dataSet[column].items():
        if column == 'views':
            print(i)  # i is an (index, value) tuple

Answer 4

Answer by luqman ahmad

I think this would work:

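import numpy as np
df = df.astype(int)  # assumes 'views' is the last column
df[df.columns[:-1]] = np.where(df[df.columns[:-1]] > 0, 1, 0)
# to_numpy() replaces as_matrix(), which was removed in pandas 1.0
df[df.columns[:-1]] = df[df.columns[:-1]].mul(df['views'].to_numpy(), axis=0)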
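
A variation on the same idea, sketched under two assumptions that differ from the snippet above: the option values may be non-binary, and views need not be the last column. The sample values and the names `option_cols`, `values`, and `views` are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'option1': [1, 0, 3, 1],   # 3 is a deliberately non-binary value
    'option2': [0, 0, 1, 0],
    'views': [6, 10, 5, 2],    # 'views' is intentionally not the last column
    'option3': [1, 1, 0, 0],
})

# Select the option columns by name rather than by position.
option_cols = [c for c in df.columns if c != 'views']

values = df[option_cols].to_numpy()
views = df[['views']].to_numpy()   # shape (n_rows, 1), broadcasts across columns

# Wherever a cell is non-zero, take that row's 'views' value; otherwise keep 0.
df[option_cols] = np.where(values != 0, views, 0)

print(df)
```

Using `np.where` directly matches the question's wording ("if the value of a cell is not 0") without first reducing the columns to 0 and 1.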
