Add columns to a data frame calculated by for loops in Python

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/36830232/

Date: 2020-08-19 18:24:46  Source: igfitidea

Tags: python, for-loop, pandas

Asked by anitasp

import re
# Create several new columns with a for loop and add them to the original df.
# Build pairwise products as a second level of binary variables for df.
# (df and list_ib, the list of binary column names, are defined elsewhere.)
for i in list_ib:
    for j in list_ib:
        if i == j:
            break
        else:
            bina = df[i]*df[j]
            print(i,j)

Here i and j both range over the same binary columns of a data frame (df). I have multiplied each column by each of the other columns. My question now is: how do I add all of the new binary product columns to the original df?

I have tried:

df = df + df[i,j,bina]

but I am not getting the results I need. Any suggestions?

Answered by Thanos

As I understand it, i, j, and bina are not part of your df. Build an array for each of them, with each array element representing a 'row'. Once you have all rows for i, j, and bina ready, you can concatenate like this:

>>> new_df = pd.DataFrame(data={'i':i, 'j':j, 'bina':bina}, columns=['i','j','bina'])
>>> pd.concat([df, new_df], axis=1)
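The concatenation approach can be sketched for the pairwise products from the question. This is only an illustration under assumptions: the binary columns a, b, and c, the dictionary name products, and the "_x_" naming scheme are all made up for the example and do not come from the original post.

```python
import pandas as pd

# Hypothetical binary columns, used only for illustration.
df = pd.DataFrame({"a": [1, 0, 1], "b": [1, 1, 0], "c": [0, 1, 1]})
list_ib = list(df.columns)

# Collect each pairwise product under a descriptive (made-up) column name.
products = {}
for pos, i in enumerate(list_ib):
    for j in list_ib[:pos]:  # only pairs where j comes before i
        products[i + "_x_" + j] = df[i] * df[j]

# Turn the collected Series into a frame and attach it column-wise.
new_df = pd.DataFrame(products)
result = pd.concat([df, new_df], axis=1)
print(result)
```

Because every product is derived from df itself, new_df shares its index with df, so the column-wise concat lines the rows up without introducing missing values.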

Alternatively, once you have collected all the data for 'i', 'j', and 'bina', and assuming the data for each is in a separate array, you can do this:

>>> df['i'] = i
>>> df['j'] = j
>>> df['bina'] = bina

This works only if each of the three arrays has exactly as many elements as the DataFrame df has rows.
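The length requirement can be checked with a small sketch. The frame and the column names ok and too_short are hypothetical, chosen only to show a matching and a mismatching assignment; the exact wording of the error may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 0, 1]})  # three rows

# Three values for three rows: the assignment succeeds.
df["ok"] = [1, 1, 0]

# Two values for three rows: pandas refuses the assignment.
try:
    df["too_short"] = [1, 0]
except ValueError as exc:
    error_message = str(exc)
    print("Assignment failed:", error_message)
```

After the failed assignment, df still contains only the columns a and ok.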

I hope this helps!

Answered by Matt Messersmith

Typically you add columns to a DataFrame using its built-in __setitem__() method, which you can access with []. For example:

import pandas as pd

df = pd.DataFrame()

df["one"] = [1, 1, 1]
df["two"] = [2, 2, 2]
df["three"] = [3, 3, 3]

print(df)

# Output:
#    one  two  three
# 0    1    2      3
# 1    1    2      3
# 2    1    2      3

list_ib = df.columns.values

for i in list_ib:
    for j in list_ib:
        if i == j:
            break
        else:
            bina = df[i] * df[j]
            # Add a new column holding the product of columns i and j
            df['bina_' + str(i) + '_' + str(j)] = bina

print(df)

# Output:
#    one  two  three  bina_two_one  bina_three_one  bina_three_two
# 0    1    2      3             2               3               6
# 1    1    2      3             2               3               6
# 2    1    2      3             2               3               6
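As a possible variation (not part of the original answer), the nested loop with break can be swapped for itertools.combinations, which yields each unordered pair of column names exactly once. The sketch below assumes the same three-column frame as above; the names products and result are illustrative.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"one": [1, 1, 1], "two": [2, 2, 2], "three": [3, 3, 3]})

# combinations() yields each unordered pair of column names exactly once.
products = {
    "bina_" + i + "_" + j: df[i] * df[j]
    for i, j in combinations(df.columns, 2)
}

# Build all new columns first, then attach them in a single concat.
result = pd.concat([df, pd.DataFrame(products)], axis=1)
print(result)
```

Note that the pairs come out with the earlier column first, so the new columns are named bina_one_two, bina_one_three, and bina_two_three rather than bina_two_one and so on. Building the products separately also avoids adding columns to df while working from its column list.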