Pandas/Python: adding a row based on a condition

Question

Asked by Mike

I am looking to insert a row into a dataframe between two existing rows based on certain criteria.

For example, my data frame:

    import pandas as pd
    df = pd.DataFrame({'Col1':['A','B','D','E'],'Col2':['B', 'C', 'E', 'F'], 'Col3':['1', '1', '1', '1']})

Which looks like:

    Col1    Col2    Col3
  0 A       B       1
  1 B       C       1
  2 D       E       1
  3 E       F       1

I want to be able to insert a new row between Index 1 and Index 2 given the condition:

    n = 0
    while n < len(df) - 1:
        # a gap exists when this row's Col2 does not match the next row's Col1
        if df.iloc[n]['Col2'] != df.iloc[n + 1]['Col1']:
            pass  # something, something, insert row
        n += 1

My desired output table would look like:

    Col1    Col2    Col3
  0 A       B       1
  1 B       C       1
  2 C       D       1
  3 D       E       1
  4 E       F       1

I am struggling with conditionally inserting rows based on values in the preceding and following records. Ultimately I want to perform the above exercise on my real-world data, which would involve multiple conditions and preserving the values of more than one column (in this example it is only Col3, but in my real data it would be several columns).

Answer 1

Answered by MaxU

UPDATE: a memory-saving method is to first set a new index that leaves a gap for the new row. Starting from the original DF:

In [30]: df
Out[30]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    D    E    1
3    E    F    1

If we want to insert a new row between indexes 1 and 2, we split the index at position 2:

In [31]: idxs = [df.index[:2], df.index[2:]]

Set a new index (with a gap at position 2):

In [32]: df.set_index(idxs[0].union(idxs[1] + 1), inplace=True)

In [33]: df
Out[33]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
3    D    E    1
4    E    F    1

Insert the new row with index 2:

In [34]: df.loc[2] = ['X','X',2]

In [35]: df
Out[35]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
3    D    E    1
4    E    F    1
2    X    X    2

Sort the index:

In [38]: df.sort_index(inplace=True)

In [39]: df
Out[39]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    X    X    2
3    D    E    1
4    E    F    1
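
The steps above can be wrapped in a small helper. This is only a sketch: the name `insert_row` and its parameters are hypothetical, and it assumes the DataFrame has a default 0..n-1 integer index, as in the example.

```python
import pandas as pd

def insert_row(df, pos, values):
    """Return a copy of df with `values` inserted as a new row at position `pos`.

    Assumes df has a default 0..n-1 integer index.
    """
    out = df.copy()
    head, tail = out.index[:pos], out.index[pos:]
    # Shift the labels at and after `pos` up by one to leave a gap.
    out.index = head.union(tail + 1)
    # Fill the gap; .loc adds the row at the end, so sort afterwards.
    out.loc[pos] = values
    return out.sort_index()

df = pd.DataFrame({'Col1': ['A', 'B', 'D', 'E'],
                   'Col2': ['B', 'C', 'E', 'F'],
                   'Col3': ['1', '1', '1', '1']})
result = insert_row(df, 2, ['C', 'D', '1'])
print(result)
```

Working on a copy keeps the caller's DataFrame untouched, at the cost of the memory saving that the in-place version is aiming for.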

PS: you can also insert a whole DataFrame instead of a single row, by appending it and then sorting the index:

In [42]: df
Out[42]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    D    E    1
3    E    F    1

In [43]: idxs = [df.index[:2], df.index[2:]]

In [45]: new_df = pd.DataFrame([['X', 'X', 10], ['Y','Y',11]], columns=df.columns)

In [49]: new_df.index += idxs[1].min()

In [51]: new_df
Out[51]:
  Col1 Col2  Col3
2    X    X    10
3    Y    Y    11

In [52]: df = df.set_index(idxs[0].union(idxs[1]+len(new_df)))

In [53]: df
Out[53]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
4    D    E    1
5    E    F    1

In [54]: df = pd.concat([df, new_df])

In [55]: df
Out[55]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
4    D    E    1
5    E    F    1
2    X    X   10
3    Y    Y   11

In [56]: df.sort_index(inplace=True)

In [57]: df
Out[57]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    X    X   10
3    Y    Y   11
4    D    E    1
5    E    F    1
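
The multi-row variant can be sketched the same way. The helper name `insert_rows` is hypothetical, and the sketch assumes a default 0..n-1 integer index and that `new_df` has the same columns as `df`.

```python
import pandas as pd

def insert_rows(df, pos, new_df):
    """Return df with the rows of new_df inserted starting at position `pos`.

    Assumes df has a default 0..n-1 integer index and matching columns.
    """
    shifted = df.copy()
    head, tail = shifted.index[:pos], shifted.index[pos:]
    # Open a gap as wide as the block being inserted.
    shifted.index = head.union(tail + len(new_df))
    block = new_df.copy()
    # Label the new rows so they fall exactly into the gap.
    block.index = range(pos, pos + len(block))
    return pd.concat([shifted, block]).sort_index()

df = pd.DataFrame({'Col1': ['A', 'B', 'D', 'E'],
                   'Col2': ['B', 'C', 'E', 'F'],
                   'Col3': [1, 1, 1, 1]})
new_df = pd.DataFrame([['X', 'X', 10], ['Y', 'Y', 11]], columns=df.columns)
result = insert_rows(df, 2, new_df)
print(result)
```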

OLD answer:

One way (among many) to achieve that is to split your DF and concatenate the pieces with the new DF in the desired order:

Original DF:

In [12]: df
Out[12]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    D    E    1
3    E    F    1

Let's split it into two parts (rows 0-1 and rows 2 to the end):

In [13]: dfs = [df.iloc[:2], df.iloc[2:]]

In [14]: dfs
Out[14]:
[  Col1 Col2 Col3
 0    A    B    1
 1    B    C    1,   Col1 Col2 Col3
 2    D    E    1
 3    E    F    1]

Now we can concatenate them with a new DF in the desired order:

In [15]: pd.concat([dfs[0], pd.DataFrame([['C','D', 1]], columns=df.columns), dfs[1]], ignore_index=True)
Out[15]:
  Col1 Col2 Col3
0    A    B    1
1    B    C    1
2    C    D    1
3    D    E    1
4    E    F    1
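
The answer inserts a row at a known position. To find the positions from the condition in the question (a row's Col2 differs from the next row's Col1), one possible sketch is below. The function name `fill_gaps` is hypothetical, and the rule that the bridging row copies its remaining columns (such as Col3) from the preceding row is an assumption, not something stated in the question.

```python
import pandas as pd

def fill_gaps(df, start='Col1', end='Col2'):
    """Insert a bridging row wherever a row's `end` differs from the next row's `start`."""
    if df.empty:
        return df.copy()
    pieces = []
    for i in range(len(df)):
        row = df.iloc[[i]]                 # one-row DataFrame
        pieces.append(row)
        if i + 1 < len(df):
            cur_end = df[end].iloc[i]
            next_start = df[start].iloc[i + 1]
            if cur_end != next_start:
                # Assumption: other columns are carried over from the preceding row.
                bridge = row.copy()
                bridge[start] = cur_end
                bridge[end] = next_start
                pieces.append(bridge)
    # A single concat at the end avoids growing the DataFrame row by row.
    return pd.concat(pieces, ignore_index=True)

df = pd.DataFrame({'Col1': ['A', 'B', 'D', 'E'],
                   'Col2': ['B', 'C', 'E', 'F'],
                   'Col3': ['1', '1', '1', '1']})
result = fill_gaps(df)
print(result)
```

Further conditions can go into the `if` test, and because the bridging row starts as a copy of the preceding row, any number of extra columns is preserved without listing them.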
