Python: inserting a row into a Pandas DataFrame

Question

Asked by Meloun

I have a dataframe:

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

   A  B  C
0  5  6  7
1  7  8  9

[2 rows x 3 columns]

and I need to add a first row [2, 3, 4] to get:

   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

I've tried the append() and concat() functions but can't find the right way to do that.

How do I add/insert a series into a dataframe?

Answer 1

Answered by FooBar

One way to achieve this is:

>>> pd.DataFrame(np.array([[2, 3, 4]]), columns=['A', 'B', 'C']).append(df, ignore_index=True)
Out[330]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Generally, it's easiest to append dataframes, not series. In your case, since you want the new row to be "on top" (with starting id), and there is no function pd.prepend(), I first create the new dataframe and then append your old one.

ignore_index will discard the existing index of your dataframe and ensure that its first row actually ends up at index 1 instead of restarting at index 0.
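
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same prepend can be sketched with pd.concat. This is only a minimal sketch, assuming the df from the question; the name new_row is illustrative and not part of the original answer.

```python
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# Hypothetical name: a one-row dataframe holding the row to prepend.
new_row = pd.DataFrame([[2, 3, 4]], columns=df.columns)

# Put the new row first; ignore_index=True renumbers the result 0, 1, 2.
result = pd.concat([new_row, df], ignore_index=True)
print(result)
```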

Typical disclaimer: Ceterum censeo ... appending rows is quite an inefficient operation. If you care about performance and can somehow ensure that you first create a dataframe with the correct (longer) index and then just insert the additional row into it, you should definitely do that. See:

>>> index = np.array([0, 1, 2])
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[0:1] = [list(s1), list(s2)]
>>> df2
Out[336]: 
     A    B    C
0    5    6    7
1    7    8    9
2  NaN  NaN  NaN
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[1:] = [list(s1), list(s2)]

So far, we have what you had as df:

>>> df2
Out[339]: 
     A    B    C
0  NaN  NaN  NaN
1    5    6    7
2    7    8    9

But now you can easily insert the row as follows. Since the space was preallocated, this is more efficient.

>>> df2.loc[0] = np.array([2, 3, 4])
>>> df2
Out[341]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9
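
When all rows are known before the dataframe is needed, a related way to avoid repeated appends is to collect the rows in a plain Python list and build the dataframe once. This is a different technique from the preallocation shown in this answer, sketched here only as an illustration; the names rows and df_all are assumptions.

```python
import pandas as pd

# Hypothetical example data: the new first row, then the original rows.
rows = [[2, 3, 4]]
rows.extend([[5, 6, 7], [7, 8, 9]])

# Build the dataframe in a single call instead of growing it row by row.
df_all = pd.DataFrame(rows, columns=["A", "B", "C"])
print(df_all)
```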

Answer 2

Answered by Piotr Migdal

Just assign the row to a particular index label, using loc:

df.loc[-1] = [2, 3, 4]  # adding a row
df.index = df.index + 1  # shifting index
df = df.sort_index()  # sorting by index

And you get, as desired:

   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

See "Setting with enlargement" in the Indexing section of the Pandas documentation.

Answer 3

Answered by mgilbert

Not sure how you were calling concat(), but it should work as long as both objects are of the same type. Maybe the issue is that you need to cast your second vector to a dataframe? Using the df that you defined, the following works for me:

df2 = pd.DataFrame([[2,3,4]], columns=['A','B','C'])
pd.concat([df2, df])
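
If the new row starts out as a Series, as in the question, one possible way to do the cast mentioned here is to_frame() followed by a transpose. This is a minimal sketch; the Series s and its index labels are assumptions chosen to match the columns of df. Note that a plain pd.concat([df2, df]) keeps both original index labels, so the result has the label 0 twice; ignore_index=True renumbers the rows.

```python
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# Hypothetical Series whose index labels match the dataframe's columns.
s = pd.Series([2, 3, 4], index=["A", "B", "C"])

# to_frame() gives a single column; .T turns it into a single row.
row = s.to_frame().T

# ignore_index=True avoids the duplicated index label 0 in the result.
result = pd.concat([row, df], ignore_index=True)
print(result)
```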

Answer 4

Answered by elPastor

I put together a short function that allows for a little more flexibility when inserting a row:

def insert_row(idx, df, df_insert):
    dfA = df.iloc[:idx, ]
    dfB = df.iloc[idx:, ]

    df = dfA.append(df_insert).append(dfB).reset_index(drop = True)

    return df

which could be further shortened to:

def insert_row(idx, df, df_insert):
    return df.iloc[:idx, ].append(df_insert).append(df.iloc[idx:, ]).reset_index(drop = True)

Then you could use something like:

df = insert_row(2, df, df_new)

where 2 is the index position in df where you want to insert df_new.
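
On pandas 2.0 and later, where DataFrame.append is no longer available, the same idea can be sketched with pd.concat. This is a hedged adaptation rather than the answer's original code; the example data and df_new are assumptions for illustration.

```python
import pandas as pd

def insert_row(idx, df, df_insert):
    """Return a new dataframe with df_insert placed before position idx."""
    # Split at the position, put the new rows in between, then renumber.
    parts = [df.iloc[:idx], df_insert, df.iloc[idx:]]
    return pd.concat(parts).reset_index(drop=True)

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])
# Hypothetical one-row dataframe to insert.
df_new = pd.DataFrame([[2, 3, 4]], columns=["A", "B", "C"])

print(insert_row(1, df, df_new))
```

With idx=0 the new row lands on top, which is the case asked about in the question.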

Answer 5

Answered by Tai

We can use numpy.insert. This has the advantage of flexibility. You only need to specify the index you want to insert to.

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0))

    0   1   2
0   2   3   4
1   5   6   7
2   7   8   9

In np.insert(df.values, 0, values=[2, 3, 4], axis=0), the 0 tells the function the position (row index) before which the new values are placed.
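
The result shown in this answer loses the column labels A, B, C because the dataframe is rebuilt from a bare array. A small variation, sketched here under the assumption that the original labels should be kept, passes df.columns back in; the names arr and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# Insert the new values before row position 0 of the underlying array.
arr = np.insert(df.to_numpy(), 0, values=[2, 3, 4], axis=0)

# Pass the original columns so the labels A, B, C are kept.
result = pd.DataFrame(arr, columns=df.columns)
print(result)
```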

Answer 6

Answered by Sagar Rathod

Below would be the best way to insert a row into a pandas dataframe without sorting and resetting the index:

import pandas as pd

df = pd.DataFrame(columns=['a','b','c'])

def insert(df, row):
    insert_loc = df.index.max()

    if pd.isna(insert_loc):
        df.loc[0] = row
    else:
        df.loc[insert_loc + 1] = row

insert(df,[2,3,4])
insert(df,[8,9,0])
print(df)

Answer 7

Answered by Aaron Melgar

This might seem overly simple, but it's incredible that a simple insert-new-row function isn't built in. I've read a lot about appending a new df to the original, but I'm wondering if this would be faster.

df.loc[0] = [row1data, blah...]
i = len(df) + 1
df.loc[i] = [row2data, blah...]

Answer 8

Answered by Xinyi Li

You can simply append the row to the end of the DataFrame, and then adjust the index.

For instance:

df = df.append(pd.DataFrame([[2,3,4]],columns=df.columns),ignore_index=True)
df.index = (df.index + 1) % len(df)
df = df.sort_index()

Or use concat:

df = pd.concat([pd.DataFrame([[2,3,4]],columns=df.columns),df],ignore_index=True)

Answer 9

Answered by Pepe

It is pretty simple to add a row into a pandas DataFrame:

Create a regular Python dictionary with the same column names as your DataFrame;
Use the DataFrame.append() method and pass in the name of your dictionary;
Add ignore_index=True right after your dictionary name.
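
The steps above can be sketched as follows. On pandas versions before 2.0 the call would be df.append(new_row, ignore_index=True); since DataFrame.append was removed in pandas 2.0, this sketch swaps in pd.concat on a one-row dataframe built from the dictionary. The data and the name new_row are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# Step 1: a regular dictionary keyed by the dataframe's column names.
new_row = {"A": 2, "B": 3, "C": 4}

# Steps 2-3: wrap the dictionary in a one-row dataframe, add it at the end,
# and renumber the rows with ignore_index=True.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df)
```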

Answer 10

Answered by Pepe

The simplest way to add a row to a pandas data frame is:

DataFrame.loc[ location of insertion ]= list( )

Example:

DF.loc[9] = ['Pepe', 33, 'Japan']

NB: the length of your list should match the number of columns in the data frame.
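
A small runnable illustration of the length rule, assuming a hypothetical three-column data frame (the column names and the first row are made up for this sketch):

```python
import pandas as pd

# Hypothetical data frame with three columns.
DF = pd.DataFrame([["Ana", 28, "Peru"]], columns=["name", "age", "country"])

# Three values for three columns: a row labelled 9 is added.
DF.loc[9] = ["Pepe", 33, "Japan"]

# Two values for three columns: pandas refuses the assignment.
try:
    DF.loc[10] = ["Pepe", 33]
except ValueError as err:
    print("rejected:", err)

print(DF)
```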
