Python: Add a new row with a specific index name to a Pandas DataFrame

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/46621712/

Date: 2020-08-19 17:44:04  Source: igfitidea

Add a new row to a Pandas DataFrame with specific index name

Tags: python, pandas, dataframe

Asked by samba

I'm trying to add a new row to the DataFrame with a specific index name 'e'.

    number   variable       values
a    NaN       bank          true   
b    3.0       shop          false  
c    0.5       market        true   
d    NaN       government    true   

I have tried the following but it's creating a new column instead of a new row.

new_row = [1.0, 'hotel', 'true']
df = df.append(new_row)

I still don't understand how to insert a row with a specific index label. I'd be grateful for any suggestions.

Answered by MaxU

You can use df.loc[_not_yet_existing_index_label_] = new_row.

Demo:

In [3]: df.loc['e'] = [1.0, 'hotel', 'true']

In [4]: df
Out[4]:
   number    variable values
a     NaN        bank   True
b     3.0        shop  False
c     0.5      market   True
d     NaN  government   True
e     1.0       hotel   true

PS: With this method you can't add a second row under an index label that already exists. If the label is already present, the existing row with that label is overwritten instead.
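The add-versus-overwrite behaviour described above can be checked with a small sketch. The frame below is rebuilt by hand from the question's table, and the second row of values ('hostel') is made up purely for illustration.

```python
import pandas as pd

# Rebuild the question's frame (values copied from the table in the question).
df = pd.DataFrame(
    {
        "number": [float("nan"), 3.0, 0.5, float("nan")],
        "variable": ["bank", "shop", "market", "government"],
        "values": [True, False, True, True],
    },
    index=list("abcd"),
)

# 'e' is not in the index yet, so this assignment enlarges the frame by one row.
df.loc["e"] = [1.0, "hotel", True]
rows_after_add = len(df)

# 'e' now exists, so the same syntax overwrites that row instead of adding one.
# The replacement values here are hypothetical.
df.loc["e"] = [2.0, "hostel", False]
rows_after_update = len(df)

print(df)
```

If duplicate labels are actually wanted, this assignment form is not the tool for it, since it always targets the existing label.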

UPDATE:

This might not work in recent Pandas/Python3 if the index is a DatetimeIndex and the new row's index label doesn't exist yet.

It will work if we specify the correct index value(s), for example a proper datetime rather than a plain string.

Demo (using pandas: 0.23.4):

In [17]: ix = pd.date_range('2018-11-10 00:00:00', periods=4, freq='30min')

In [18]: df = pd.DataFrame(np.random.randint(100, size=(4,3)), columns=list('abc'), index=ix)

In [19]: df
Out[19]:
                      a   b   c
2018-11-10 00:00:00  77  64  90
2018-11-10 00:30:00   9  39  26
2018-11-10 01:00:00  63  93  72
2018-11-10 01:30:00  59  75  37

In [20]: df.loc[pd.to_datetime('2018-11-10 02:00:00')] = [100,100,100]

In [21]: df
Out[21]:
                       a    b    c
2018-11-10 00:00:00   77   64   90
2018-11-10 00:30:00    9   39   26
2018-11-10 01:00:00   63   93   72
2018-11-10 01:30:00   59   75   37
2018-11-10 02:00:00  100  100  100

In [22]: df.index
Out[22]:
DatetimeIndex(['2018-11-10 00:00:00', '2018-11-10 00:30:00',
               '2018-11-10 01:00:00', '2018-11-10 01:30:00',
               '2018-11-10 02:00:00'],
              dtype='datetime64[ns]', freq=None)
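For a copy-paste version of the DatetimeIndex demo, here is a minimal sketch with fixed values in place of the random integers (the numbers are taken from the output shown above). Using pd.Timestamp for the new label is a choice made here; the demo's pd.to_datetime call on a single string should produce the same kind of object.

```python
import pandas as pd

# Same index as in the demo: four timestamps, 30 minutes apart.
ix = pd.date_range("2018-11-10 00:00:00", periods=4, freq="30min")
df = pd.DataFrame(
    {"a": [77, 9, 63, 59], "b": [64, 39, 93, 75], "c": [90, 26, 72, 37]},
    index=ix,
)

# Use a real timestamp object as the new label, not a plain string.
new_label = pd.Timestamp("2018-11-10 02:00:00")
df.loc[new_label] = [100, 100, 100]

print(df)
print(type(df.index).__name__)
```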

Answered by Bharath

If you want to add multiple rows at once, convert the list to a DataFrame and use append, i.e.:

df = df.append(pd.DataFrame([new_row],index=['e'],columns=df.columns))

Or, for a single row (thanks @Zero):

df = df.append(pd.Series(new_row, index=df.columns, name='e'))

Output:

   number    variable values
a     NaN        bank   True
b     3.0        shop  False
c     0.5      market   True
d     NaN  government   True
e     1.0       hotel   true
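Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the two append calls above only run on older versions. A minimal sketch of the same idea with pd.concat follows; the frame is rebuilt by hand from the question's table, and new_rows is a name introduced here for illustration.

```python
import pandas as pd

# Rebuild the question's frame (values copied from the table in the question).
df = pd.DataFrame(
    {
        "number": [float("nan"), 3.0, 0.5, float("nan")],
        "variable": ["bank", "shop", "market", "government"],
        "values": [True, False, True, True],
    },
    index=list("abcd"),
)

new_row = [1.0, "hotel", True]

# Wrap the list in a one-row DataFrame that carries the desired index label,
# then concatenate. More rows can be added by extending both lists.
new_rows = pd.DataFrame([new_row], index=["e"], columns=df.columns)
df = pd.concat([df, new_rows])

print(df)
```

Unlike the df.loc assignment, pd.concat does not overwrite an existing label; concatenating a row labelled 'a' would leave two rows with that label.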

Answered by Kim Miller

If the DataFrame is still empty and this is the first row you need to add:

import pandas as pd
df = pd.DataFrame(columns=['number', 'variable', 'values'])
df.loc['e', ['number', 'variable', 'values']] = [1.0, 'hotel', 'true']