Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Source: http://stackoverflow.com/questions/39998262/
Append an empty row in dataframe using pandas
Asked by Mansoor Akram
I am trying to append an empty row at the end of a dataframe but have been unable to do so. I have tried to understand how pandas works with the append function and still don't get it.
Here's the code:
import pandas as pd
excel_names = ["ARMANI+EMPORIO+AR0143-book.xlsx"]
excels = [pd.ExcelFile(name) for name in excel_names]
frames = [x.parse(x.sheet_names[0], header=None,index_col=None).dropna(how='all') for x in excels]
for f in frames:
    f.append(0, float('NaN'))
    f.append(2, float('NaN'))
There are two columns and a random number of rows.
With "print f" in the for loop I get this:
    0                          1
0   Brand Name                 Emporio Armani
2   Model number               AR0143
4   Part Number                AR0143
6   Item Shape                 Rectangular
8   Dial Window Material Type  Mineral
10  Display Type               Analogue
12  Clasp Type                 Buckle
14  Case Material              Stainless steel
16  Case Diameter              31 millimetres
18  Band Material              Leather
20  Band Length                Women's Standard
22  Band Colour                Black
24  Dial Colour                Black
26  Special Features           second-hand
28  Movement                   Quartz
Accepted answer by srcerer
Add a new pandas.Series using pandas.DataFrame.append().
If you wish to specify the name (also known as the "index") of the new row, use:
df.append(pandas.Series(name='NameOfNewRow'))
If you don't wish to name the new row, use:
df.append(pandas.Series(), ignore_index=True)
where df is your pandas.DataFrame.
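DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls above fail on current releases. The following is a minimal sketch of one possible equivalent, not the answer author's method: it assumes a small example frame (the columns a and b are made up for illustration) and uses DataFrame.reindex for a named row and pd.concat for an unnamed one.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names are assumptions for illustration.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Named empty row: reindex with one extra label; the new row is filled with NaN.
named = df.reindex(df.index.tolist() + ["NameOfNewRow"])

# Unnamed empty row: concatenate a one-row all-NaN frame and renumber the index.
empty_row = pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)
unnamed = pd.concat([df, empty_row], ignore_index=True)

print(named)
print(unnamed)
```

Both calls return a new frame and leave df unchanged, which was also true of append; the result has to be assigned to a variable to be kept.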
Answer by silent_dev
You can add it by appending a Series to the dataframe as follows. I am assuming that by blank you mean a row containing only NaN. First create a Series object filled with NaN, making sure to pass the column names as its index parameter. Then you can append it to the DataFrame. Hope it helps!
>>> from numpy import nan as Nan
>>> import pandas as pd
>>> df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
...                     'B': ['B0', 'B1', 'B2', 'B3'],
...                     'C': ['C0', 'C1', 'C2', 'C3'],
...                     'D': ['D0', 'D1', 'D2', 'D3']},
...                    index=[0, 1, 2, 3])
>>> s2 = pd.Series([Nan, Nan, Nan, Nan], index=['A', 'B', 'C', 'D'])
>>> # ignore_index=True is required because s2 has no name
>>> result = df1.append(s2, ignore_index=True)
>>> result
     A    B    C    D
0   A0   B0   C0   D0
1   A1   B1   C1   D1
2   A2   B2   C2   D2
3   A3   B3   C3   D3
4  NaN  NaN  NaN  NaN
Answer by pocketdora
You can add a new series, and name it at the same time. The name will be the index of the new row, and all the values will automatically be NaN.
df.append(pd.Series(name='Afterthought'))
Answer by kamal tanwar
The code below worked for me.
df.append(pd.Series([np.nan]), ignore_index = True)
Answer by Dave Reikher
Assuming df is your dataframe,
df_prime = pd.concat([df, pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)], ignore_index=True)
where df_prime equals df with an additional last row of NaNs.
Note that pd.concat is slow, so if you need this functionality in a loop, it's best to avoid using it. In that case, assuming your index is incremental, you can use
df.loc[df.iloc[-1].name + 1,:] = np.nan
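The two options in this answer can be put side by side. The sketch below is only an illustration under assumptions: the frame contents and the n_new variable are hypothetical, and all columns are float so that adding NaN does not change any dtype. It adds rows one at a time with label-based enlargement, and, as an alternative when the number of rows is known in advance, builds all empty rows first and calls pd.concat once.

```python
import numpy as np
import pandas as pd

n_new = 3  # hypothetical number of empty rows to add

# Variant 1: label-based enlargement, one row per iteration.
df_loop = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
for _ in range(n_new):
    df_loop.loc[df_loop.index[-1] + 1, :] = np.nan

# Variant 2: build all empty rows first, then call pd.concat a single time.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
empty_rows = pd.DataFrame(np.nan, index=range(n_new), columns=df.columns)
df_once = pd.concat([df, empty_rows], ignore_index=True)

print(df_loop)
print(df_once)
```

Variant 1 modifies the frame in place and relies on an integer index that increases by one; variant 2 returns a new frame and renumbers the index.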
Answer by Daniel R
Assuming your df.index is sorted you can use:
df.loc[df.index.max() + 1] = None
It handles different indexes and column types well.
[EDIT] It works with pd.DatetimeIndex if there is a constant frequency; otherwise we must specify the new index exactly, e.g.:
df.loc[df.index.max() + pd.Timedelta(milliseconds=1)] = None
Long example:
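# pd.date_range replaces the DatetimeIndex(start=..., freq=..., periods=...)
# constructor form, which was removed in pandas 1.0
df = pd.DataFrame([[pd.Timestamp(12432423), 23, 'text_field']],
                  columns=["timestamp", "speed", "text"],
                  index=pd.date_range(start='2111-11-11', freq='ms', periods=1))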
df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1 entries, 2111-11-11 to 2111-11-11
Freq: L
Data columns (total 3 columns):
timestamp 1 non-null datetime64[ns]
speed 1 non-null int64
text 1 non-null object
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 32.0+ bytes
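# On pandas < 1.0, df.index.max() + 1 advanced the timestamp by one frequency step;
# newer versions require adding the frequency (or a Timedelta) explicitly.
df.loc[df.index.max() + df.index.freq] = None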
df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2 entries, 2111-11-11 00:00:00 to 2111-11-11 00:00:00.001000
Data columns (total 3 columns):
timestamp 1 non-null datetime64[ns]
speed 1 non-null float64
text 1 non-null object
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 64.0+ bytes
df.head()
timestamp speed text
2111-11-11 00:00:00.000 1970-01-01 00:00:00.012432423 23.0 text_field
2111-11-11 00:00:00.001 NaT NaN NaN
Answer by Alberto Garcia
You can also use:
your_dataframe.insert(loc=0, value=np.nan, column="")
where loc is the position at which the new, empty entry is inserted. Note that DataFrame.insert adds a column at that position, not a row.
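To make the behavior of DataFrame.insert concrete, here is a small check on a made-up two-column frame (the column names a and b are assumptions for illustration). It shows that the call changes the frame in place and that the number of rows stays the same while a new all-NaN column appears at position loc.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# DataFrame.insert works in place and adds a column at position loc.
df.insert(loc=0, column="", value=np.nan)

print(df.shape)          # same number of rows, one more column
print(list(df.columns))  # the new column, named "", comes first
```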
Answer by Peter
Append an "empty" row to a data frame and fill selected cells:
Generate an empty data frame (no rows, just columns a and b):
import pandas as pd
col_names = ["a","b"]
df = pd.DataFrame(columns = col_names)
Append an empty row at the end of the data frame:
df = df.append(pd.Series(), ignore_index = True)
Now fill the empty cell at the end (len(df)-1) of the data frame in column a:
df.loc[[len(df)-1],'a'] = 123
Result:
a b
0 123 NaN
And of course one can iterate over the rows and fill cells:
col_names = ["a","b"]
df = pd.DataFrame(columns = col_names)
for x in range(0,5):
    df = df.append(pd.Series(), ignore_index = True)
    df.loc[[len(df)-1],'a'] = 123
Result:
a b
0 123 NaN
1 123 NaN
2 123 NaN
3 123 NaN
4 123 NaN
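Because DataFrame.append was removed in pandas 2.0, and because appending inside a loop copies the whole frame on every iteration, the same result can be reached by collecting the rows first and building the frame once. This is a sketch of that alternative, not the answer author's code; the rows list is a name introduced here for illustration.

```python
import pandas as pd

col_names = ["a", "b"]

# Collect one dict per row; keys that are left out become NaN in the frame.
rows = []
for x in range(0, 5):
    rows.append({"a": 123})

# Build the frame in one step, fixing the column order explicitly.
df = pd.DataFrame(rows, columns=col_names)

print(df)
```

Column b is never set, so every cell in it is missing, which matches the result table above.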