Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/22917108/

Date: 2020-09-13 21:54:17  Source: igfitidea

Appending row to Pandas DataFrame adds 0 column

Tags: python, pandas, append, dataframe

Asked by Gyan Veda

I'm creating a Pandas DataFrame to store data. Unfortunately, I can't know the number of rows of data that I'll have ahead of time. So my approach has been the following.

First, I declare an empty DataFrame.

df = DataFrame(columns=['col1', 'col2'])

Then, I append a row of missing values.

df = df.append([None] * 2, ignore_index=True)

Finally, I can insert values into this DataFrame one cell at a time. (Why I have to do this one cell at a time is a long story.)

df['col1'][0] = 3.28

This approach works perfectly fine, except that the append statement inserts an additional column into my DataFrame. At the end of the process, the output I see when I type df looks like this (with 100 rows of data).

<class 'pandas.core.frame.DataFrame'>
Data columns (total 2 columns):
0            0  non-null values
col1         100  non-null values
col2         100  non-null values

df.head() looks like this.

      0   col1   col2
0  None   3.28      1
1  None      1      0
2  None      1      0
3  None      1      0
4  None      1      1

Any thoughts on what is causing this 0 column to appear in my DataFrame?

Answered by Paul

The append is treating the list as a column to add to your DataFrame. That column is unnamed and holds two None/NaN elements, so pandas gives it the default name 0.
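To see where the 0 label comes from, it may help to look at how pandas reads a flat list on its own. This is a minimal sketch; the variable names `as_frame` and `as_row` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# A flat list is read as one unnamed column with one row per element.
as_frame = pd.DataFrame([None] * 2)
print(as_frame.shape)          # (2, 1)
print(list(as_frame.columns))  # [0]

# Wrapping the list in another list makes it a single row instead,
# and explicit column names replace the default 0 label.
as_row = pd.DataFrame([[None] * 2], columns=['col1', 'col2'])
print(as_row.shape)            # (1, 2)
print(list(as_row.columns))    # ['col1', 'col2']
```

The first frame is what ends up being combined with col1 and col2, which is why a third column named 0 shows up.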

For this to work, the column names of the data passed to append must match the column names of the existing DataFrame; otherwise new columns are created (by default).

# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so this code needs pandas < 2.0.
import numpy as np
from pandas import DataFrame, Series

# You need to explicitly name the columns of the incoming parameter in the append statement
df = DataFrame(columns=['col1', 'col2'])
print(df.append(Series([None] * 2, index=['col1', 'col2']), ignore_index=True))


# As an aside

df = DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
dfRowImproper = [1, 2, 3, 4]
# dfRowProper = DataFrame(np.arange(4) + 1, columns=['A', 'B', 'C', 'D'])  # will not work: arange returns a 1-D vector (4 rows, 1 column), but four column names are passed
dfRowProper = DataFrame([np.arange(4) + 1], columns=['A', 'B', 'C', 'D'])  # will work: a list holding one row gives 1 row, 4 columns


print(df.append(dfRowImproper))  # creates the column named 0, with 4 additional rows defined on this column

print(df.append(dfRowProper))  # works as you would like because the column names are consistent

print(df.append(DataFrame(np.random.randn(1, 4))))  # default column names 0..3 do not match, so four additional columns are defined, with 1 additional row


print(df.append(Series(dfRowImproper, index=['A', 'B', 'C', 'D']), ignore_index=True))  # works as you want
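
DataFrame.append was removed in pandas 2.0, so on current versions the same column-alignment behaviour is easier to try with pd.concat. The following is a rough sketch rather than part of the original answer; the frame contents and the names `matching`, `unnamed`, `aligned` and `misaligned` are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0], 'B': [3.0, 4.0]})

# Column names match: the new row lines up under A and B.
matching = pd.DataFrame([[5.0, 6.0]], columns=['A', 'B'])
aligned = pd.concat([df, matching], ignore_index=True)

# Column names are the defaults 0 and 1: concat keeps them as extra columns.
unnamed = pd.DataFrame([[5.0, 6.0]])
misaligned = pd.concat([df, unnamed], ignore_index=True)

print(aligned.shape)     # (3, 2)
print(misaligned.shape)  # (3, 4)
```

In the misaligned result, the original rows hold NaN under 0 and 1, and the new row holds NaN under A and B, which is the same symptom as the stray 0 column in the question.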

Answered by Toff'

You could use a Series for row insertion:

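import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])
# This Series is labelled 0 and 1; newer pandas may add those as columns unless index=df.columns is passed.
df = df.append(pd.Series([None]*2), ignore_index=True)
df["col1"][0] = 3.28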

df looks like:

   col1 col2
0  3.28  NaN
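
As a possible alternative that avoids append altogether (it is not available in pandas 2.0 and later), a cell can be written with a single .loc call; assigning to a row label that does not exist yet enlarges the frame. This is only a sketch of that idea under the question's column names, not something either answer proposed.

```python
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])

# Row label 0 does not exist yet, so this assignment adds the row
# and leaves the other column in that row as a missing value.
df.loc[0, 'col1'] = 3.28

print(list(df.columns))  # ['col1', 'col2']
print(len(df))           # 1
```

Because the write goes through one .loc call instead of df['col1'][0], it also sidesteps chained assignment.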