pandas Python - Pandas - Append a Series to a blank DataFrame

Question

Asked by bill999

Say I have two pandas Series in python:

import pandas as pd
h = pd.Series(['g',4,2,1,1])
g = pd.Series([1,6,5,4,"abc"])

I can create a DataFrame with just h and then append g to it:

df = pd.DataFrame([h])
df1 = df.append(g, ignore_index=True)

I get:

>>> df1
   0  1  2  3    4
0  g  4  2  1    1
1  1  6  5  4  abc
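
Note for readers on newer pandas: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the calls above raise an AttributeError there. Here is a minimal sketch of the same row append using `pd.concat`. It is not part of the original question, and turning the Series into a one-row frame with `to_frame().T` is just one way to do it:

```python
import pandas as pd

h = pd.Series(['g', 4, 2, 1, 1])
g = pd.Series([1, 6, 5, 4, "abc"])

# Start with a one-row DataFrame built from h, as in the question
df = pd.DataFrame([h])

# to_frame() gives a single column; .T transposes it into a single row
g_row = g.to_frame().T

# Stack the two frames and renumber the rows from 0
df1 = pd.concat([df, g_row], ignore_index=True)
print(df1)
```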

But now suppose that I have an empty DataFrame and I try to append h to it:

df2 = pd.DataFrame([])
df3 = df2.append(h, ignore_index=True)

This does not work. I think the problem is in the second-to-last line of code. I need to somehow define the blank DataFrame to have the proper number of columns.

By the way, the reason I am trying to do this is that I am scraping text from the internet using requests+BeautifulSoup and I am processing it and trying to write it to a DataFrame one row at a time.
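
For the row-at-a-time scraping case, one common alternative is to collect the rows in a plain Python list and build the DataFrame once at the end, which avoids needing an empty DataFrame at all. A rough sketch, where `scraped_rows` is a hypothetical stand-in for whatever the requests+BeautifulSoup processing produces:

```python
import pandas as pd

# Hypothetical placeholder for rows produced by the scraping step
scraped_rows = [
    ['g', 4, 2, 1, 1],
    [1, 6, 5, 4, 'abc'],
]

rows = []
for row in scraped_rows:
    # Any per-row processing would go here; this sketch keeps the row as-is
    rows.append(row)

# Build the DataFrame in one step once all rows are collected
df = pd.DataFrame(rows)
print(df)
```

Growing a list and constructing the frame once also avoids copying the whole DataFrame on every added row.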

Answer 1

Answered by EdChum

So if you don't pass an empty list to the DataFrame constructor then it works:

In [16]:

df = pd.DataFrame()
h = pd.Series(['g',4,2,1,1])
df = df.append(h,ignore_index=True)
df
Out[16]:
   0  1  2  3  4
0  g  4  2  1  1

[1 rows x 5 columns]

The difference between the two constructor approaches appears to be that the index dtypes are set differently: with an empty list the index dtype is int64, with nothing it is object:

In [21]:

df = pd.DataFrame()
print(df.index.dtype)
df = pd.DataFrame([])
print(df.index.dtype)
object
int64
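
To repeat this comparison on your own install, something like the following sketch prints the index type and dtype for both constructors. The `index_info` dict is only an illustrative name. What it prints depends on the installed pandas version, and newer releases may well report the same index for both:

```python
import pandas as pd

# Compare the index of an empty DataFrame built with and without an empty list
index_info = {}
for label, frame in [("DataFrame()", pd.DataFrame()),
                     ("DataFrame([])", pd.DataFrame([]))]:
    index_info[label] = (type(frame.index).__name__, str(frame.index.dtype))
    print(label, index_info[label])
```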

Unclear to me why the above should affect the behaviour (I'm guessing here).

UPDATE

After revisiting this, I can confirm that this looks to me like a bug in pandas version 0.12.0, as your original code works fine for me:

In [13]:

import pandas as pd
df = pd.DataFrame([])
h = pd.Series(['g',4,2,1,1])
df.append(h,ignore_index=True)

Out[13]:
   0  1  2  3  4
0  g  4  2  1  1

[1 rows x 5 columns]

I am running pandas 0.13.1 and numpy 1.8.1 (64-bit) using python 3.3.5.0. I think the problem is in pandas, but I would upgrade both pandas and numpy to be safe. I don't think this is a 32-bit versus 64-bit python issue.
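
If you want to compare your environment against the versions mentioned here, a small sketch like this reports the relevant details. The `versions` dict is just an illustrative name, and the pointer-size check is one common way to tell a 32-bit from a 64-bit interpreter:

```python
import struct
import sys

import numpy as np
import pandas as pd

versions = {
    "python": sys.version.split()[0],
    "pandas": pd.__version__,
    "numpy": np.__version__,
    # Size of a C pointer in bits: 32 or 64 depending on the interpreter build
    "bits": struct.calcsize("P") * 8,
}
print(versions)
```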

