Python: an elegant way to create an empty pandas DataFrame with NaN of type float

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/30053329/

Time: 2020-08-19 07:54:24  Source: igfitidea

Elegant way to create empty pandas DataFrame with NaN of type float

Tags: python, pandas, numpy, dataframe, nan

Asked by mjd

I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:

import pandas as pd

df = pd.DataFrame(index=range(0,4),columns=['A'])

This code results in a DataFrame filled with NaNs of type "object", so they cannot be used later on, for example with the interpolate() method. Therefore, I created the DataFrame with this more complicated code (inspired by another answer):

import pandas as pd
import numpy as np

dummyarray = np.empty((4,1))
dummyarray[:] = np.nan

df = pd.DataFrame(dummyarray)

This results in a DataFrame filled with NaN of type "float", so it can be used later on with interpolate(). Is there a more elegant way to create the same result?

Accepted answer by ojdo

Simply pass the desired value as the first argument, like 0, math.inf or, here, np.nan. The constructor then initializes the value array and fills it to the size specified by the arguments index and columns:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])

>>> df.dtypes
A    float64
B    float64
dtype: object

>>> df.values
array([[nan, nan],
       [nan, nan],
       [nan, nan],
       [nan, nan]])
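
As a quick check that such a frame works with interpolate(), here is a minimal sketch. The two end-point values written into column 'A' are made up for illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Build a float64 frame filled with NaN, as in the answer above.
df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])

# Set the end points of column 'A' (illustrative values only).
df.loc[0, 'A'] = 1.0
df.loc[3, 'A'] = 4.0

# The default linear interpolation fills the interior NaNs of 'A';
# column 'B' has no known values, so it stays all NaN.
filled = df.interpolate()
print(filled['A'].tolist())
```

Because the columns are already float64, no dtype conversion is needed before interpolating.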

Answer by Alex Riley

You could specify the dtype directly when constructing the DataFrame:

>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A    float64
dtype: object

Specifying the dtype forces Pandas to try creating the DataFrame with that type, rather than trying to infer it.
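
If the frame already exists without a dtype, a related option (not from the original answer, just a sketch) is to convert it afterwards with astype. The names df_obj and df_float are hypothetical:

```python
import pandas as pd

# Frame created without a dtype, as in the question.
df_obj = pd.DataFrame(index=range(0, 4), columns=['A'])

# Convert after the fact; the NaN placeholders become float64 NaN.
df_float = df_obj.astype('float64')
print(df_float.dtypes)
```

Passing dtype to the constructor avoids the extra conversion step, so astype is mainly useful when you do not control how the frame was created.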

Answer by errorParser

Hope this can help!

pd.DataFrame(np.nan, index=np.arange(<num_rows>), columns=['A'])

Answer by Yogesh

You can try this line of code:

pdDataFrame = pd.DataFrame([np.nan] * 7)

This creates a pandas DataFrame with 7 rows and a single column, filled with NaN of type float.

If you print pdDataFrame, the output will be:

     0
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
6   NaN

Also, the output of pdDataFrame.dtypes is:

0    float64
dtype: object
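
The column above is labelled 0 by default. If you want a named column with the same list-based approach, one possible variant is to pass a dict. The names n_rows and named are hypothetical:

```python
import numpy as np
import pandas as pd

n_rows = 7  # hypothetical size, matching the example above

# A dict maps the column name to the list of NaN values;
# pandas infers float64 because np.nan is a float.
named = pd.DataFrame({'A': [np.nan] * n_rows})
print(named.dtypes)
```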

Answer by Digio

For multiple columns you can do:

df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
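
A variant of the same idea, sketched here with made-up sizes and column names, uses np.full to build the NaN array directly instead of multiplying zeros by NaN:

```python
import numpy as np
import pandas as pd

nrow, ncol = 4, 3  # hypothetical dimensions

# np.full allocates the array and fills it with NaN in one step.
values = np.full((nrow, ncol), np.nan)

# Column names are illustrative; omit them to get 0, 1, 2, ...
df = pd.DataFrame(values, columns=['A', 'B', 'C'])
print(df.dtypes)
```

Both versions give a float64 frame; np.full just states the intent more directly.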