Elegant way to create empty pandas DataFrame with NaN of type float
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/30053329/
Asked by mjd
I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:
import pandas as pd
df = pd.DataFrame(index=range(0,4),columns=['A'])
This code results in a DataFrame filled with NaNs of dtype "object", so they cannot be used later on, for example with the interpolate() method. Therefore, I created the DataFrame with this more complicated code (inspired by this answer):
import pandas as pd
import numpy as np
dummyarray = np.empty((4,1))
dummyarray[:] = np.nan
df = pd.DataFrame(dummyarray)
This results in a DataFrame filled with NaN of type "float", so it can be used later on with interpolate(). Is there a more elegant way to create the same result?
Accepted answer by ojdo
Simply pass the desired value as the first argument, like 0, math.inf or, here, np.nan. The constructor then initializes and fills the value array to the size specified by the arguments index and columns:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])
>>> df.dtypes
A    float64
B    float64
dtype: object
>>> df.values
array([[nan, nan],
       [nan, nan],
       [nan, nan],
       [nan, nan]])
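To tie this back to the question, here is a minimal sketch showing that a frame built this way is float64 from the start, so interpolate() can fill the gaps once a few values are set. The endpoint values 1.0 and 4.0 are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Build a 4x1 frame of float NaN by passing the scalar as the first argument.
df = pd.DataFrame(np.nan, index=range(4), columns=['A'])

# Illustrative values only: set the two endpoints, leave the middle as NaN.
df.loc[0, 'A'] = 1.0
df.loc[3, 'A'] = 4.0

# Linear interpolation (the default method) fills the interior NaNs.
filled = df.interpolate()
print(filled)
```

Because the column is already float64, no conversion step is needed before calling interpolate().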
Answer by Alex Riley
You could specify the dtype directly when constructing the DataFrame:
>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A    float64
dtype: object
Specifying the dtype forces Pandas to try creating the DataFrame with that type, rather than trying to infer it.
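As a rough side-by-side sketch of this point, the snippet below builds the same empty frame with and without an explicit dtype. The variable names df_default and df_float are chosen here for illustration. What pandas picks for the default case depends on its inference rules, so only the explicit case is relied on.

```python
import numpy as np
import pandas as pd

# Without dtype, pandas has no data from which to infer a column type.
df_default = pd.DataFrame(index=range(4), columns=['A'])

# With dtype='float', the same empty frame is created as float64.
df_float = pd.DataFrame(index=range(4), columns=['A'], dtype='float')

print(df_default.dtypes)  # result depends on pandas' inference
print(df_float.dtypes)
```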
Answer by errorParser
Hope this can help!
pd.DataFrame(np.nan, index = np.arange(<num_rows>), columns = ['A'])
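If the row count is only known at run time, the one-liner above could be wrapped in a small function. The sketch below assumes a hypothetical helper named nan_frame, which is not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd

def nan_frame(num_rows, columns):
    """Hypothetical helper: return a float64 DataFrame filled with NaN."""
    return pd.DataFrame(np.nan, index=np.arange(num_rows), columns=list(columns))

# Example call with illustrative arguments.
df = nan_frame(3, ['A', 'B'])
print(df.shape)
```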
Answer by Yogesh
You can try this line of code:
pdDataFrame = pd.DataFrame([np.nan] * 7)
This will create a pandas DataFrame with 7 rows and a single column, filled with NaN of type float. If you print pdDataFrame, the output will be:
    0
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
The output of pdDataFrame.dtypes is:
0    float64
dtype: object
Answer by Digio
For multiple columns you can do:
df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
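Multiplying zeros by NaN works because any float times NaN is NaN. A possibly more direct variant, swapped in here and not taken from the original answer, is to fill the array with np.full. The sizes nrow = 3 and ncol = 2 below are illustrative assumptions.

```python
import numpy as np
import pandas as pd

nrow, ncol = 3, 2  # illustrative sizes, not from the original answer

# np.full writes NaN directly instead of multiplying zeros by NaN.
df = pd.DataFrame(np.full((nrow, ncol), np.nan))
print(df.dtypes)
```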