Python pandas.Series() created from DataFrame columns returns NaN data entries
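Notice: this page is a translation of a popular StackOverflow question, licensed under CC BY-SA 4.0. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow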
原文地址: http://stackoverflow.com/questions/35818873/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
pandas.Series() Creation using DataFrame Columns returns NaN Data entries
Asked by nlsdfnbch
I'm attempting to convert a dataframe into a series using code which, simplified, looks like this:
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
ts = pd.Series(df['Value'], index=df['Date'])
print(ts)
However, print output looks like this:
Date
2016-01-01 NaN
2016-01-02 NaN
2016-01-03 NaN
2016-01-04 NaN
2016-01-05 NaN
2016-01-06 NaN
2016-01-07 NaN
2016-01-08 NaN
2016-01-09 NaN
2016-01-10 NaN
2016-01-11 NaN
2016-01-12 NaN
2016-01-13 NaN
2016-01-14 NaN
2016-01-15 NaN
2016-01-16 NaN
2016-01-17 NaN
2016-01-18 NaN
2016-01-19 NaN
2016-01-20 NaN
Name: Value, dtype: float64
Where does NaN come from? Is a view on a DataFrame object not a valid input for the Series class?
I have found the to_series function for pd.Index objects; is there something similar for DataFrames?
Answered by jezrael
I think you can use values, which converts the column Value to an array:
ts = pd.Series(df['Value'].values, index=df['Date'])
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
print(df['Value'].values)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
ts = pd.Series(df['Value'].values, index=df['Date'])
print(ts)
Date
2016-01-01 0
2016-01-02 1
2016-01-03 2
2016-01-04 3
2016-01-05 4
2016-01-06 5
2016-01-07 6
2016-01-08 7
2016-01-09 8
2016-01-10 9
2016-01-11 10
2016-01-12 11
2016-01-13 12
2016-01-14 13
2016-01-15 14
2016-01-16 15
2016-01-17 16
2016-01-18 17
2016-01-19 18
2016-01-20 19
dtype: int64
Or you can use:
ts1 = pd.Series(data=values, index=pd.to_datetime(dates))
print(ts1)
2016-01-01 0
2016-01-02 1
2016-01-03 2
2016-01-04 3
2016-01-05 4
2016-01-06 5
2016-01-07 6
2016-01-08 7
2016-01-09 8
2016-01-10 9
2016-01-11 10
2016-01-12 11
2016-01-13 12
2016-01-14 13
2016-01-15 14
2016-01-16 15
2016-01-17 16
2016-01-18 17
2016-01-19 18
2016-01-20 19
dtype: int64
Thank you @ajcr for a better explanation of why you get NaN:
When you give a Series or DataFrame column to pd.Series, it will reindex it using the index you specify. Since your DataFrame column has an integer index (not a date index), you get lots of missing values.
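The reindexing described above can be checked directly. The following is a minimal sketch on a three-row frame (the shortened data and the variable names aligned, reindexed and positional are made up for illustration); it compares the constructor call from the question with an explicit reindex, and with passing the raw array instead. Here to_numpy() is used as an alternative to .values.

```python
import pandas as pd

dates = pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03'])
df = pd.DataFrame({'Date': dates, 'Value': [0, 1, 2]})

# A Series passed with an index is aligned on labels, like a reindex.
# The column's labels are 0, 1, 2, so no date label matches anything.
aligned = pd.Series(df['Value'], index=df['Date'])
reindexed = df['Value'].reindex(df['Date'])

# A plain array has no labels, so values are assigned by position.
positional = pd.Series(df['Value'].to_numpy(), index=df['Date'])

print(aligned)
print(positional)
```

Both aligned and reindexed end up as all-NaN float columns, while positional keeps the original integers.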
Answered by k-nut
If you are only looking to create a series with those values, you could also have done:
pd.Series(list(range(20)), index=pd.date_range('2016-01-01', periods=20, freq='D'))
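As a quick sanity check, here is a sketch comparing a generated daily range with the parsed date strings from the question. It assumes the range should start on '2016-01-01', the first date in the question's data; the names from_strings and from_range are illustrative only.

```python
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]

# Index built by parsing the question's date strings.
from_strings = pd.to_datetime(dates)

# Index generated directly: 20 consecutive days from the assumed start date.
from_range = pd.date_range('2016-01-01', periods=20, freq='D')

ts = pd.Series(list(range(20)), index=from_range)
print(list(from_strings) == list(from_range))
```

With that start date the two indexes hold the same 20 timestamps, so the generated range can stand in for the parsed column.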
Answered by Alexander
You can just do:
s = df.set_index('Date')
This is now a one-column DataFrame.
If you really want it as a Series:
s = df.set_index('Date').Value
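To make the difference between the two results concrete, here is a small sketch on a three-row frame (the shortened data and the names frame and series are illustrative). It uses bracket selection, ['Value'], which should behave the same as the attribute access above for this column name.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']),
    'Value': [0, 1, 2],
})

# set_index moves 'Date' into the index; the result is still a DataFrame.
frame = df.set_index('Date')

# Selecting the single remaining column yields a Series with the date index.
series = df.set_index('Date')['Value']

print(type(frame).__name__, type(series).__name__)
```

Because the values travel with their rows when the index is set, no realignment happens and no NaN appears.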
Btw, NaN is NumPy's Not-a-Number value.
Using your method, you could use:
ts = pd.Series(df['Value'].values, name='Value', index=df['Date'])
The reason you are getting the NaNs is that you are passing a Series, not a plain array, to pd.Series together with a different index. The data is aligned on the Series' existing integer labels rather than assigned by position, and none of the date labels match.
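The label alignment is easier to see when only some labels are missing. A minimal sketch with made-up values (the names s and partial are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=[0, 1, 2])

# Labels 1 and 2 exist in s, so their values carry over;
# label 5 does not exist, so it becomes NaN.
partial = pd.Series(s, index=[1, 2, 5])

print(partial)
```

In the question, every requested label is a date and every existing label is an integer, so all entries fall into the missing case.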