pandas.Series() Creation using DataFrame Columns returns NaN Data entries

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/35818873/

Time: 2020-08-19 17:00:20  Source: igfitidea

Tags: python, python-3.x, pandas, dataframe, time-series

Asked by nlsdfnbch

I'm attempting to convert a DataFrame into a Series using code which, simplified, looks like this:

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
ts = pd.Series(df['Value'], index=df['Date'])
print(ts)

However, the printed output looks like this:

Date
2016-01-01   NaN
2016-01-02   NaN
2016-01-03   NaN
2016-01-04   NaN
2016-01-05   NaN
2016-01-06   NaN
2016-01-07   NaN
2016-01-08   NaN
2016-01-09   NaN
2016-01-10   NaN
2016-01-11   NaN
2016-01-12   NaN
2016-01-13   NaN
2016-01-14   NaN
2016-01-15   NaN
2016-01-16   NaN
2016-01-17   NaN
2016-01-18   NaN
2016-01-19   NaN
2016-01-20   NaN
Name: Value, dtype: float64

Where does the NaN come from? Is a view on a DataFrame object not a valid input for the Series class?

I have found the to_series function for pd.Index objects; is there something similar for DataFrames?

Answered by jezrael

I think you can use .values, which converts the Value column to a NumPy array:

ts = pd.Series(df['Value'].values, index=df['Date'])

In full:

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
print(df['Value'].values)
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

ts = pd.Series(df['Value'].values, index=df['Date'])
print(ts)
Date
2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64
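
As a side note that is not part of the original answer: in newer pandas versions (0.24 and later), Series.to_numpy() is documented as the preferred way to get the underlying array, and it can be used here in the same way as .values. A minimal sketch, using a shortened three-row frame purely for illustration:

```python
import pandas as pd

# Shortened, illustrative version of the question's data.
df = pd.DataFrame({'Date': ['2016-1-1', '2016-1-2', '2016-1-3'],
                   'Value': [0, 1, 2]})
df['Date'] = pd.to_datetime(df['Date'])

# to_numpy() drops the integer index, so the values are assigned by position.
ts = pd.Series(df['Value'].to_numpy(), index=df['Date'], name='Value')
print(ts.tolist())      # [0, 1, 2]
print(ts.index.name)    # Date
```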

Or you can use:

ts1 = pd.Series(data=values, index=pd.to_datetime(dates))
print(ts1)
2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64

Thanks to @ajcr for a better explanation of why you get NaN:

When you give a Series or DataFrame column to pd.Series, it will reindex it using the index you specify. Since your DataFrame column has an integer index (not a date index), you get lots of missing values.
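
To see this reindexing in isolation, here is a small sketch. The names s, new_index, aligned, and positional are illustrative and not from the original post. It suggests that passing a Series together with index= behaves like Series.reindex, which looks labels up rather than assigning them by position.

```python
import pandas as pd

# A Series with the default integer index 0, 1, 2 (like df['Value']).
s = pd.Series([10, 20, 30])
new_index = pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03'])

# Passing a Series plus index= looks up the new labels in s.index.
# None of the dates exist there, so every entry becomes NaN.
aligned = pd.Series(s, index=new_index)
print(aligned.isna().all())                    # True

# The same result as an explicit reindex.
print(aligned.equals(s.reindex(new_index)))    # True

# Stripping the old index first assigns the values by position.
positional = pd.Series(s.to_numpy(), index=new_index)
print(positional.tolist())                     # [10, 20, 30]
```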

Answered by k-nut

If you are only looking to create a series with those values, you could also have done:

pd.Series(range(20), index=pd.date_range('2016-01-01', periods=20, freq='D'))

Answered by Alexander

You can just do:

s = df.set_index('Date')

This is now a one-column DataFrame.

If you really want it as a Series:

s = df.set_index('Date').Value

By the way, NaN is NumPy's Not-a-Number.

Using your method, you could use:

ts = pd.Series(df['Value'].values, name='Value', index=df['Date'])

The reason you are getting the NaNs is that you are not providing the data in the correct format. You are passing a Series, which carries its own integer index, to the Series constructor, so pandas aligns it against the new index instead of taking the values by position.
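
As a quick sanity check, the .values and set_index approaches from the answers can be compared directly. This sketch rebuilds the question's data and uses only standard pandas calls; the names via_values and via_set_index are illustrative.

```python
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
df = pd.DataFrame({'Date': pd.to_datetime(dates), 'Value': range(20)})

# Approach 1: strip the integer index, then attach the dates.
via_values = pd.Series(df['Value'].values, name='Value', index=df['Date'])

# Approach 2: move the Date column into the index and select the column.
via_set_index = df.set_index('Date')['Value']

print(via_values.equals(via_set_index))    # True
print(via_set_index.isna().sum())          # 0
```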