How do I create a pandas DataFrame (with index or MultiIndex) from a list of namedtuple instances?
Source: Stack Overflow, http://stackoverflow.com/questions/17004985/ (licensed under CC BY-SA 4.0; attribute the original authors when reusing)
Asked by MikeRand
Simple example:
>>> from collections import namedtuple
>>> import pandas
>>> Price = namedtuple('Price', 'ticker date price')
>>> a = Price('GE', '2010-01-01', 30.00)
>>> b = Price('GE', '2010-01-02', 31.00)
>>> l = [a, b]
>>> df = pandas.DataFrame.from_records(l, index='ticker')
Traceback (most recent call last):
...
KeyError: 'ticker'
Harder example:
>>> df2 = pandas.DataFrame.from_records(l, index=['ticker', 'date'])
>>> df2
         0           1   2
ticker  GE  2010-01-01  30
date    GE  2010-01-02  31
Now it treats ['ticker', 'date'] as the index labels themselves, rather than as the names of the columns I want to use as the index.
Is there a way to do this without resorting to an intermediate numpy ndarray or calling set_index after the fact?
Accepted answer by Andy Hayden
To get a Series from a namedtuple, you can use its _fields attribute as the index:
In [11]: pd.Series(a, a._fields)
Out[11]:
ticker            GE
date      2010-01-01
price             30
dtype: object
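For reference, here is a minimal, self-contained sketch of the same idea. The `_asdict()` variant is not part of the original answer. It is shown only as an alternative that should give the same labels, assuming a reasonably recent pandas.

```python
from collections import namedtuple

import pandas as pd

Price = namedtuple('Price', 'ticker date price')
a = Price('GE', '2010-01-01', 30.00)

# Approach from the answer: use the field names as the Series index.
s_fields = pd.Series(a, index=list(a._fields))

# Alternative (an addition, not from the answer): build from the mapping
# that namedtuple._asdict() returns, so labels come along automatically.
s_dict = pd.Series(a._asdict())

print(s_fields)
print(s_dict)
```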
Similarly you can create a DataFrame like this:
In [12]: df = pd.DataFrame(l, columns=l[0]._fields)
In [13]: df
Out[13]:
  ticker        date  price
0     GE  2010-01-01     30
1     GE  2010-01-02     31
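As a runnable sketch of this step: the snippet below builds the frame with explicit column names taken from the namedtuple class. It also shows that recent pandas versions appear to pick up the field names on their own when given a list of namedtuples. That second behaviour is an assumption about your installed version, so passing `columns` explicitly is the safer choice.

```python
from collections import namedtuple

import pandas as pd

Price = namedtuple('Price', 'ticker date price')
records = [
    Price('GE', '2010-01-01', 30.00),
    Price('GE', '2010-01-02', 31.00),
]

# Explicit column names, as in the answer (taken from the class here
# rather than from the first element, so an empty list would not fail
# on indexing).
df_explicit = pd.DataFrame(records, columns=list(Price._fields))

# Assumption: recent pandas infers the columns from the first namedtuple.
df_inferred = pd.DataFrame(records)

print(df_explicit)
print(list(df_inferred.columns))
```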
You have to call set_index after the fact, but you can do this in place by passing inplace=True:
In [14]: df.set_index(['ticker', 'date'], inplace=True)
In [15]: df
Out[15]:
                   price
ticker date
GE     2010-01-01     30
       2010-01-02     31
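Two variations that may be worth trying, neither of which comes from the original answer. First, `set_index` can be chained onto the constructor instead of using `inplace=True`. Second, `DataFrame.from_records` seems to resolve `index=['ticker', 'date']` as column names once `columns` is supplied, which would avoid the separate `set_index` call the question asks about. Treat the second as a sketch to verify against your pandas version.

```python
from collections import namedtuple

import pandas as pd

Price = namedtuple('Price', 'ticker date price')
records = [
    Price('GE', '2010-01-01', 30.00),
    Price('GE', '2010-01-02', 31.00),
]
fields = list(Price._fields)

# Variation 1: chain set_index instead of mutating in place.
df_chained = pd.DataFrame(records, columns=fields).set_index(['ticker', 'date'])

# Variation 2: give from_records the column names, so that the index
# argument can refer to those columns rather than being used as labels.
df_records = pd.DataFrame.from_records(
    records, columns=fields, index=['ticker', 'date']
)

print(df_chained)
print(df_records)
```

Both frames should end up with a two-level index named ticker and date and a single price column.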

