Convert Python dict into a dataframe
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/18837262/
Asked by anonuser0428
I have a Python dictionary like the following:
{u'2012-06-08': 388,
u'2012-06-09': 388,
u'2012-06-10': 388,
u'2012-06-11': 389,
u'2012-06-12': 389,
u'2012-06-13': 389,
u'2012-06-14': 389,
u'2012-06-15': 389,
u'2012-06-16': 389,
u'2012-06-17': 389,
u'2012-06-18': 390,
u'2012-06-19': 390,
u'2012-06-20': 390,
u'2012-06-21': 390,
u'2012-06-22': 390,
u'2012-06-23': 390,
u'2012-06-24': 390,
u'2012-06-25': 391,
u'2012-06-26': 391,
u'2012-06-27': 391,
u'2012-06-28': 391,
u'2012-06-29': 391,
u'2012-06-30': 391,
u'2012-07-01': 391,
u'2012-07-02': 392,
u'2012-07-03': 392,
u'2012-07-04': 392,
u'2012-07-05': 392,
u'2012-07-06': 392}
The keys are Unicode dates and the values are integers. I would like to convert this into a pandas dataframe with the dates and their corresponding values as two separate columns. Example: col1: Date, col2: DateValue (the dates are still Unicode and the date values are still integers).
Date DateValue
0 2012-07-01 391
1 2012-07-02 392
2 2012-07-03 392
. 2012-07-04 392
. ... ...
. ... ...
Any help in this direction would be much appreciated. I am unable to find resources on the pandas docs to help me with this.
I know one solution might be to convert each key-value pair in this dict, into a dict so the entire structure becomes a dict of dicts, and then we can add each row individually to the dataframe. But I want to know if there is an easier way and a more direct way to do this.
So far I have tried converting the dict into a series object but this doesn't seem to maintain the relationship between the columns:
s = Series(my_dict,index=my_dict.keys())
Accepted answer by Andy Hayden
The error here arises from calling the DataFrame constructor with scalar values (it expects the values to be a list/dict/..., i.e. to have multiple columns):
pd.DataFrame(d)
ValueError: If using all scalar values, you must must pass an index
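To make the failure mode concrete, here is a minimal sketch (the three-entry dict d is an illustrative stand-in for the question's data, not the full dictionary). A dict of scalars gives pandas no row index to infer, so the constructor raises. Passing an explicit index avoids the error, but it yields a single wide row with the keys as columns, which is probably not the two-column shape the question asks for.

```python
import pandas as pd

# Illustrative stand-in for the question's dictionary
d = {'2012-06-08': 388, '2012-06-09': 388, '2012-06-11': 389}

try:
    pd.DataFrame(d)  # all values are scalars, so no index can be inferred
    raised = False
except ValueError:
    raised = True

# Supplying an index works, but the keys become columns (one wide row)
wide = pd.DataFrame(d, index=[0])
print(raised, wide.shape)
```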
You could take the items from the dictionary (i.e. the key-value pairs):
In [11]: pd.DataFrame(d.items()) # or list(d.items()) in python 3
Out[11]:
0 1
0 2012-07-02 392
1 2012-07-06 392
2 2012-06-29 391
3 2012-06-28 391
...
In [12]: pd.DataFrame(d.items(), columns=['Date', 'DateValue'])
Out[12]:
Date DateValue
0 2012-07-02 392
1 2012-07-06 392
2 2012-06-29 391
But I think it makes more sense to pass the dictionary to the Series constructor:
In [21]: s = pd.Series(d, name='DateValue')
Out[21]:
2012-06-08 388
2012-06-09 388
2012-06-10 388
In [22]: s.index.name = 'Date'
In [23]: s.reset_index()
Out[23]:
Date DateValue
0 2012-06-08 388
1 2012-06-09 388
2 2012-06-10 388
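The Series route above can also be written as one chained expression. This is a sketch under the assumption that a shortened three-entry dict is representative; rename_axis is used here in place of assigning s.index.name, which should give the same result.

```python
import pandas as pd

# Shortened stand-in for the question's dictionary
d = {'2012-06-08': 388, '2012-06-09': 388, '2012-06-10': 388}

df = (
    pd.Series(d, name='DateValue')  # keys become the index, values the data
    .rename_axis('Date')            # name the index so reset_index labels it
    .reset_index()                  # move the index into a regular column
)
print(df)
```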
Answer by Viktor Kerkez
Pass the items of the dictionary to the DataFrame constructor, and give the column names. After that, parse the Date column to get Timestamp values.
Note the difference between Python 2.x and 3.x:
In Python 2.x:
df = pd.DataFrame(data.items(), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])
In Python 3.x (requiring an additional 'list' call):
df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])
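As a rough illustration of why parsing the dates may be worthwhile, the sketch below (using a shortened, deliberately unsorted stand-in dict) checks the resulting dtype and filters by date. The variable name recent is only illustrative.

```python
import pandas as pd

# Shortened, unsorted stand-in for the question's dictionary
data = {'2012-06-11': 389, '2012-06-08': 388, '2012-06-09': 388}

df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])  # strings -> datetime64 values

# With real datetimes, sorting and comparisons are chronological
df = df.sort_values('Date').reset_index(drop=True)
recent = df[df['Date'] >= pd.Timestamp('2012-06-09')]
print(df.dtypes)
```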
Answer by firstly
Accepts a dict as argument and returns a dataframe with the keys of the dict as index and values as a column.
import pandas as pd

def dict_to_df(d):
    # list() keeps this working on Python 3, where items() returns a view
    df = pd.DataFrame(list(d.items()))
    df.set_index(0, inplace=True)
    return df
Answer by ntg
As explained in another answer, using pandas.DataFrame() directly here will not act as you think.
What you can do is use pandas.DataFrame.from_dict with orient='index':
In[7]: pandas.DataFrame.from_dict({u'2012-06-08': 388,
u'2012-06-09': 388,
u'2012-06-10': 388,
u'2012-06-11': 389,
u'2012-06-12': 389,
.....
u'2012-07-05': 392,
u'2012-07-06': 392}, orient='index', columns=['foo'])
Out[7]:
foo
2012-06-08 388
2012-06-09 388
2012-06-10 388
2012-06-11 389
2012-06-12 389
........
2012-07-05 392
2012-07-06 392
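If the dates should end up as a regular column rather than the index, one possible follow-up is sketched below. It assumes a pandas version where from_dict accepts columns together with orient='index' (0.23 or later, as far as I know), and the column names are taken from the question.

```python
import pandas as pd

# Shortened stand-in for the question's dictionary
d = {'2012-06-08': 388, '2012-06-09': 388, '2012-06-11': 389}

df = pd.DataFrame.from_dict(d, orient='index', columns=['DateValue'])
df = df.rename_axis('Date').reset_index()  # index -> 'Date' column
print(df)
```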
Answer by Nader Hisham
pd.DataFrame({'date': list(dict_dates.keys()), 'date_value': list(dict_dates.values())})
Answer by Blairg23
You can also just pass the keys and values of the dictionary to the new dataframe, like so:
import pandas as pd
myDict = {<the_dict_from_your_example>}
df = pd.DataFrame()
df['Date'] = list(myDict.keys())  # list() for Python 3 dict views
df['DateValue'] = list(myDict.values())
Answer by Bryan Butler
I have run into this several times and have an example dictionary that I created from a function get_max_path(), which returns the sample dictionary:
{2: 0.3097502930247044,
3: 0.4413177909384636,
4: 0.5197224051562838,
5: 0.5717654946470984,
6: 0.6063959031223476,
7: 0.6365209824708223,
8: 0.655918861281035,
9: 0.680844386645206}
To convert this to a dataframe, I ran the following:
df = pd.DataFrame.from_dict(get_max_path(2), orient = 'index').reset_index()
This returns a simple two-column dataframe with a separate index:
index 0
0 2 0.309750
1 3 0.441318
Then just rename the columns using df.rename(columns={'index': 'Column1', 0: 'Column2'}, inplace=True)
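Put together, the steps in this answer might look like the sketch below. The scores dict is a hypothetical stand-in for whatever get_max_path(2) returns, since that function is not shown in the answer.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary returned by get_max_path(2)
scores = {2: 0.3097502930247044, 3: 0.4413177909384636}

# orient='index' puts the keys in the index; reset_index moves them
# into a column that pandas names 'index', next to the value column 0
df = pd.DataFrame.from_dict(scores, orient='index').reset_index()
df = df.rename(columns={'index': 'Column1', 0: 'Column2'})
print(df)
```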
Answer by Artem Zaika
In my case I wanted the keys and values of a dict to be the columns and values of a DataFrame. So the only thing that worked for me was:
import numpy as np
import pandas as pd

data = {'adjust_power': 'y', 'af_policy_r_submix_prio_adjust': '[null]', 'af_rf_info': '[null]', 'bat_ac': '3500', 'bat_capacity': '75'}
columns = list(data.keys())
values = list(data.values())
arr_len = len(values)
pd.DataFrame(np.array(values, dtype=object).reshape(1, arr_len), columns=columns)
Answer by cheevahagadog
When converting a dictionary into a pandas dataframe where you want the keys to be the columns of said dataframe and the values to be the row values, you can simply put brackets around the dictionary like this:
>>> dict_ = {'key 1': 'value 1', 'key 2': 'value 2', 'key 3': 'value 3'}
>>> pd.DataFrame([dict_])
key 1 key 2 key 3
0 value 1 value 2 value 3
It's saved me some headaches so I hope it helps someone out there!
EDIT: In the pandas docs, one option for the data parameter in the DataFrame constructor is a list of dictionaries. Here we're passing a list with one dictionary in it.
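The same list-of-dictionaries form extends to several rows. In the sketch below (the rows list and its contents are made up for illustration), each dict becomes one row, and a key missing from a row is filled with NaN.

```python
import pandas as pd

# Made-up rows for illustration; the second dict has no 'key 2'
rows = [
    {'key 1': 'a', 'key 2': 'b'},
    {'key 1': 'c', 'key 3': 'd'},
]

df = pd.DataFrame(rows)  # one row per dict, columns from the union of keys
print(df)
```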
Answer by Suat Atan PhD
Pandas has a built-in function for converting a dict to a data frame.
pd.DataFrame.from_dict(dictionaryObject,orient='index')
For your data you can convert it like below:
import pandas as pd
your_dict={u'2012-06-08': 388,
u'2012-06-09': 388,
u'2012-06-10': 388,
u'2012-06-11': 389,
u'2012-06-12': 389,
u'2012-06-13': 389,
u'2012-06-14': 389,
u'2012-06-15': 389,
u'2012-06-16': 389,
u'2012-06-17': 389,
u'2012-06-18': 390,
u'2012-06-19': 390,
u'2012-06-20': 390,
u'2012-06-21': 390,
u'2012-06-22': 390,
u'2012-06-23': 390,
u'2012-06-24': 390,
u'2012-06-25': 391,
u'2012-06-26': 391,
u'2012-06-27': 391,
u'2012-06-28': 391,
u'2012-06-29': 391,
u'2012-06-30': 391,
u'2012-07-01': 391,
u'2012-07-02': 392,
u'2012-07-03': 392,
u'2012-07-04': 392,
u'2012-07-05': 392,
u'2012-07-06': 392}
your_df_from_dict=pd.DataFrame.from_dict(your_dict,orient='index')
print(your_df_from_dict)