Notice: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18837262/

Time: 2020-08-19 12:00:59  Source: igfitidea

Convert Python dict into a dataframe

python pandas dataframe

Asked by anonuser0428

I have a Python dictionary like the following:

{u'2012-06-08': 388,
 u'2012-06-09': 388,
 u'2012-06-10': 388,
 u'2012-06-11': 389,
 u'2012-06-12': 389,
 u'2012-06-13': 389,
 u'2012-06-14': 389,
 u'2012-06-15': 389,
 u'2012-06-16': 389,
 u'2012-06-17': 389,
 u'2012-06-18': 390,
 u'2012-06-19': 390,
 u'2012-06-20': 390,
 u'2012-06-21': 390,
 u'2012-06-22': 390,
 u'2012-06-23': 390,
 u'2012-06-24': 390,
 u'2012-06-25': 391,
 u'2012-06-26': 391,
 u'2012-06-27': 391,
 u'2012-06-28': 391,
 u'2012-06-29': 391,
 u'2012-06-30': 391,
 u'2012-07-01': 391,
 u'2012-07-02': 392,
 u'2012-07-03': 392,
 u'2012-07-04': 392,
 u'2012-07-05': 392,
 u'2012-07-06': 392}

The keys are Unicode dates and the values are integers. I would like to convert this into a pandas dataframe with the dates and their corresponding values as two separate columns. Example: col1: Date, col2: DateValue (the dates are still Unicode and the date values are still integers).

     Date         DateValue
0    2012-07-01    391
1    2012-07-02    392
2    2012-07-03    392
.    2012-07-04    392
.    ...           ...
.    ...           ...

Any help in this direction would be much appreciated. I am unable to find resources in the pandas docs to help me with this.

I know one solution might be to convert each key-value pair in this dict into a dict, so that the entire structure becomes a dict of dicts, and then add each row individually to the dataframe. But I want to know if there is an easier and more direct way to do this.

So far I have tried converting the dict into a Series object, but this doesn't seem to maintain the relationship between the columns:

s = Series(my_dict, index=my_dict.keys())

Accepted answer by Andy Hayden

The error here arises because the DataFrame constructor is being called with scalar values (it expects the values to be a list/dict/..., i.e. to have multiple columns):

pd.DataFrame(d)
ValueError: If using all scalar values, you must pass an index
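
A minimal sketch of that failure, assuming a shortened three-entry version of the dictionary from the question. It also shows that passing an index makes the error go away, although the result then has one column per key, which is not the layout asked for:

```python
import pandas as pd

# Shortened sample of the dictionary from the question (an assumption for brevity).
d = {'2012-06-08': 388, '2012-06-09': 388, '2012-06-10': 389}

# All values are scalars, so pandas cannot infer how many rows to create.
try:
    pd.DataFrame(d)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Supplying an index resolves the ambiguity, but each key becomes a column.
one_row = pd.DataFrame(d, index=[0])
print(one_row.shape)
```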

You could take the items from the dictionary (i.e. the key-value pairs):

In [11]: pd.DataFrame(d.items())  # or list(d.items()) in python 3
Out[11]:
             0    1
0   2012-07-02  392
1   2012-07-06  392
2   2012-06-29  391
3   2012-06-28  391
...

In [12]: pd.DataFrame(d.items(), columns=['Date', 'DateValue'])
Out[12]:
          Date  DateValue
0   2012-07-02        392
1   2012-07-06        392
2   2012-06-29        391

But I think it makes more sense to use the Series constructor:

In [21]: s = pd.Series(d, name='DateValue')
Out[21]:
2012-06-08    388
2012-06-09    388
2012-06-10    388

In [22]: s.index.name = 'Date'

In [23]: s.reset_index()
Out[23]:
          Date  DateValue
0   2012-06-08        388
1   2012-06-09        388
2   2012-06-10        388
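
The Series route can also be written as one chained expression. This is only a sketch on a shortened sample dictionary; rename_axis is used here in place of assigning s.index.name, which should give the same result:

```python
import pandas as pd

# Shortened sample of the dictionary from the question (an assumption for brevity).
d = {'2012-06-08': 388, '2012-06-09': 388, '2012-06-10': 389}

df = (
    pd.Series(d, name='DateValue')  # dict keys become the index
      .rename_axis('Date')          # name the index so it turns into the 'Date' column
      .reset_index()                # move the index into a regular column
)
print(df)
```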

Answer by Viktor Kerkez

Pass the items of the dictionary to the DataFrame constructor, and give the column names. After that, parse the Date column to get Timestamp values.

Note the difference between Python 2.x and 3.x:

In Python 2.x:

df = pd.DataFrame(data.items(), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])

In Python 3.x (requires an additional list() call):

df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])
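
To illustrate what parsing the Date column buys you, here is a small sketch on a three-entry sample (the sample values and the span_days name are illustrative assumptions). Once the column holds real timestamps, chronological sorting and date arithmetic work directly:

```python
import pandas as pd

# Deliberately unordered sample taken from the question's dictionary.
data = {'2012-07-02': 392, '2012-06-29': 391, '2012-06-08': 388}

df = pd.DataFrame(list(data.items()), columns=['Date', 'DateValue'])
df['Date'] = pd.to_datetime(df['Date'])  # strings -> Timestamp values

# With real datetimes, sorting is chronological and subtraction yields a Timedelta.
df = df.sort_values('Date').reset_index(drop=True)
span_days = (df['Date'].max() - df['Date'].min()).days
print(df)
print(span_days)
```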

Answer by firstly

Accepts a dict as argument and returns a dataframe with the keys of the dict as the index and the values as a column.

import pandas as pd

def dict_to_df(d):
    # Build a two-column frame from the (key, value) pairs, then index it by key.
    df = pd.DataFrame(list(d.items()))
    df.set_index(0, inplace=True)
    return df
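
One reason to keep the keys as the index is label-based lookup. The sketch below uses a hypothetical variant, dict_to_named_df, which is not part of the answer above; it follows the same idea but gives the index and the value column explicit names:

```python
import pandas as pd

def dict_to_named_df(d, index_name='Date', value_name='DateValue'):
    # Hypothetical variant of dict_to_df: same idea, but with explicit names.
    df = pd.DataFrame(list(d.items()), columns=[index_name, value_name])
    return df.set_index(index_name)

df = dict_to_named_df({'2012-06-08': 388, '2012-06-09': 388, '2012-06-11': 389})
value = df.loc['2012-06-11', 'DateValue']  # look up a value by its key
print(value)
```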

Answer by ntg

As explained in another answer, using pandas.DataFrame() directly here will not act as you think.

What you can do is use pandas.DataFrame.from_dict with orient='index':

In[7]: pandas.DataFrame.from_dict({u'2012-06-08': 388,
 u'2012-06-09': 388,
 u'2012-06-10': 388,
 u'2012-06-11': 389,
 u'2012-06-12': 389,
 .....
 u'2012-07-05': 392,
 u'2012-07-06': 392}, orient='index', columns=['foo'])
Out[7]: 
            foo
2012-06-08  388
2012-06-09  388
2012-06-10  388
2012-06-11  389
2012-06-12  389
........
2012-07-05  392
2012-07-06  392

Answer by Nader Hisham

pd.DataFrame({'date': list(dict_dates.keys()), 'date_value': list(dict_dates.values())})

Answer by Blairg23

You can also just pass the keys and values of the dictionary to the new dataframe, like so:

import pandas as pd

myDict = {<the_dict_from_your_example>}
df = pd.DataFrame()
# list() turns the dict views returned by keys()/values() on Python 3 into plain lists
df['Date'] = list(myDict.keys())
df['DateValue'] = list(myDict.values())

Answer by Bryan Butler

I have run into this several times and have an example dictionary that I created from a function, get_max_path(), which returns the sample dictionary:

{2: 0.3097502930247044, 3: 0.4413177909384636, 4: 0.5197224051562838, 5: 0.5717654946470984, 6: 0.6063959031223476, 7: 0.6365209824708223, 8: 0.655918861281035, 9: 0.680844386645206}

To convert this to a dataframe, I ran the following:

df = pd.DataFrame.from_dict(get_max_path(2), orient = 'index').reset_index()

Returns a simple two-column dataframe with a separate index:

   index         0
0      2  0.309750
1      3  0.441318

Then just rename the columns using df.rename(columns={'index': 'Column1', 0: 'Column2'}, inplace=True)
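
The whole sequence can be sketched as one chain. The scores dictionary is a literal stand-in for the output of get_max_path(), which is not shown in the answer, and rename is used without inplace=True so that it can be chained:

```python
import pandas as pd

# Stand-in for the dictionary returned by get_max_path() (first three entries only).
scores = {2: 0.3097502930247044, 3: 0.4413177909384636, 4: 0.5197224051562838}

df = (
    pd.DataFrame.from_dict(scores, orient='index')  # keys -> index, values -> column 0
      .reset_index()                                # index -> column named 'index'
      .rename(columns={'index': 'Column1', 0: 'Column2'})
)
print(df)
```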

Answer by Artem Zaika

In my case I wanted keys and values of a dict to be columns and values of DataFrame. So the only thing that worked for me was:

import numpy as np
import pandas as pd

data = {'adjust_power': 'y', 'af_policy_r_submix_prio_adjust': '[null]', 'af_rf_info': '[null]', 'bat_ac': '3500', 'bat_capacity': '75'}

columns = list(data.keys())
values = list(data.values())
arr_len = len(values)

# Reshape the values into a single row so that each key becomes a column.
pd.DataFrame(np.array(values, dtype=object).reshape(1, arr_len), columns=columns)

Answer by cheevahagadog

When converting a dictionary into a pandas dataframe where you want the keys to be the columns of said dataframe and the values to be the row values, you can simply put brackets around the dictionary like this:

>>> dict_ = {'key 1': 'value 1', 'key 2': 'value 2', 'key 3': 'value 3'}
>>> pd.DataFrame([dict_])

    key 1     key 2     key 3
0   value 1   value 2   value 3

It's saved me some headaches so I hope it helps someone out there!

EDIT: In the pandas docs, one option for the data parameter of the DataFrame constructor is a list of dictionaries. Here we're passing a list with one dictionary in it.
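
The same list-of-dictionaries form extends to several rows. In this sketch the two sample dictionaries are made up for illustration; each dictionary becomes one row, and a key missing from a row shows up as NaN:

```python
import pandas as pd

# Illustrative rows; the second one deliberately lacks 'key 2'.
rows = [
    {'key 1': 'a1', 'key 2': 'a2'},
    {'key 1': 'b1', 'key 3': 'b3'},
]

df = pd.DataFrame(rows)  # one row per dictionary, columns from the union of keys
print(df)
```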

Answer by Suat Atan PhD

Pandas has a built-in function for converting a dict to a data frame.

pd.DataFrame.from_dict(dictionaryObject,orient='index')

For your data you can convert it like below:

import pandas as pd
your_dict={u'2012-06-08': 388,
 u'2012-06-09': 388,
 u'2012-06-10': 388,
 u'2012-06-11': 389,
 u'2012-06-12': 389,
 u'2012-06-13': 389,
 u'2012-06-14': 389,
 u'2012-06-15': 389,
 u'2012-06-16': 389,
 u'2012-06-17': 389,
 u'2012-06-18': 390,
 u'2012-06-19': 390,
 u'2012-06-20': 390,
 u'2012-06-21': 390,
 u'2012-06-22': 390,
 u'2012-06-23': 390,
 u'2012-06-24': 390,
 u'2012-06-25': 391,
 u'2012-06-26': 391,
 u'2012-06-27': 391,
 u'2012-06-28': 391,
 u'2012-06-29': 391,
 u'2012-06-30': 391,
 u'2012-07-01': 391,
 u'2012-07-02': 392,
 u'2012-07-03': 392,
 u'2012-07-04': 392,
 u'2012-07-05': 392,
 u'2012-07-06': 392}

your_df_from_dict=pd.DataFrame.from_dict(your_dict,orient='index')
print(your_df_from_dict)