Python: How to convert a list of model objects to a pandas DataFrame?

Question

Asked by ezamur

I have an array of objects of this class:

class CancerDataEntity(Model):

    age = columns.Text(primary_key=True)
    gender = columns.Text(primary_key=True)
    cancer = columns.Text(primary_key=True)
    deaths = columns.Integer()
    ...

When printed, the array looks like this:

[CancerDataEntity(age=u'80-85+', gender=u'Female', cancer=u'All cancers (C00-97,B21)', deaths=15306), CancerDataEntity(...

I want to convert this to a DataFrame so I can work with it more conveniently: aggregate, count, sum, and so on. I would like the DataFrame to look something like this:

     age     gender     cancer     deaths
0    80-85+  Female     ...        15306
1    ...

Is there an easy way to do this with numpy/pandas, without manually processing the input array?

Answer 1

Accepted answer by ezamur

Code that produces the desired result:

import pandas as pd
# arr is the list of CancerDataEntity objects from the question
variables = arr[0].keys()
df = pd.DataFrame([[getattr(i, j) for j in variables] for i in arr], columns=variables)

Thanks to @Serbitar for pointing me in the right direction.
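The snippet above relies on the `keys()` method of the model objects. Below is a minimal self-contained sketch of the same pattern. It assumes a hypothetical stand-in class, `FakeEntity`, which is not part of cqlengine and only mimics `keys()`, so that the example runs without a Cassandra connection. The helper name `entities_to_frame` and the second sample row are also made up for illustration.

```python
import pandas as pd


class FakeEntity:
    """Hypothetical stand-in for the model class; only mimics keys()."""

    _columns = ["age", "gender", "cancer", "deaths"]

    def __init__(self, age, gender, cancer, deaths):
        self.age = age
        self.gender = gender
        self.cancer = cancer
        self.deaths = deaths

    def keys(self):
        # Return the column names in declaration order.
        return list(self._columns)


def entities_to_frame(entities):
    """Build one row per entity and one column per key."""
    if not entities:
        return pd.DataFrame()
    variables = list(entities[0].keys())
    rows = [[getattr(e, name) for name in variables] for e in entities]
    return pd.DataFrame(rows, columns=variables)


arr = [
    FakeEntity("80-85+", "Female", "All cancers (C00-97,B21)", 15306),
    # Made-up sample row, for illustration only.
    FakeEntity("80-85+", "Male", "All cancers (C00-97,B21)", 12000),
]
df = entities_to_frame(arr)
print(df)
```

Wrapping `keys()` in `list(...)` keeps the column labels as a plain list even if `keys()` returns a view, and the empty-list guard avoids an `IndexError` on `entities[0]`.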

Answer 2

Answered by Serbitar

Try:

import pandas
variables = list(array[0].keys())
dataframe = pandas.DataFrame([[getattr(i, j) for j in variables] for i in array], columns=variables)

Answer 3

Answered by OregonTrail

A much cleaner way to do this is to define a to_dict method on your class and then use pandas.DataFrame.from_records:

class Signal(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_dict(self):
        return {
            'x': self.x,
            'y': self.y,
        }

For example:

In [87]: signals = [Signal(3, 9), Signal(4, 16)]

In [88]: pandas.DataFrame.from_records([s.to_dict() for s in signals])
Out[88]:
   x   y
0  3   9
1  4  16
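Since the question mentions aggregating and summing, here is a short sketch of what becomes possible once the objects are in a DataFrame. It reuses the `Signal` class from this answer. The third sample object and the variable name `totals` are assumptions added for illustration.

```python
import pandas as pd


class Signal:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_dict(self):
        return {"x": self.x, "y": self.y}


# Two objects share x == 3 so that the grouping has something to combine.
signals = [Signal(3, 9), Signal(4, 16), Signal(3, 1)]
df = pd.DataFrame.from_records([s.to_dict() for s in signals])

# Sum y for each distinct value of x.
totals = df.groupby("x")["y"].sum()
print(totals)
```

Because `to_dict` lists the fields explicitly, it also controls which attributes become columns and what they are called.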

Answer 4

Answered by Shital Shah

Just use:

pd.DataFrame([o.__dict__ for o in my_objs])

Full example:

import pandas as pd

# define some class
class SomeThing:
    def __init__(self, x, y):
        self.x, self.y = x, y

# make an array of the class objects
things = [SomeThing(1,2), SomeThing(3,4), SomeThing(4,5)]

# fill dataframe with one row per object, one attribute per column
df = pd.DataFrame([t.__dict__ for t in things ])

print(df)

This prints:

   x  y
0  1  2
1  3  4
2  4  5

Answer 5

Answered by typhon04

I would like to emphasize Jim Hunziker's comment.

pandas.DataFrame([vars(s) for s in signals])

It is far easier to write and less error-prone, and you don't have to change the to_dict() function every time you add a new attribute.

If you want the freedom to choose which attributes to keep, you can use the columns parameter.

pandas.DataFrame([vars(s) for s in signals], columns=['x', 'y'])
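A short sketch of how `columns` behaves with a list of dicts. It assumes a variant of `Signal` with an extra `note` attribute, which is hypothetical and added only for illustration. The names listed in `columns` are kept in the given order and the remaining attributes are left out.

```python
import pandas as pd


class Signal:
    def __init__(self, x, y, note):
        self.x = x
        self.y = y
        self.note = note  # hypothetical extra attribute


signals = [Signal(3, 9, "first"), Signal(4, 16, "second")]

# Keep only y and x, in that order; note is not included.
df = pd.DataFrame([vars(s) for s in signals], columns=["y", "x"])
print(df)
```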

The downside is that it won't work for complex attributes, though that should rarely be an issue.
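The answer does not say what counts as a complex attribute. One possible reading, offered here as an assumption, is attributes that hold other objects, or values computed by a property. The sketch below uses hypothetical `Point` and `Signal` classes to show that `vars()` returns only stored instance attributes, so a nested object lands in a single cell and a property is missing entirely.

```python
import pandas as pd


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Signal:
    def __init__(self, name, point):
        self.name = name
        self.point = point  # nested object, not a scalar

    @property
    def label(self):
        # Computed on access; not stored in the instance __dict__.
        return self.name.upper()


signals = [Signal("a", Point(1, 2)), Signal("b", Point(3, 4))]

# vars() only sees stored instance attributes: name and point.
raw = pd.DataFrame([vars(s) for s in signals])

# Flatten by hand when separate columns are wanted.
flat = pd.DataFrame(
    [
        {"name": s.name, "label": s.label, "x": s.point.x, "y": s.point.y}
        for s in signals
    ]
)
print(raw.columns.tolist())
print(flat)
```

In cases like this, an explicit `to_dict` method, as in Answer 3, is one way to decide how nested values are flattened.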
