Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44219023/

Time: 2020-09-14 03:42:07  Source: igfitidea

Python DataFrame or list for storing objects

Tags: python, pandas, numpy

Asked by Demaunt

Can I "store" instances of a class in a pandas/numpy Series, DataFrame, or ndarray just as I do in a list? Or do these libraries support only built-in types (numerics, strings)?

For example, I have a Point with x, y coordinates, and I want to store Points in a Plane that returns the Point with the given coordinates.

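# my class
class MyPoint:

    def __init__(self, x, y):
        # store the values in private attributes; assigning to self.x would
        # fail because x is a read-only property
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y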

Here I create instances:

first_point = MyPoint(1, 1)
second_point = MyPoint(2, 2)

I can store the instances in a list:

my_list = []
my_list.append(first_point)
my_list.append(second_point)

The problem with a list is that its indexes do not correspond to the x, y properties.

Dictionary/DataFrame approach:

Plane = {"x": [first_point.x, second_point.x], "y": [first_point.y, second_point.y], "some_reference/id_to_point_instance": ???}
Plane_pd = pd.DataFrame(Plane)

I've read posts saying that using the id of an instance as the third column value in a DataFrame could cause problems with the garbage collector.

Answered by Stephen Rauch

A pandas.DataFrame will gladly store Python objects.

Some test code to demonstrate...

Test Code:

import pandas as pd

class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

my_list = [MyPoint(1, 1), MyPoint(2, 2)]
print(my_list)

plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))
print(plane_pd.dtypes)
print(plane_pd)

Results:

[<__main__.MyPoint object at 0x033D2AF0>, <__main__.MyPoint object at 0x033D2B10>]

X     int64
Y     int64
O    object
dtype: object

   X  Y                                        O
0  1  1  <__main__.MyPoint object at 0x033D2AF0>
1  2  2  <__main__.MyPoint object at 0x033D2B10>

Notes:

Note that the two objects in the list are the same two objects in the DataFrame. Also note that the dtype of the O column is object.
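
The identity claim in the note can be checked directly with `is`, and the same frame can serve the lookup the question asks for. The following is only a minimal sketch under assumptions: `point_at` is a hypothetical helper name (not part of pandas or of the original answer), and it assumes the X, Y, O column layout from the test code above.

```python
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))

# The object column holds references to the original instances, not copies.
same_object = plane_pd.loc[0, 'O'] is my_list[0]
print(same_object)


def point_at(plane, x, y):
    """Hypothetical helper: return the stored MyPoint at (x, y), or None."""
    # Boolean mask selects rows whose coordinates match, then picks column O.
    matches = plane.loc[(plane['X'] == x) & (plane['Y'] == y), 'O']
    return matches.iloc[0] if not matches.empty else None


found = point_at(plane_pd, 2, 2)
print(found is my_list[1])
```

Because the O column stores references, the frame keeps the instances alive for as long as the frame itself exists, which is different from storing only the integer returned by id(). Operations on an object column run at Python speed rather than vectorized speed, so this layout is probably best suited to modest numbers of points.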