Python DataFrame or list for storing objects
Original question: http://stackoverflow.com/questions/44219023/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Asked by Demaunt
Can I "store" instances of a class in a pandas/numpy Series, DataFrame or ndarray just like I do in a list? Or do these libraries support only built-in types (numerics, strings)?
For example, I have a Point with x, y coordinates, and I want to store Points in a Plane that would return the Point with the given coordinates.
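# my class
class MyPoint:
    def __init__(self, x, y):
        # keep the values in private attributes so the read-only
        # properties below do not call themselves recursively
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y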
Here I create instances:
first_point = MyPoint(1, 1)
second_point = MyPoint(2, 2)
I can store the instances in a list:
my_list = []
my_list.append(first_point)
my_list.append(second_point)
The problem with a list is that its indexes do not correspond to the x, y properties.
Dictionary/DataFrame approach:
Plane = {"x": [first_point.x, second_point.x], "y": [first_point.y, second_point.y], "some_reference/id_to_point_instance": ???}
Plane_pd = pd.DataFrame(Plane)
I've read posts saying that using the "id" of an instance as the third column value in a DataFrame could cause problems with the garbage collector.
Answered by Stephen Rauch
A pandas.DataFrame will gladly store Python objects.
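The question also mentions Series and ndarray. As far as I know the same holds there: with dtype object, both containers hold references to arbitrary Python instances. A minimal sketch, assuming a stand-in Point class and the names p1, p2, series and array, none of which come from the original post:

```python
import numpy as np
import pandas as pd


class Point:  # hypothetical stand-in for the question's MyPoint
    def __init__(self, x, y):
        self.x = x
        self.y = y


p1, p2 = Point(1, 1), Point(2, 2)

# pandas infers dtype=object for arbitrary instances; for NumPy we ask for it
# explicitly. Either way the container holds references, not copies.
series = pd.Series([p1, p2])
array = np.array([p1, p2], dtype=object)

print(series.dtype, array.dtype)
print(series.iloc[0] is p1, array[1] is p2)
```

Object columns give up the vectorised speed of numeric dtypes, so this is mostly useful as a labelled container rather than for arithmetic.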
Some test code to demonstrate...
Test Code:
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
print(my_list)

# one row per point: its coordinates plus the instance itself in column 'O'
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))
print(plane_pd.dtypes)
print(plane_pd)
Results:
[<__main__.MyPoint object at 0x033D2AF0>, <__main__.MyPoint object at 0x033D2B10>]
X     int64
Y     int64
O    object
dtype: object
   X  Y                                        O
0  1  1  <__main__.MyPoint object at 0x033D2AF0>
1  2  2  <__main__.MyPoint object at 0x033D2B10>
Notes:
Note that the two objects in the list are the same two objects in the dataframe. Also note that the dtype for the O column is object.
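To get back the point at given coordinates, which is what the question's Plane was meant to do, one option is to index the object column by the coordinate columns. This is only a sketch of one possible approach, not part of the original answer; the names plane and found are illustrative, and it assumes each (x, y) pair occurs once.

```python
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                        columns=list('XYO'))

# Use the coordinate columns as a two-level index over the object column,
# so a point can be fetched by its (x, y) pair.
plane = plane_pd.set_index(['X', 'Y'])['O']
found = plane.loc[(2, 2)]

# The lookup hands back the very same instance that sits in the list.
print(found is my_list[1])
```

Because the column stores the instance itself rather than its id, the DataFrame keeps the object alive like any other reference, so there is nothing for the garbage collector to reclaim behind its back.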