Easily save/load data in Python

Question

Asked by nos

What is the easiest way to save and load data in Python, preferably in a human-readable output format?

The data I am saving/loading consists of two vectors of floats. Ideally, these vectors would be named in the file (e.g. X and Y).

My current save() and load() functions use file.readline(), file.write() and string-to-float conversion. There must be something better.

Answer 1

Accepted answer by Sven Marnach

There are several options; I don't know exactly what you would prefer. If the two vectors have the same length, you could use numpy.savetxt() to save your vectors, say x and y, as columns:

 import numpy

 # saving:
 with open("data", "w") as f:
     f.write("# x y\n")        # column names
     numpy.savetxt(f, numpy.array([x, y]).T)
 # loading (lines starting with "#" are skipped as comments):
 x, y = numpy.loadtxt("data", unpack=True)

If you are dealing with larger vectors of floats, you should probably use NumPy anyway.

Answer 2

Answered by Mark Byers

A simple serialization format that is easy for both humans and computers to read is JSON.

You can use the json Python module.
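
For the two named vectors in the question, a minimal sketch might look like the following. The file name vectors.json and the helpers save_vectors and load_vectors are illustrative assumptions, not part of the original answer.

```python
import json
import os
import tempfile

def save_vectors(path, x, y):
    """Write two float vectors to a JSON file under the names "x" and "y"."""
    with open(path, "w") as f:
        json.dump({"x": list(x), "y": list(y)}, f, indent=2)

def load_vectors(path):
    """Read the two vectors back as plain Python lists."""
    with open(path) as f:
        data = json.load(f)
    return data["x"], data["y"]

# hypothetical location, chosen only so the sketch can run anywhere
path = os.path.join(tempfile.gettempdir(), "vectors.json")
save_vectors(path, [1.0, 1.5, 2.0], [1.0, 2.25, 4.0])
x, y = load_vectors(path)
print(x, y)
```

The resulting file is plain text with the keys "x" and "y" spelled out, so it can be inspected or edited by hand.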

Answer 3

Answered by Jamie Rumbelow

The simplest way to get human-readable output is to use a serialisation format such as JSON. Python includes a json library you can use to serialise data to and from a string. As with pickle, you can use it with an IO object to write the data to a file.

import json

data = { "x": 12153535.232321, "y": 35234531.232322 }

# the with block closes the file once the data is written
with open('/usr/data/application/json-dump.json', 'w') as file:
    json.dump(data, file)

If you want to get a simple string back instead of dumping it to a file, you can use json.dumps() instead:

import json
print(json.dumps({ "x": 12153535.232321, "y": 35234531.232322 }))

Reading back from a file is just as easy:

import json

with open('/usr/data/application/json-dump.json', 'r') as file:
    print(json.load(file))

The json library is full-featured, so I'd recommend checking out the documentation to see what sorts of things you can do with it.
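
Since the goal is human-readable output, two documented json.dumps() options are worth a look: indent and sort_keys. The snippet below is only a sketch with made-up sample values.

```python
import json

# sample data, invented for illustration
data = {"y": [1.0, 2.25], "x": [1.0, 1.5]}

# indent spreads the output over several lines;
# sort_keys writes the keys in alphabetical order
text = json.dumps(data, indent=2, sort_keys=True)
print(text)

# the pretty-printed string still parses back to the same data
restored = json.loads(text)
```

The same two keyword arguments are accepted by json.dump() when writing straight to a file.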

Answer 4

Answered by Lennart Regebro

If it should be human-readable, I'd also go with JSON, unless you need to exchange it with enterprise-type people; they like XML better. :-)
If it should be human-editable and isn't too complex, I'd probably go with some sort of INI-like format, for example via configparser.
If it is complex and doesn't need to be exchanged, I'd just pickle the data, unless it's very complex, in which case I'd use ZODB.
If it's a LOT of data and needs to be exchanged, I'd use SQL.

That pretty much covers it, I think.
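
To make the INI-like option concrete, here is a rough sketch using configparser. Option values in an INI file are strings, so this sketch encodes each vector with json.dumps(); that encoding, the section name "vectors" and the sample numbers are all assumptions made for illustration.

```python
import configparser
import io
import json

x = [1.0, 1.5, 2.0]
y = [1.0, 2.25, 4.0]

# build an INI document with one section holding both vectors
config = configparser.ConfigParser()
config["vectors"] = {"x": json.dumps(x), "y": json.dumps(y)}

# write to an in-memory buffer; a real file object works the same way
buffer = io.StringIO()
config.write(buffer)
text = buffer.getvalue()
print(text)

# parse the text again and decode the lists
loaded = configparser.ConfigParser()
loaded.read_string(text)
x2 = json.loads(loaded["vectors"]["x"])
y2 = json.loads(loaded["vectors"]["y"])
```

Each vector ends up on its own "name = value" line under a [vectors] header, which is easy to edit by hand.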

Answer 5

Answered by NPE

Since we're talking about a human editing the file, I assume we're talking about relatively little data.

How about the following skeleton implementation? It simply saves the data as key=value pairs and works with lists, tuples and many other things.

    import ast

    def save(fname, **kwargs):
      # write one key=value line per keyword argument
      with open(fname, "wt") as f:
        for k, v in kwargs.items():
          f.write("%s=%s\n" % (k, repr(v)))

    def load(fname):
      ret = {}
      with open(fname, "rt") as f:
        for line in f:
          k, v = line.strip().split("=", 1)
          # literal_eval parses Python literals only, unlike eval()
          ret[k] = ast.literal_eval(v)
      return ret

    x = [1, 2, 3]
    y = [2.0, 1e15, -10.3]
    save("data.txt", x=x, y=y)
    d = load("data.txt")
    print(d["x"])
    print(d["y"])

Answer 6

Answered by Dalker

As I commented on the accepted answer, using numpy this can be done with a simple one-liner:

Assuming you have numpy imported as np (which is common practice),

np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x   y")

will save the data in the (optional) format and

x, y = np.loadtxt('xy.txt', unpack=True)

will load it.

The file xy.txt will then look like:

# x   y
1.000 1.000
1.500 2.250
2.000 4.000
2.500 6.250
3.000 9.000

Note that the format string fmt=... is optional, but if the goal is human-readability it may prove quite useful. If used, it is specified using the usual printf-like codes (in my example: floating-point numbers with 3 decimals).
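
The printf-like codes can be tried out with Python's own % operator before handing them to savetxt. The codes and sample values below are illustrative choices, not taken from the answer, apart from "%.3f".

```python
# sample values, invented for illustration
values = [1.0, 2.25, 1e15, -10.3]

# "%.3f": fixed-point with 3 decimals (the code used in the answer)
# "%10.2f": 2 decimals, right-aligned in a field 10 characters wide
# "%.4e": scientific notation with 4 decimals
# "%g": shortest general representation
for fmt in ("%.3f", "%10.2f", "%.4e", "%g"):
    print(fmt, [fmt % v for v in values])
```

For example, "%.3f" % 2.25 gives "2.250", matching the second column of the file shown above.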

Answer 7

Answered by Koke Cacao

Here is an example of an encoder that you might want to write for a Body class:

# add this to your code
import json

import numpy as np

class BodyEncoder(json.JSONEncoder):
    def default(self, obj):
        # called by json for objects it cannot serialize natively
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if hasattr(obj, '__jsonencode__'):
            return obj.__jsonencode__()
        if isinstance(obj, set):
            return list(obj)
        return obj.__dict__

    # Here you construct your way to load your data for each instance
    # you need to customize this function
    @staticmethod
    def deserialize(data):
        bodies = [Body(d["name"],d["mass"],np.array(d["p"]),np.array(d["v"])) for d in data["bodies"]]
        axis_range = data["axis_range"]
        timescale = data["timescale"]
        return bodies, axis_range, timescale

    # Here you construct your way to dump your data for each instance
    # you need to customize this function
    @staticmethod
    def serialize(data):
        # FILE_NAME must be defined elsewhere in your code
        with open(FILE_NAME, 'w') as file:
            json.dump(data, file, cls=BodyEncoder, indent=4)
        print("Dumping Parameters of the Latest Run")
        print(json.dumps(data, cls=BodyEncoder, indent=4))

Here is an example of the class I want to serialize:

import numpy as np

class Body(object):
    # you do not need to change your class structure
    def __init__(self, name, mass, p, v=(0.0, 0.0, 0.0)):
        # init variables like normal
        self.name = name
        self.mass = mass
        self.p = p
        self.v = v
        self.f = np.array([0.0, 0.0, 0.0])

    def attraction(self, other):
        # not important functions that I wrote...
        pass

Here is how to serialize:

# you need to customize this function
def serialize_everything():
    bodies, axis_range, timescale = generate_data_to_serialize()

    data = {"bodies": bodies, "axis_range": axis_range, "timescale": timescale}
    BodyEncoder.serialize(data)

Here is how to load the data back (despite its name, dump_everything reads the file):

def dump_everything():
    with open(FILE_NAME, "r") as file:
        data = json.load(file)
    return BodyEncoder.deserialize(data)
