Storing Images in PostgreSQL
Notice: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/54500/
Storing Images in PostgreSQL
Asked by akdom
Alright, so I'm working on an application which will use a Linux back-end running PostgreSQL to serve up images to a Windows box with the front end written in C#.NET, though the front-end should hardly matter. My question is:
- What is the best way to deal with storing images in Postgres?
The images are around 4-6 megapixels each, and we're storing upwards of 3000. It might also be good to note: this is not a web application, there will at most be about two front-ends accessing the database at once.
Answered by Peter Krauss
Update for 2012: image sizes and the number of images keep growing in all applications...
We need to distinguish between the "original image" and a "processed image", such as a thumbnail.
As jcoby's answer says, there are two options. Building on those, I recommend:
- Use blob (Binary Large OBject) for the original image store, in your table. See Ivan's answer (no problem with backing up blobs!), the PostgreSQL additional supplied modules, how-tos, etc.
- Use a separate database with DBlink for the original image store, in another (unified/specialized) database. In this case I prefer bytea, but blob is nearly the same. Separating the database is the best way to build a "unified image webservice".
- Use bytea (BYTE Array) for caching thumbnail images. Cache the small images to send them quickly to the web browser (to avoid rendering problems) and to reduce server processing. Also cache essential metadata, like width and height. Database caching is the easiest way, but check your needs and server configuration (e.g. Apache modules): storing thumbnails on the file system may be better, so compare performance. Remember that this is a (unified) web service, so the cache can live in a separate database (with no backups), serving many tables. See also the PostgreSQL binary data types manual, tests with a bytea column, etc.
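The advice to cache width and height next to each thumbnail could be sketched roughly as follows. This is only an illustration, assuming the thumbnails are PNG files; the names `png_dimensions` and `thumbnail_row` and the dictionary layout are hypothetical and do not come from the answer. The dimensions are read from the PNG header, so they can be stored alongside the bytea value without decoding the whole image.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG byte string."""
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", 4-byte width, 4-byte height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def thumbnail_row(name: str, data: bytes) -> dict:
    """Collect the values one might store per cached thumbnail (hypothetical layout)."""
    width, height = png_dimensions(data)
    return {"name": name, "width": width, "height": height, "image": data}
```

The resulting dictionary maps naturally onto one table row: three small metadata columns plus one bytea column for the image bytes.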
NOTE1: today the use of "dual solutions" (database + filesystem) is deprecated (!). There are many advantages to using "only database" instead of dual. PostgreSQL has comparable performance and good tools for export/import/input/output.
NOTE2: remember that PostgreSQL has only bytea and does not have Oracle's BLOB by default: "The SQL standard defines (...) BLOB. The input format is different from bytea, but the provided functions and operators are mostly the same" (Manual).
EDIT 2014: I have not changed the original text above today (my answer was from Apr 22 '12, now with 14 votes). I am opening the answer for your changes (see "Wiki mode", you can edit!), for proofreading and for updates.
The question is stable (@Ivan's '08 answer has 19 votes), so please help to improve this text.
Answered by Ivan Krechetov
Re jcoby's answer:
bytea being a "normal" column also means the value is read completely into memory when you fetch it. Blobs, in contrast, can be streamed to stdout. That helps reduce the server memory footprint, especially when you store 4-6 MPix images.
No problem with backing up blobs. pg_dump provides the "-b" option to include large objects in the backup.
So, as you may guess, I prefer using pg_lo_*.
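The memory argument can be illustrated with a small chunked copy loop. This is a sketch under assumptions: it works on any file-like object with `read(size)`, and `io.BytesIO` stands in for a large object handle so the example runs without a server. Whether your driver exposes large objects this way is something to check; psycopg2's `connection.lobject()` is, to my knowledge, one handle with a compatible `read`. The function name `stream_in_chunks` is made up for illustration.

```python
import io
from typing import BinaryIO


def stream_in_chunks(source: BinaryIO, sink: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy source to sink one chunk at a time and return the number of bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes signals end of data
            break
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # io.BytesIO is a stand-in for a large object handle (assumption).
    fake_image = io.BytesIO(b"\x00" * 200_000)
    out = io.BytesIO()
    print(stream_in_chunks(fake_image, out, chunk_size=8192))
```

At any moment only one chunk is held in memory, whereas fetching a bytea column hands the whole value to the client at once.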
Re Kris Erickson's answer:
I'd say the opposite :). When images are not the only data you store, don't store them on the file system unless you absolutely have to. It's a real benefit to always be sure about your data consistency, and to have the data "in one piece" (the DB). BTW, PostgreSQL is great at preserving consistency.
However, true, reality is often too performance-demanding ;-), and it pushes you to serve the binary files from the file system. But even then I tend to use the DB as the "master" storage for binaries, with all the other relations consistently linked, while providing some file system-based caching mechanism for performance optimization.
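One way to picture "DB as master, file system as cache" is the sketch below. It is not taken from the answer: `cached_image_path` and the `load_from_db` callback are hypothetical names, and the callback stands in for whatever query actually fetches the bytes from PostgreSQL.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_image_path(image_id: int, cache_dir: Path,
                      load_from_db: Callable[[int], bytes]) -> Path:
    """Return a file path for the image, filling the cache from the database on a miss."""
    path = cache_dir / f"{image_id}.bin"
    if not path.exists():
        data = load_from_db(image_id)  # the database stays the master copy
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)  # rename into place so readers do not see a half-written file
    return path


if __name__ == "__main__":
    fake_db = {1: b"first image bytes"}  # stand-in for the real table (assumption)
    with tempfile.TemporaryDirectory() as d:
        print(cached_image_path(1, Path(d), fake_db.__getitem__).read_bytes())
```

Because the cache can always be rebuilt from the database, the cache directory can be wiped at any time and does not need to be backed up.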
Answered by jcoby
In the database, there are two options:
- bytea. Stores the data in a column, exported as part of a backup. Uses standard database functions to save and retrieve. Recommended for your needs.
- blobs. Stores the data externally, not normally exported as part of a backup. Requires special database functions to save and retrieve.
I've used bytea columns with great success in the past, storing 10+ GB of images across thousands of rows. PG's TOAST functionality pretty much negates any advantage that blobs have. In either case you'll need to include metadata columns for filename, content type, dimensions, etc.
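A table with the image in one column and the metadata beside it might look like the sketch below. To keep it runnable without a server, it uses SQLite from the Python standard library as a stand-in; that is a substitution, not what the answer used, and in PostgreSQL the `data` column would be declared as bytea. The table name, column names and helper functions are all hypothetical.

```python
import sqlite3

# SQLite is only a stand-in here; in PostgreSQL "data" would be a bytea column.
SCHEMA = """
CREATE TABLE images (
    id           INTEGER PRIMARY KEY,
    filename     TEXT NOT NULL,
    content_type TEXT NOT NULL,
    width        INTEGER NOT NULL,
    height       INTEGER NOT NULL,
    data         BLOB NOT NULL
)
"""


def save_image(conn, filename, content_type, width, height, data):
    """Insert one image with its metadata and return the new row id."""
    cur = conn.execute(
        "INSERT INTO images (filename, content_type, width, height, data) "
        "VALUES (?, ?, ?, ?, ?)",
        (filename, content_type, width, height, data),
    )
    conn.commit()
    return cur.lastrowid


def load_image(conn, image_id):
    """Return (filename, content_type, width, height, data), or None if missing."""
    return conn.execute(
        "SELECT filename, content_type, width, height, data FROM images WHERE id = ?",
        (image_id,),
    ).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    new_id = save_image(conn, "cat.jpg", "image/jpeg", 2816, 2112, b"\xff\xd8\xff")
    print(load_image(conn, new_id)[:4])
```

Saving and retrieving use ordinary parameterized statements, which is the "standard database functions" point made in the list above.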
Answered by Kenyakorn Ketsombut
Quick update to mid 2015:
You can use the Postgres Foreign Data interface to store the files in a more suitable database. For example, put the files in GridFS, which is part of MongoDB. Then use https://github.com/EnterpriseDB/mongo_fdw to access it in Postgres.
The advantage is that you can access/read/write/back up the data in either Postgres or MongoDB, depending on which gives you more flexibility.
There are also foreign data wrappers for file systems: https://wiki.postgresql.org/wiki/Foreign_data_wrappers#File_Wrappers
As an example you can use this one: https://multicorn.readthedocs.org/en/latest/foreign-data-wrappers/fsfdw.html (see here for brief usage example)
That gives you the advantage of consistency (all linked files are definitely there) and all the other ACID properties, while the files are still on the actual file system, which means you can use any file system you want and the webserver can serve them directly (OS caching applies too).
Answered by Kris Erickson
Update from 10 years later
In 2008 the hard drives you would run a database on had very different characteristics and a much higher cost than the disks you would store files on. These days there are much better solutions for storing files that didn't exist 10 years ago, so I would revoke this advice and advise readers to look at some of the other answers in this thread.
Original
Don't store images in the database unless you absolutely have to. I understand that this is not a web application, but if there isn't a shared file location that you can point to, saving the location of the file in the database, such as
//linuxserver/images/imagexxx.jpg
then perhaps you can quickly set up a webserver and store the web URLs in the database (as well as the local path). Databases can handle LOBs, and 3000 images (4-6 megapixels, assuming 500K an image) at 1.5 gigs isn't a lot of space, but file systems are much better designed for storing large files than a database is.
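The 1.5 gigs figure follows from the numbers in the question and the assumed 500K per image. As a quick check (the function name is just for illustration, and 500K is taken as 500,000 bytes):

```python
def total_storage_bytes(image_count: int, bytes_per_image: int) -> int:
    """Estimate total storage as image count times average image size."""
    return image_count * bytes_per_image


if __name__ == "__main__":
    total = total_storage_bytes(3000, 500_000)  # 3000 images at roughly 500K each
    print(f"{total / 1e9:.1f} GB")  # prints "1.5 GB"
```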
Answered by Mike Reedell
Answered by ccleve
If your images are small, consider storing them as base64 in a plain text field.
The reason is that while base64 has an overhead of 33%, with compression that mostly goes away. (See What is the space overhead of Base64 encoding?) Your database will be bigger, but the packets your webserver sends to the client won't be. In HTML, you can inline base64 in an <img src=""> tag, which can simplify your app because you won't have to serve up the images as binary in a separate browser fetch. Handling images as text also simplifies things when you have to send/receive JSON, which doesn't handle binary very well.
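The overhead claim is easy to check for yourself. The sketch below is an illustration under assumptions: random bytes stand in for already-compressed image data, zlib stands in for whatever compression your webserver applies, and the function names are made up. It compares the raw size, the base64 size and the compressed base64 size, and also builds the kind of data URI that goes into an `<img src="">` attribute.

```python
import base64
import random
import zlib


def base64_sizes(raw: bytes) -> dict:
    """Return byte counts for raw data, its base64 text, and the compressed base64 text."""
    encoded = base64.b64encode(raw)
    return {
        "raw": len(raw),
        "base64": len(encoded),
        "base64_compressed": len(zlib.compress(encoded, 9)),
    }


def data_uri(raw: bytes, content_type: str = "image/png") -> str:
    """Build a data: URI that can be placed in an <img src="..."> attribute."""
    return f"data:{content_type};base64,{base64.b64encode(raw).decode('ascii')}"


if __name__ == "__main__":
    rng = random.Random(0)
    # Random bytes stand in for already-compressed image data (assumption).
    fake_image = bytes(rng.getrandbits(8) for _ in range(30_000))
    print(base64_sizes(fake_image))
    print(data_uri(b"\x89PNG"))
```

The base64 text is exactly 4/3 the size of the raw bytes, and the compressed figure should land close to the raw size again, which is the effect described above.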
Yes, I understand you could store the binary in the database and convert it to/from text on the way in and out of the database, but sometimes ORMs make that a hassle. It can be simpler just to treat it as straight text just like all your other fields.
This is definitely the right way to handle thumbnails.
(OP's images are not small, so this is not really an answer to his question.)