原文地址: http://stackoverflow.com/questions/815626/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
To Do or Not to Do: Store Images in a Database
Asked by nickytonline
In the context of a web application, my old boss always said to put a reference to an image in the database, not the image itself. I tend to agree that storing a URL rather than the image itself in the DB is a good idea, but where I work now, we store a lot of images in the database.
The only reason I can think of is that perhaps it's more secure? You don't want someone having a direct link to a URL? But if that is the case, you can always have the web site/server handle images, like handlers in ASP.NET, so that a user needs to authenticate to view the image. I'm also thinking performance would be hurt by pulling the images out of the database. Are there any other reasons why it might be a good or not-so-good idea to store images in a database?
Exact Duplicate: User Images: Database or filesystem storage?
Exact Duplicate: Storing images in database: Yea or nay?
Exact Duplicate: Should I store my images in the database or folders?
Exact Duplicate: Would you store binary data in database or folders?
Exact Duplicate: Store pictures as files or in the database for a web app?
Exact Duplicate: Storing a small number of images: blob or fs?
Exact Duplicate: store image in filesystem or database?
Accepted answer by Emil H
It can make sense if you only occasionally need to retrieve an image and it has to be available on several different web servers. But I think that's pretty much it.
- If it doesn't have to be available on several servers, it's always better to put them in the file system.
- If it has to be available on several servers and there's actually some kind of load in the system, you'll need some kind of distributed storage.
We're talking about an edge case here, where you can avoid adding an additional level of complexity to your system by leveraging the database.
Other than that, don't do it.
Answer by Will Hartung
Pros of putting images in a Database.
1. Transactions. When you save the blob, you can commit it just like any other piece of DB data. That means you can commit the blob along with any of the associated metadata and be assured that the two are in sync. If you run out of disk space? No commit. File didn't upload completely? No commit. Silly application error? No commit. If keeping the images and their associated metadata consistent with each other is important to your application, then the transactions that a DB can provide can be a boon.
2. One system to manage. Need to back up the metadata and blobs? Back up the database. Need to replicate them? Replicate the database. Need to recover from a partial system failure? Reload the DB and roll the logs forward. All of the advantages that DBs bring to data in general (volume mapping, storage control, backups, replication, recovery, etc.) apply to your blobs. More consistency, easier management.
3. Security. Databases have very fine-grained security features that can be leveraged: schemas, user roles, even things like "read only views" to give secure access to a subset of data. All of those features work with tables holding blobs as well.
4. Centralized management. Related to #2, but basically the DBAs (as if they don't have enough power) get to manage one thing: the database. Modern databases (especially the larger ones) work very well with large installations across several machines. A single source of management simplifies procedures and simplifies knowledge transfer.
5. Most modern databases handle blobs just fine. With first-class support for blobs in your data tier, you can easily stream blobs from the DB to the client. While there are operations that will "suck in" the entire blob all at once, if you don't need that facility, then don't use it. Study the SQL interface for your DB and leverage its features. There is no reason to treat blobs like "big strings" that are handled monolithically, which turns them into big, memory-gobbling, cache-smashing bombs.
6. Just like you can set up dedicated file servers for images, you can set up dedicated blob servers in your database. Give them dedicated disk volumes, dedicated schemas, dedicated caches, etc. Not all of the data in your DB is the same or behaves the same, so there is no reason to configure it all the same way. Good databases give you that fine level of control.
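The transactional point in the list above can be sketched roughly as follows, using Python's built-in sqlite3 module as a stand-in for whatever database you actually run. The table names (`image_meta`, `image_blob`) and the helper names (`open_store`, `save_image`) are hypothetical and only serve the illustration; a larger database would offer its own blob types and streaming APIs.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create a tiny, hypothetical schema: one metadata row plus one blob per image."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_meta ("
        " id INTEGER PRIMARY KEY, filename TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_blob ("
        " image_id INTEGER PRIMARY KEY REFERENCES image_meta(id), data BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def save_image(conn, filename, data, expected_size):
    """Store metadata and blob together. Returns the new id, or None if nothing was committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            cur = conn.execute(
                "INSERT INTO image_meta (filename, size) VALUES (?, ?)",
                (filename, expected_size),
            )
            image_id = cur.lastrowid
            conn.execute(
                "INSERT INTO image_blob (image_id, data) VALUES (?, ?)",
                (image_id, data),
            )
            if len(data) != expected_size:
                # The upload was cut short: abort so no orphaned metadata remains.
                raise ValueError("incomplete upload")
    except (ValueError, sqlite3.Error):
        return None
    return image_id
```

A truncated upload leaves neither a metadata row nor a blob row behind, which is the "no commit" behavior described in item 1.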
The primary nit regarding serving up a blob from a DB is ensuring that your HTTP layer actually leverages all of the HTTP protocol to perform the service.
Many naive implementations simply grab the blob and dump it wholesale down the socket. But HTTP has several important features well suited to streaming images and the like: notably caching headers, ETags, and range requests that allow clients to request "pieces" of the blob, plus chunked transfer encoding for streaming a response.
Ensure that your HTTP service is properly honoring all of those requests, and your DB can be a very good Web citizen. By caching the files in a filesystem for serving by the HTTP server, you gain some of those advantages "for free" (since a good server will do that anyway for "static" resources), but make sure if you do that, that you honor things like modification dates etc. for images.
For example, someone requests spaceshuttle.jpg, an image created on Jan 1, 2009. That ends up cached on the file system on the request date, say, Feb 1, 2009. Later, the image is purged from the cache (FIFO policy, or whatever), and someone requests it again on Mar 1, 2009. Well, now the cached file has a Mar 1, 2009 "create date", even though its real create date was Jan 1 the entire time. So you can see that, especially if your cache turns over a lot, clients that use If-Modified-Since headers may be getting more data than they actually need, since the server THINKS the resource has changed when in fact it has not.
If you keep the cache creation date in sync with the actual creation date, this can be less of a problem.
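One way to keep the two dates in sync, sketched here in Python under the assumption that the database row carries the original creation time, is to stamp each cache file with that time when it is written. The helper name `write_cache_file` and its parameters are hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone


def write_cache_file(cache_dir, name, data, created):
    """Write `data` into the cache and stamp it with the original creation time."""
    path = os.path.join(cache_dir, name)
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    ts = created.timestamp()
    os.utime(tmp_path, (ts, ts))  # (atime, mtime): reuse the DB's date instead of "now"
    os.replace(tmp_path, path)    # rename into place so readers never see a partial file
    return path
```

A static file server that derives Last-Modified from the file's mtime would then keep reporting Jan 1, 2009 for spaceshuttle.jpg, no matter how often the cache entry is purged and recreated.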
But the point is that you need to think through the entire problem in order to be a "good web citizen" and potentially save you and your clients some bandwidth.
I've just gone through all this for a Java project serving videos from a DB, and it all works a treat.
Answer by Adam Robinson
I understand that the majority of database professionals will cross their fingers and hiss at you if you store images in the database (or even mention it). Yes, there are definitely performance and storage implications when using the database as the repository for large blocks of binary data of any kind (images just tend to be the most common bits of data that can't be normalized). However, there are most certainly circumstances where database storage of images is not only allowable but advisable.
For instance, in my old job we had an application where users would attach images to several different points of a report that they were writing, and those images had to be printed out when it was done. These reports were moved about via SQL Server replication, and it would have introduced a HUGE headache to try to manage these images and file paths across multiple systems and servers with any sort of reliability. Storing them in the database gave us all of that "for free," and the reporting tool didn't have to go out to the file system to retrieve the image.
Answer by James Orr
My general advice would be not to limit yourself to one approach or the other - go with the technique that fits the situation. File systems are very good at storing files, and databases are very good at providing bite-sized chunks of data on request. On the other hand, one of my company's products has a requirement to store the entire state of the application in the database, which means that file attachments go in there as well. With our DB server (SQL Server 2005) I've yet to run into observable performance problems even with large customers and databases.
Microsoft's SQL Server 2008 gives you the best of both worlds with the FILESTREAM feature; it might be worth checking out. http://technet.microsoft.com/en-us/library/bb933993.aspx
Answer by Thevs
One of the advantages of storing images in a database is that they are portable across systems and independent of the filesystem layout.
Answer by cliff.meyers
The simplest / most performant / most scalable solution is to store your images on the file system. If security is a concern, put them in a location that is not accessible by the web server and write a script that handles security and serves up the files.
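A script of that kind could look roughly like the sketch below (Python, standard library only). The function `resolve_private_image` and the idea of passing in a set of allowed user names are assumptions made for illustration; a real application would plug in its own authentication and then stream the returned path to the client.

```python
import os


def resolve_private_image(image_root, requested_name, user, allowed_users):
    """Return the absolute path of the image if `user` may view it, else None."""
    if user not in allowed_users:
        return None  # not authorized: the caller should answer 401/403

    root = os.path.realpath(image_root)
    candidate = os.path.realpath(os.path.join(root, requested_name))

    # Reject path traversal such as "../secret.jpg" or an absolute path.
    if os.path.commonpath([root, candidate]) != root:
        return None
    if not os.path.isfile(candidate):
        return None  # the caller should answer 404
    return candidate
```

Because `image_root` lives outside the directory the web server exposes, the only way to reach a file is through this check.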
Assuming your web/app server and DB server are different machines, you will take a few hits by putting images in the DB: (1) Network latency between the two machines, (2) DB connection overhead, (3) consuming an additional DB connection for each image served. I would be more concerned about the last point: if your site serves a lot of images, your web servers are going to be consuming many DB connections and could exhaust your connection pools.
Answer by chaos
If your application runs on multiple servers, I'd store the reference copy of your images in the database and then cache them on demand on the filesystems. Doing so is just way less of an error prone pain in the ass than trying to sync filesystems laterally.
If your application is on a single server, then yeah, stick to the filesystem and have the database maintain a path to the data.
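The multi-server variant (reference copy in the database, cached on demand on each server's filesystem) might be sketched like this in Python. The `images` table with `id` and `data` columns and the function name `cached_image_path` are hypothetical, and sqlite3 merely stands in for the shared database.

```python
import os
import sqlite3


def cached_image_path(conn, cache_dir, image_id):
    """Return a local file path for the image, filling the cache from the DB on a miss."""
    path = os.path.join(cache_dir, f"{int(image_id)}.bin")
    if os.path.exists(path):
        return path  # cache hit: no database round trip

    row = conn.execute(
        "SELECT data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return None  # unknown image

    # Write to a temporary name first, then rename, so that concurrent
    # requests never observe a half-written cache file.
    tmp_path = f"{path}.{os.getpid()}.tmp"
    with open(tmp_path, "wb") as f:
        f.write(row[0])
    os.replace(tmp_path, path)
    return path
```

Each web server can run this against its own local `cache_dir`, so nothing has to be synced between filesystems; a lost cache is simply refilled from the database.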
Answer by rhettg
Most SQL databases are of course not designed with serving up images in mind, but there is a certain amount of convenience associated with having them in the database.
For example, if you already have a database running with replication configured, you instantly have an HA image store rather than having to set up some rsync- or NFS-based filesystem replication. Also, having a bunch of web processes (or designing some new service) to write files to disk increases your complexity a bit. Really it's just more moving parts.
At the very least, I would recommend keeping 'meta' data about the image (such as any permissions, who owns it, etc) and the actual data separated into different tables so it will be fairly easy to switch to a different data store down the line. That coupled with some sort of CDN or caching should give you pretty good performance up to a point, so I suppose it depends on how scalable this application needs to be and how you balance that with ease of implementation.
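To show why that separation makes a later switch easy, here is a small Python sketch. The class names `SqliteBlobStore` and `FileBlobStore`, the `image_data` table, and the `migrate` helper are all hypothetical. The assumption is that the metadata table only stores an opaque `blob_key`, and that keys are safe to use as file names (for example hex strings).

```python
import os
import sqlite3


class SqliteBlobStore:
    """Keeps image bytes in their own table, apart from the metadata."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS image_data ("
            " blob_key TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )

    def put(self, blob_key, data):
        with self.conn:  # one transaction per write
            self.conn.execute(
                "INSERT OR REPLACE INTO image_data (blob_key, data) VALUES (?, ?)",
                (blob_key, data),
            )

    def get(self, blob_key):
        row = self.conn.execute(
            "SELECT data FROM image_data WHERE blob_key = ?", (blob_key,)
        ).fetchone()
        return None if row is None else bytes(row[0])


class FileBlobStore:
    """Alternative store with the same interface that keeps the bytes on disk."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, blob_key, data):
        with open(os.path.join(self.root, blob_key), "wb") as f:
            f.write(data)

    def get(self, blob_key):
        try:
            with open(os.path.join(self.root, blob_key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None


def migrate(blob_keys, source, target):
    """Copy blobs between stores; the metadata rows stay untouched."""
    for blob_key in blob_keys:
        target.put(blob_key, source.get(blob_key))
```

Since permissions, ownership, and other metadata never reference where the bytes live, moving from the database to a filesystem (or some other store) only means swapping the store object.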
Answer by Fortyrunner
You don't have to store the URL (if you feel this is unsafe). You can just store a unique id that references the image elsewhere.
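As a small illustration of the unique-id idea, the following Python sketch generates an opaque id and maps it to a storage location only on the server side. The function names and the two-level directory sharding are assumptions for the example, not something the answer prescribes.

```python
import os
import uuid


def new_image_id():
    """Generate an opaque id to store in the database instead of a URL or path."""
    return uuid.uuid4().hex  # 32 lowercase hex characters


def path_for_image(storage_root, image_id):
    """Map an id to its on-disk location; the mapping never leaves the server."""
    if len(image_id) != 32 or any(c not in "0123456789abcdef" for c in image_id):
        raise ValueError("not a valid image id")
    # Shard by the first characters so that no single directory grows too large.
    return os.path.join(storage_root, image_id[:2], image_id[2:4], image_id)
```

The database row then holds only the id, and the storage layout can change without touching any stored data.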
Database storage tends to be more expensive to buy and to maintain than a file system, hence I wouldn't store LOTS of images in a database.
Answer by cherouvim
- database for data
- filesystem for files

