Notice: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/348363/

Time: 2020-09-08 07:06:54  Source: igfitidea

What is the best place for storing uploaded images, SQL database or disk file system?

database filesystems image-uploading binaryfiles

Asked by Tobias

I'm writing an application that allows users to upload images onto the server. I expect about 20 images per day, all JPEG and probably not edited or resized. (How to resize the images on the server side before storing them is another question; maybe someone can drop a .NET resource for that in the comments.) I now wonder what the best place for storing uploaded images is.

  a) Store the images as files in the file system and create a record in a table with the exact path to each image.

  b) Or, store the image itself in a table using an "image" or "binary data" data type of the database server.

I see advantages and disadvantages in both. I like a) because I can easily relocate the files and only have to change the table entry. On the other hand, I don't like storing business data on the web server, and for security reasons I don't really want to connect the web server to any other data source that holds business data. I like b) because all the information is in one place and easily accessible by a query. On the other hand, the database will get very big very soon, and outsourcing that data could be more difficult.

Accepted answer by Eran Galperin

I generally store files on the file system, since that's what it's there for, though there are exceptions. For files, the file system is (usually) the most flexible and performant solution.

There are a few problems with storing files in a database. Files are generally much larger than your average row, so result sets containing many large files will consume a lot of memory. Also, if you use a storage engine that employs table locks for writes (ISAM, for example), your files table might be locked often, depending on the size and rate of the files you are storing there.

Regarding security: I usually store the files in a directory outside of the document root (not accessible through an HTTP request) and serve them through a script that checks for proper authorization first.
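
To make the "serve through a script" idea concrete, here is a minimal Python sketch of the lookup step only. The directory `/var/app-data/uploads`, the function name `resolve_upload`, and the boolean `user_is_authorized` are all assumptions for illustration; the real authorization check and the actual HTTP response depend on your framework.

```python
from pathlib import Path

# Hypothetical upload directory, assumed to live outside the document root.
UPLOAD_ROOT = Path("/var/app-data/uploads")


def resolve_upload(requested_name: str, user_is_authorized: bool,
                   root: Path = UPLOAD_ROOT) -> Path:
    """Return the on-disk path for a requested upload, or raise.

    The caller is expected to stream the returned file to the client.
    """
    # Check authorization before touching the file system at all.
    if not user_is_authorized:
        raise PermissionError("not authorized")

    root_resolved = root.resolve()
    candidate = (root_resolved / requested_name).resolve()

    # Reject names such as "../secret" that resolve outside the upload root.
    if root_resolved not in candidate.parents:
        raise PermissionError("path escapes the upload directory")

    if not candidate.is_file():
        raise FileNotFoundError(requested_name)
    return candidate
```

The point of resolving the path before comparing it is that the check then also covers symlinks and `..` segments, not just obviously bad names.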

Answer by David Avsajanishvili

The only benefit of option B is having all the data in one system, yet it's a false benefit! You could argue that your code is also a form of data and can therefore also be stored in the database. How would you like that?

Unless you have some unique case:

  • Business logic belongs in code.
  • Structured data belongs in a database (relational or non-relational).
  • Bulk data belongs in storage (filesystem or other).

It is not necessary to use a filesystem to keep files. Instead, you may use cloud storage (such as Amazon S3) or infrastructure-as-a-service built on top of it (such as Uploadcare):

https://uploadcare.com/upload-api-cloud-storage-and-cdn/

But storing files in the database is a bad idea.

Answer by Martin Beckett

Flickr uses the filesystem; they discuss the reasons here.

Answer by Brian MacKay

We have had clients insist on option B (database storage) a few times on a few different backends, and we always ended up going back to option A (filesystem storage) eventually.

Large BLOBs like that just have not been handled well enough even by SQL Server 2005, which is the latest one we tried it on.

Specifically, we saw serious bloat and I think maybe locking problems.

One other note: if you are using NTFS-based storage (Windows Server, etc.), you might consider finding a way to avoid putting thousands and thousands of files in one directory. I am not sure why, but sometimes the file system does not cope well with that situation. If anyone knows more about this, I would love to hear it.

But I always try to use subdirectories to break things up a bit. Creation date often works well for this:

Images/2008/12/17/.jpg

...This provides a decent level of separation, and also helps a bit during debugging. Explorer and FTP clients alike can choke a bit when there are truly huge directories.
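
As a rough sketch of the creation-date layout described above, the following Python helper builds such a path. The function name `dated_path` and the default base folder are assumptions; only the year/month/day structure comes from the answer.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_path(filename: str, created: date, base: str = "Images") -> PurePosixPath:
    """Build base/YYYY/MM/DD/filename from a creation date."""
    return (
        PurePosixPath(base)
        / f"{created:%Y}"   # four-digit year
        / f"{created:%m}"   # zero-padded month
        / f"{created:%d}"   # zero-padded day
        / filename
    )


# Usage: a hypothetical file created on 17 December 2008.
print(dated_path("photo.jpg", date(2008, 12, 17)))
```

Zero-padding the month and day keeps directory listings sorted chronologically.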

EDIT: Just a quick note for 2017: in more recent versions of SQL Server, there are new options for handling lots of BLOBs that are supposed to avoid the drawbacks I discussed.

Answer by Sulung Nugroho

I know this is an old post, but many visitors to this page, especially newbies, find nothing here that answers the question.

How to upload and store images or files on our website:

For a static website there may be no problem, since the file storage on some shared hosting is still adequate. The problem comes when a dynamic website gets bigger. Growth in the database can be handled, but growth in files such as images becomes a problem. There are two types of images on a website:

  1. Images that come from the administrator, for example for a dynamic blog. Usually these images have been optimized before upload.

  2. Images from users, where users are allowed to upload images such as avatars, or can create blog content and insert images from a text editor. The size of these images is difficult to predict. Users can upload big images for small content by resizing the displayed size without resizing the image itself.

Ignoring item no. 1 above, item no. 2 can be handled temporarily with the following tips if our website has no image optimizer functionality:

  1. Do not allow users to upload directly from the text editor; redirect them to an image gallery instead. On this page users must upload files in advance before they can embed them in the content. This method is called a File Manager.

  2. Use a crop-image function for user uploads. This limits the image size even if users upload a very big file. The final image is the result of the crop. We can define the size on the server side and accept only, for example, 500 KB or lower.
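
The server-side size limit from tip 2 could look roughly like this in Python. This is only a sketch: `read_limited` and `MAX_UPLOAD_BYTES` are made-up names, and it assumes the upload arrives as a blocking file-like object, which is not true of every framework.

```python
MAX_UPLOAD_BYTES = 500 * 1024  # 500 KB, the example limit from the tip above


def read_limited(stream, limit: int = MAX_UPLOAD_BYTES) -> bytes:
    """Read an upload from a file-like object, refusing oversized bodies."""
    # Ask for one byte more than allowed: if we get it, the body is too big,
    # and we never hold much more than `limit` bytes in memory.
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise ValueError(f"upload exceeds {limit} bytes")
    return data
```

Checking on the server matters because any limit enforced only in the browser can be bypassed.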

Now, that is only temporary. For a final solution, the question comes back:

  • How do we handle storage for a large number of images?
  • Resize or change the extension.
  • How do big or medium websites and e-commerce sites handle file storage for their images?

What we can do then:

  1. Migrate from shared hosting to a VPS. Not enough? Then go higher by upgrading to a dedicated server.

  2. Create your own server for file storage. Search the web for how to do it. This is not as difficult as you think. Some people do it for their websites.

  3. The easy way is to use a CDN file storage service.

Okay, 1 and 2 are a little bit expensive, but I think no. 3 is the best solution.

Some CDN services allow you to store as many web files as you want.

Question: "How do we upload files to a CDN from our website?"

Don't worry: once you register, which is usually free, you will get guidance on how to upload files from your website and get their links. You will get an API and more. It's easy.

Some providers give us a free service for 14 days with limited storage and bandwidth, but that is fine as a starting point. The only problem is that 'people never try'.

Hope this helps newbies.

Answer by too much php

I have recently created a PHP/MySQL app which stores PDFs/Word files in a MySQL table (as big as 40MB per file so far).

Pros:

  • Uploaded files are replicated to the backup server along with everything else, so no separate backup strategy is needed (peace of mind).
  • Setting up the web server is slightly simpler because I don't need to have an uploads/ folder and tell all my applications where it is.
  • I get to use transactions for edits to improve data integrity, so I don't have to worry about orphaned and missing files.

Cons:

  • mysqldump now takes a looooong time because there is 500MB of file data in one of the tables.
  • Overall not very memory/CPU efficient when compared to the filesystem.

I'd call my implementation a success: it takes care of backup requirements and simplifies the layout of the project. The performance is fine for the 20-30 people who use the app.

Answer by barfoon

I use uploaded images on my website and I would definitely say option a).

One other thing I'd highly recommend is immediately changing the file name from whatever the user has named the photo to something more manageable, for example something with the date and time to uniquely identify each picture.

It also helps to strip the user's file name of any strange characters to avoid future complications.
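
Here is one way the renaming and character stripping could be sketched in Python. Everything named here (`safe_stored_name`, the 40-character cap, the allowed character set) is an assumption. The short random token is also my addition, since a timestamp with one-second resolution is not unique on its own when two uploads arrive in the same second.

```python
import re
import uuid
from datetime import datetime
from pathlib import PurePath


def safe_stored_name(original_name: str, now: datetime) -> str:
    """Turn a user-supplied file name into a timestamped, sanitized one."""
    # Keep only a short alphanumeric extension, lower-cased.
    suffix = PurePath(original_name).suffix.lower()
    if not re.fullmatch(r"\.[a-z0-9]{1,5}", suffix):
        suffix = ""

    # Replace anything outside a conservative character set with "_".
    stem = PurePath(original_name).stem
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")[:40]

    stamp = now.strftime("%Y%m%d-%H%M%S")
    token = uuid.uuid4().hex[:8]  # random part to avoid same-second clashes

    # Skip the stem if sanitizing left nothing of it.
    return "-".join(part for part in (stamp, token, stem) if part) + suffix
```

For a hypothetical upload named `My Holiday Photo (1).JPG` received at 10:30:00 on 2008-12-17, this yields something of the form `20081217-103000-<token>-My_Holiday_Photo_1.jpg`.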

Answer by Tim Howland

Definitely resize the image, and check its format if you can. There have been cases of malicious files being uploaded and served by unwitting hosts. For instance, the GIFAR vulnerability allowed you to hide a malicious Java applet in a GIF file, which would then be able to read cookies in the current context and send them to another site for a cross-site scripting attack. Resizing the images usually prevents this, as it munges the embedded code. While this attack has been fixed by JVM patches, naively serving up binary files without scrubbing them opens you up to a whole range of vulnerabilities.
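
For the "check its format" part, a minimal sketch is to compare the first bytes of the upload against well-known file signatures instead of trusting the extension. The function name `sniff_image_format` is made up. Note that a header check alone would not have stopped something like GIFAR, where the file really is a valid GIF with extra data attached; it only complements the resizing/re-encoding step recommended above.

```python
from typing import Optional

# Leading "magic" bytes of three common image formats.
_SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_format(header: bytes) -> Optional[str]:
    """Guess the image format from the first bytes of a file, or return None."""
    for name, prefixes in _SIGNATURES.items():
        # bytes.startswith accepts a tuple of alternative prefixes.
        if header.startswith(prefixes):
            return name
    return None
```

A caller would read the first few bytes of the upload, pass them in, and reject the file when the result is `None` or differs from what the application accepts.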

Remember, most virus scanners can only run against the filesystem. If you store your binaries in the DB, you won't be able to run a scanner against them very easily.

Answer by Charles Graham

There's a sort of hybrid approach in SQL Server 2008 called the filestream data type, which was talked about on RunAs Radio #74 and is sort of the best of both worlds. Most people don't have the 2008 option, but if you do, this option looks pretty cool.

Answer by Uday Hiwarale

This is basically what I do.

  1. Store the uploaded image in a temporary directory or in memory.
  2. Process the image before storing it permanently:
     2.1. Color corrections
     2.2. Compress
     2.3. Create several copies based on image dimensions
     2.4. Rename with .xl, .lg, .md, .sm etc. suffixes
  3. Pack all processed image files (derived from a single upload) into a folder whose name is an id, which is stored in the database for the row/document along with the image file name (or possibly a random name used as the image name).
  4. Create the yyyy/mm/dd path folder if it doesn't exist, for example 2016/08/21. Remember that path and store it in the database for the same row/document.
  5. Move the image id folder into the path folder. (The path folder may be located in the /var/web-content folder.)
  6. Flush the memory buffer or delete the temporary file.


When you need to access any image mentioned in a document, you have the path and id of the folder that contains the images. For example /var/web-content/{{path}}/{{id}}/image-file-name.sm.jpg
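
As an illustration of that lookup, the following Python sketch rebuilds the file locations from the stored path and id. The function names and the `jpg` default are assumptions; the root folder and the size suffixes are the ones mentioned in the steps above.

```python
from pathlib import PurePosixPath

WEB_CONTENT = PurePosixPath("/var/web-content")  # root folder from step 5
SIZES = ("xl", "lg", "md", "sm")                 # suffixes from step 2.4


def variant_path(date_path: str, folder_id: str, image_name: str,
                 size: str, ext: str = "jpg") -> PurePosixPath:
    """Build /var/web-content/{path}/{id}/{image_name}.{size}.{ext}."""
    return WEB_CONTENT / date_path / folder_id / f"{image_name}.{size}.{ext}"


def all_variants(date_path: str, folder_id: str, image_name: str) -> list:
    """Return the paths of every size variant for one uploaded image."""
    return [variant_path(date_path, folder_id, image_name, s) for s in SIZES]


# Usage with a hypothetical row that stores path="2016/08/21" and id="abc123".
print(variant_path("2016/08/21", "abc123", "image-file-name", "sm"))
```

Because only the path and id are stored in the database, the naming scheme for the variants can stay in code.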

This way, if you have to delete all processed image files, just delete the folder and its contents recursively.