Tactics for using PHP in a high-load site
Original question: http://stackoverflow.com/questions/24675/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Asked by Ross
Before you answer this I have never developed anything popular enough to attain high server loads. Treat me as (sigh) an alien that has just landed on the planet, albeit one that knows PHP and a few optimisation techniques.
I'm developing a tool in PHP that could attain quite a lot of users, if it works out right. However, while I'm fully capable of developing the program, I'm pretty much clueless when it comes to making something that can deal with huge traffic. So here are a few questions on it (feel free to turn this question into a resource thread as well).
Databases
At the moment I plan to use the MySQLi features in PHP5. However, how should I set up the databases in relation to users and content? Do I actually need multiple databases? At the moment everything's jumbled into one database - although I've been considering spreading user data to one, actual content to another and finally core site content (template masters etc.) to another. My reasoning behind this is that sending queries to different databases will ease up the load on them as one database = 3 load sources. Also, would this still be effective if they were all on the same server?
Caching
I have a template system that is used to build the pages and swap out variables. Master templates are stored in the database and each time a template is called its cached copy (an html document) is called. At the moment I have two types of variable in these templates - a static var and a dynamic var. Static vars are usually things like page names, the name of the site - things that don't change often; dynamic vars are things that change on each page load.
My question on this:
Say I have comments on different articles. Which is a better solution: store the simple comment template and render comments (from a DB call) each time the page is loaded or store a cached copy of the comments page as a html page - each time a comment is added/edited/deleted the page is recached.
Finally
Does anyone have any tips/pointers for running a high load site on PHP? I'm pretty sure it's a workable language to use - Facebook and Yahoo! give it great precedence - but are there any experiences I should watch out for?
Accepted answer by Gary Richardson
No two sites are alike. You really need to get a tool like jmeter and benchmark to see where your problem points will be. You can spend a lot of time guessing and improving, but you won't see real results until you measure and compare your changes.
For example, for many years, the MySQL query cache was the solution to all of our performance problems. If your site was slow, MySQL experts suggested turning the query cache on. It turns out that if you have a high write load, the cache is actually crippling. If you turned it on without testing, you'd never know.
And don't forget that you are never done scaling. A site that handles 10req/s will need changes to support 1000req/s. And if you're lucky enough to need to support 10,000req/s, your architecture will probably look completely different as well.
Databases
- Don't use MySQLi -- PDO is the 'modern' OO database access layer. The most important feature to use is placeholders in your queries. It's smart enough to use server side prepares and other optimizations for you as well.
- You probably don't want to break your database up at this point. If you do find that one database isn't cutting it, there are several techniques to scale up, depending on your app. Replicating to additional servers typically works well if you have more reads than writes. Sharding is a technique to split your data over many machines.
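To make the placeholder advice concrete, here is a minimal sketch of prepared statements with named placeholders in PDO. It assumes the pdo_sqlite driver is available and uses an in-memory SQLite database as a stand-in so it runs without a server; with MySQL only the DSN and credentials would change. The table and column names are hypothetical.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL (assumes pdo_sqlite is installed).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema for illustration.
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Named placeholders: values are bound, never concatenated into the SQL string.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
foreach (['ross', "o'brien"] as $name) {
    $insert->execute([':name' => $name]);
}

// The same prepared statement style works for reads.
$select = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
$select->execute([':name' => "o'brien"]);
$row = $select->fetch(PDO::FETCH_ASSOC);
```

Because the value is bound rather than interpolated, the apostrophe in the second name needs no manual escaping.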
Caching
- You probably don't want to cache in your database. The database is typically your bottleneck, so adding more I/O to it is typically a bad thing. There are several PHP caches out there that accomplish similar things, like APC and Zend.
- Measure your system with caching on and off. I bet your cache is heavier than serving the pages straight.
- If it takes a long time to build your comments and article data from the db, integrate memcache into your system. You can cache the query results and store them in a memcached instance. It's important to remember that retrieving the data from memcache must be faster than assembling it from the database to see any benefit.
- If your articles aren't dynamic, or you have simple dynamic changes after it's generated, consider writing out html or php to the disk. You could have an index.php page that looks on disk for the article; if it's there, it streams it to the client. If it isn't, it generates the article, writes it to the disk and sends it to the client. Deleting files from the disk would cause pages to be re-written. If a comment is added to an article, delete the cached copy -- it would be regenerated.
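The disk-cache idea in the last bullet could look roughly like this. It is a sketch under assumptions: the cache directory, the function names and the renderArticle() stand-in are all hypothetical, and a real index.php would echo the returned HTML.

```php
<?php
// Hypothetical cache directory; a real site would use a dedicated, writable path.
$cacheDir = sys_get_temp_dir() . '/article_cache_demo';

function cachePath(string $cacheDir, int $articleId): string
{
    return $cacheDir . '/article_' . $articleId . '.html';
}

// Stand-in for the expensive work (database queries, template rendering).
function renderArticle(int $articleId): string
{
    static $renders = 0;
    $renders++;
    return "<h1>Article {$articleId}</h1><p>render #{$renders}</p>";
}

// Return the cached page if it exists; otherwise build it, write it to disk, and return it.
function fetchArticle(string $cacheDir, int $articleId): string
{
    $path = cachePath($cacheDir, $articleId);
    if (is_file($path)) {
        return file_get_contents($path);
    }
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0777, true);
    }
    $html = renderArticle($articleId);
    // LOCK_EX keeps two requests from interleaving writes to the same file.
    file_put_contents($path, $html, LOCK_EX);
    return $html;
}

// Call this when a comment is added, edited, or deleted; the next request regenerates the page.
function invalidateArticle(string $cacheDir, int $articleId): void
{
    $path = cachePath($cacheDir, $articleId);
    if (is_file($path)) {
        unlink($path);
    }
}
```

Deleting the file is the whole invalidation step, which is what makes this approach attractive for pages that change rarely.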
Answer by thesmart
I'm a lead developer on a site with over 15M users. We have had very few scaling problems because we planned for it EARLY and scaled thoughtfully. Here are some of the strategies I can suggest from my experience.
SCHEMA
First off, denormalize your schemas. This means that rather than having multiple relational tables, you should instead opt to have one big table. In general, joins are a waste of precious DB resources because doing multiple prepares and collation burns disk I/O. Avoid them when you can.
The trade-off here is that you will be storing/pulling redundant data, but this is acceptable because data storage and intra-cage bandwidth are very cheap (bigger disks) whereas multiple prepare I/Os are orders of magnitude more expensive (more servers).
INDEXING
Make sure that your queries utilize at least one index. Beware, though, that indexes will cost you if you write or update frequently. There are some experimental tricks to avoid this.
You can try adding additional columns that aren't indexed which run parallel to your columns that are indexed. Then you can have an offline process that writes the non-indexed columns over the indexed columns in batches. This way, you can better control when MySQL will need to recompute the index.
Avoid computed queries like the plague. If you must compute a query, try to do this once at write time.
CACHING
I highly recommend Memcached. It has been proven by the biggest players on the PHP stack (Facebook) and is very flexible. There are two methods of doing this: one is caching in your DB layer, the other is caching in your business logic layer.
The DB layer option would require caching the result of queries retrieved from the DB. You can hash your SQL query using md5() and use that as a lookup key before going to the database. The upside to this is that it is pretty easy to implement. The downside (depending on implementation) is that you lose flexibility because you're treating all caching the same with regard to cache expiration.
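The DB-layer option might be sketched as follows. This is an illustration under assumptions, not the author's implementation: ArrayCache is a hypothetical array-backed stand-in for a memcached client so the code runs anywhere, and $runQuery is a hypothetical callable that performs the real database query.

```php
<?php
// Array-backed stand-in for a memcached client so the sketch runs anywhere.
// get() returns null on a miss, which is a simplification of this demo.
final class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        [$value, $expiresAt] = $this->items[$key];
        return $expiresAt >= time() ? $value : null;
    }

    public function set(string $key, $value, int $ttl): void
    {
        $this->items[$key] = [$value, time() + $ttl];
    }
}

// $runQuery is a hypothetical callable that actually hits the database.
function cachedQuery(ArrayCache $cache, string $sql, array $params, callable $runQuery, int $ttl = 60): array
{
    // Include the bound parameters in the key so different arguments do not collide.
    $key = 'sql_' . md5($sql . '|' . serialize($params));
    $rows = $cache->get($key);
    if ($rows === null) {
        $rows = $runQuery($sql, $params);
        $cache->set($key, $rows, $ttl);
    }
    return $rows;
}
```

Note how the single $ttl applies to every query alike, which is the loss of flexibility described above.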
In the shop I work in, we use business layer caching, which means each concrete class in our system controls its own caching schema and cache timeouts. This has worked pretty well for us, but be aware that items retrieved from DB may not be the same as items from cache, so you will have to update cache and DB together.
DATA SHARDING
Replication only gets you so far. Sooner than you expect, your writes will become a bottleneck. To compensate, make sure to support data sharding as early as possible. You will likely want to shoot yourself later if you don't.
It is pretty simple to implement. Basically, you want to separate the key authority from the data storage. Use a global DB to store a mapping between primary keys and cluster ids. You query this mapping to get a cluster, and then query the cluster to get the data. You can cache the hell out of this lookup operation which will make it a negligible operation.
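The two-step lookup described above can be sketched like this. Plain arrays stand in for the global mapping DB, the data clusters and the lookup cache; all names and sample rows are hypothetical.

```php
<?php
// Hypothetical in-memory stand-ins: the "global DB" mapping and two data clusters.
$globalMap = [1001 => 'cluster_a', 1002 => 'cluster_b'];
$clusters = [
    'cluster_a' => [1001 => ['id' => 1001, 'name' => 'ross']],
    'cluster_b' => [1002 => ['id' => 1002, 'name' => 'gary']],
];

// Step 1: resolve the cluster for a primary key, caching the answer in $lookupCache.
function clusterFor(int $userId, array $globalMap, array &$lookupCache): string
{
    if (!isset($lookupCache[$userId])) {
        if (!isset($globalMap[$userId])) {
            throw new OutOfBoundsException("No cluster mapping for key {$userId}");
        }
        // In a real system this line would be a query against the global DB.
        $lookupCache[$userId] = $globalMap[$userId];
    }
    return $lookupCache[$userId];
}

// Step 2: fetch the row from the cluster that owns it.
function fetchUser(int $userId, array $globalMap, array $clusters, array &$lookupCache): array
{
    $clusterId = clusterFor($userId, $globalMap, $lookupCache);
    return $clusters[$clusterId][$userId];
}

$lookupCache = [];
$user = fetchUser(1002, $globalMap, $clusters, $lookupCache);
```

After the first call the mapping for key 1002 is served from $lookupCache, so repeated lookups never touch the global mapping again.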
The downside to this is that it may be difficult to piece together data from multiple shards. But, you can engineer your way around that as well.
OFFLINE PROCESSING
Don't make the user wait for your backend if they don't have to. Build a job queue and move any processing that you can offline, doing it separately from the user's request.
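As a rough sketch of the job-queue idea: the request only records the work, and a separate worker does it later. Here an in-memory SplQueue stands in for a persistent queue (a database table or a queue server would be needed in practice, since the queue must outlive the request), and the job type and handler are hypothetical.

```php
<?php
// In-memory stand-in for a persistent queue (a real one must survive the request).
$queue = new SplQueue();

// Request side: record what needs doing and return to the user immediately.
function enqueueJob(SplQueue $queue, string $type, array $payload): void
{
    $queue->enqueue(['type' => $type, 'payload' => $payload]);
}

// Worker side: run outside the request cycle (for example from cron or a daemon).
function runWorker(SplQueue $queue, array $handlers): int
{
    $processed = 0;
    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();
        $handlers[$job['type']]($job['payload']);
        $processed++;
    }
    return $processed;
}

$sent = [];
$handlers = [
    // Hypothetical job type for illustration.
    'send_welcome_email' => function (array $payload) use (&$sent): void {
        $sent[] = $payload['email'];
    },
];

enqueueJob($queue, 'send_welcome_email', ['email' => 'ross@example.com']);
$processed = runWorker($queue, $handlers);
```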
Answer by Ryan Doherty
I've worked on a few sites that get millions of hits a month backed by PHP & MySQL. Here are some basics:
- Cache, cache, cache. Caching is one of the simplest and most effective ways to reduce load on your webserver and database. Cache page content, queries, expensive computation, anything that is I/O bound. Memcache is dead simple and effective.
- Use multiple servers once you are maxed out. You can have multiple web servers and multiple database servers (with replication).
- Reduce the overall number of requests to your webservers. This entails caching JS, CSS and images using expires headers. You can also move your static content to a CDN, which will speed up your users' experience.
- Measure & benchmark. Run Nagios on your production machines and load test on your dev/qa server. You need to know when your server will catch on fire so you can prevent it.
I'd recommend reading Building Scalable Websites. It was written by one of the Flickr engineers and is a great reference.
Check out my blog post about scalability too; it has a lot of links to presentations about scaling with multiple languages and platforms: http://www.ryandoherty.net/2008/07/13/unicorns-and-scalability/
Answer by DavidM
Re: PDO / MySQLi / MySQLND
@gary
You cannot just say "don't use MySQLi" as they have different goals. PDO is almost like an abstraction layer (although it is not actually) and is designed to make it easy to use multiple database products, whereas MySQLi is specific to MySQL connections. It is wrong to say that PDO is the modern access layer in the context of comparing it to MySQLi because your statement implies that the progression has been mysql -> mysqli -> PDO, which is not the case.
The choice between MySQLi and PDO is simple - if you need to support multiple database products then you use PDO. If you're just using MySQL then you can choose between PDO and MySQLi.
So why would you choose MySQLi over PDO? See below...
You are correct about MySQLnd which is the newest MySQL core language level library, however it is not a replacement for MySQLi. MySQLi (as with PDO) remains the way you would interact with MySQL through your PHP code. Both of these use libmysql as the C client behind the PHP code. The problem is that libmysql is outside of the core PHP engine and that is where mysqlnd comes in i.e. it is a Native Driver which makes use of the core PHP internals to maximise efficiency, specifically where memory usage is concerned.
MySQLnd is being developed by MySQL themselves and has recently landed onto the PHP 5.3 branch which is in RC testing, ready for a release later this year. You will then be able to use MySQLnd with MySQLi...but not with PDO. This will give MySQLi a performance boost in many areas (not all) and will make it the best choice for MySQL interaction if you do not need the abstraction-like capabilities of PDO.
That said, MySQLnd is now available in PHP 5.3 for PDO, so you can get the performance enhancements from ND in PDO as well. However, PDO is still a generic database layer and so is unlikely to benefit as much from the enhancements in ND as MySQLi can.
Some useful benchmarks can be found here, although they are from 2006. You also need to be aware of things like this option.
There are a lot of considerations to take into account when deciding between MySQLi and PDO. In reality it is not going to matter until you get to ridiculously high request numbers, and in that case it makes more sense to be using an extension that has been specifically designed for MySQL rather than one which abstracts things and happens to provide a MySQL driver.
It is not a simple matter of which is best because each has advantages and disadvantages. You need to read the links I've provided and come up with your own decision, then test it and find out. I have used PDO in past projects and it is a good extension but my choice for pure performance would be MySQLi with the new MySQLND option compiled (when PHP 5.3 is released).
Answer by Paul Kroll
General
- Do not try to optimize before you start to see real world load. You might guess right, but if you don't, you've wasted your time.
- Use jmeter, xdebug or another tool to benchmark the site.
- If load starts to be an issue, either object or data caching will likely be involved, so generally read up on caching options (memcached, MySQL caching options)
Code
- Profile your code so that you know where the bottleneck is, and whether it's in code or the database
Databases
- Use MySQLi if portability to other databases is not vital, PDO otherwise
- If benchmarks reveal the database is the issue, check the queries before you start caching. Use EXPLAIN to see where your queries are slowing down.
- After the queries are optimized and the database is cached in some way, you may want to use multiple databases. Either replicating to multiple servers or sharding (splitting the data over multiple databases/servers) may be appropriate, depending on the data, the queries, and the kind of read/write behavior.
Caching
- Plenty of writing has been done on caching code, objects, and data. Look up articles on APC, Zend Optimizer, memcached, QuickCache, JPCache. Do some of this before you really need to, and you'll be less concerned about starting off unoptimized.
- APC and Zend Optimizer are opcode caches; they speed up PHP code by avoiding reparsing and recompilation of code. Generally simple to install, worth doing early.
- Memcached is a generic cache that you can use to cache queries, PHP functions or objects, or entire pages. Code must be specifically written to use it, which can be an involved process if there are no central points to handle creation, update and deletion of cached objects.
- QuickCache and JPCache are file caches, otherwise similar to Memcached. The basic concept is simple, but it also requires code and is easier with central points of creation, update and deletion.
Miscellaneous
- Consider alternative web servers for high load. Servers like lighttpd and nginx can handle large amounts of traffic in much less memory than Apache, if you can sacrifice Apache's power and flexibility (or if you just don't need those things, which, often, you don't).
- Remember that hardware is surprisingly cheap these days, so be sure to cost out the effort to optimize a large block of code versus "let's buy a monster server."
- Consider adding the "MySQL" and "scaling" tags to this question
Answer by tslocum
APC is an absolute must. Not only does it make for a great caching system, but the gain from the auto-cached PHP files is a godsend. As for the multiple database idea, I don't think you would get much out of having different databases on the same server. It may give you a bit of a gain in speed during query time, but I doubt the effort it would take to deploy and maintain the code for all three while making sure they are in sync would be worth it.
I also highly recommend running Xdebug to find bottlenecks in your program. It made optimization a breeze for me.
Answer by Eric Scrivner
Firstly, as I think Knuth said, "Premature optimization is the root of all evil". If you don't have to deal with these issues right now then don't; focus on delivering something that works correctly first. That being said, if the optimizations can't wait:
Try profiling your database queries, figure out what's slow and what happens a lot, and come up with an optimization strategy from that.
I would investigate Memcached as it's what a lot of the higher load sites use for efficiently caching content of all types, and the PHP object interface to it is quite nice.
Splitting up databases among servers and using some sort of load balancing technique (e.g. generate a random number between 1 and # redundant databases with necessary data - and use that number to determine which database server to connect to) can also be an excellent way to increase efficiency.
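The random-number approach to picking a database server could be sketched as below. The hostnames are hypothetical, and the sketch assumes every listed replica holds the data the query needs.

```php
<?php
// Hypothetical replica hostnames; every replica is assumed to hold the needed data.
$replicas = ['db1.example.internal', 'db2.example.internal', 'db3.example.internal'];

// Pick a replica uniformly at random: a number between 1 and the replica count.
function pickReplica(array $replicas): string
{
    if ($replicas === []) {
        throw new InvalidArgumentException('At least one replica is required');
    }
    $n = random_int(1, count($replicas));
    return $replicas[$n - 1];
}

// The chosen host would then be used in the connection string for this request.
$host = pickReplica($replicas);
```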
These have all worked out pretty well in the past for some fairly high load sites. Hope this helps to get you started :-)
Answer by Bob Somers
Profiling your app with something like Xdebug (like tj9991 recommended) is definitely going to be a must. It doesn't make a whole lot of sense to just go around optimizing things blindly. Xdebug will help you find the real bottlenecks in your code so you can spend your optimization time wisely and fix chunks of code that are actually causing slowdowns.
If you're using Apache, another utility that can help in testing is Siege. It will help you anticipate how your server and application will react to high loads by really putting it through its paces.
Any kind of opcode cache for PHP (like APC or one of the many others) will help a lot as well.
Answer by Vegard Larsen
I run a website with 7-8 million page views a month. Not terribly much, but enough that our server felt the load. The solution we chose was simple: Memcache at the database level. This solution works well if the database load is your main problem.
We started out using Memcache to cache entire objects and the database results that were most frequently used. It did work, but it also introduced bugs (we might have avoided some of those if we had been more careful).
So we changed our approach. We built a database wrapper (with the exact same methods as our old database, so it was easy to switch), and then we subclassed it to provide memcached database access methods.
Now all you have to do is decide whether a query can use cached (and possibly out of date) results or not. Most of the queries run by the users are now fetched directly from Memcache. The exceptions are updates and inserts, which for the main website only happen because of logging. This rather simple measure reduced our server load by about 80%.
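A wrapper-plus-subclass design along these lines might look like the following sketch. It is not the answerer's actual code: the class names, the $allowCached flag and the array used in place of a Memcache client are all assumptions made for illustration.

```php
<?php
// Hypothetical base wrapper; query() would normally talk to MySQL.
class Database
{
    public int $queriesRun = 0;

    public function query(string $sql): array
    {
        $this->queriesRun++;
        return [['sql' => $sql, 'run' => $this->queriesRun]];
    }
}

// Same method name as the parent, plus a flag saying whether stale results are acceptable.
class CachedDatabase extends Database
{
    private array $cache = []; // stand-in for a Memcache client

    public function query(string $sql, bool $allowCached = false, int $ttl = 300): array
    {
        if (!$allowCached) {
            return parent::query($sql); // updates, inserts, and anything that must be fresh
        }
        $key = md5($sql);
        if (isset($this->cache[$key]) && $this->cache[$key]['expires'] >= time()) {
            return $this->cache[$key]['rows'];
        }
        $rows = parent::query($sql);
        $this->cache[$key] = ['rows' => $rows, 'expires' => time() + $ttl];
        return $rows;
    }
}
```

Because the subclass keeps the parent's method name and default behaviour, existing call sites keep working and only the calls that tolerate stale data need to pass the extra flag.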
Answer by Vegard Larsen
For what it's worth, caching is DIRT SIMPLE in PHP even without an extension/helper package like memcached.
All you need to do is create an output buffer using ob_start().
Create a global cache function. Call ob_start, pass the function as a callback. In the function, look for a cached version of the page. If exists, serve it and end.
If it doesn't exist, the script will continue processing. When it reaches the matching ob_end_flush() it will call the function you specified. At that time, you just get the contents of the output buffer, drop them in a file, save the file, and end.
Add in some expiration/garbage collection.
And many people don't realize you can nest ob_start()/ob_end_flush() calls. So if you're already using an output buffer to, say, parse in advertisements or do syntax highlighting or whatever, you can just nest another ob_start/ob_end_flush call.
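Putting the steps above together, a minimal sketch of output-buffer page caching could look like this. The function names, cache location and expiry time are hypothetical; only ob_start() with a callback and ob_end_flush() are standard PHP calls.

```php
<?php
// Hypothetical cache location keyed by a page identifier.
function pageCacheFile(string $pageId): string
{
    return sys_get_temp_dir() . '/page_cache_' . md5($pageId) . '.html';
}

// Returns true if a fresh cached copy was served; otherwise starts buffering.
function startPageCache(string $pageId, int $ttl = 300): bool
{
    $file = pageCacheFile($pageId);
    // Simple expiration: ignore cached copies older than $ttl seconds.
    if (is_file($file) && filemtime($file) + $ttl > time()) {
        readfile($file);
        return true; // the caller should stop here
    }
    // The callback runs when the buffer is flushed; it saves the output and passes it on.
    ob_start(function (string $buffer) use ($file): string {
        file_put_contents($file, $buffer, LOCK_EX);
        return $buffer;
    });
    return false;
}

if (!startPageCache('/articles/42')) {
    echo '<h1>Article 42</h1>'; // expensive page generation goes here
    ob_end_flush(); // triggers the callback, which writes the cache file
}
```

Since output buffers nest, this works even when the page body already opens its own buffer, as long as each inner buffer is closed before the final ob_end_flush().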

