Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2267345/

Date: 2020-08-25 05:48:22  Source: igfitidea

How do you make good use of multicore CPUs in your PHP/MySQL applications?

Tags: php, mysql, multicore

Asked by jkndrkn

I maintain a custom built CMS-like application.

Whenever a document is submitted, several tasks are performed that can be roughly grouped into the following categories:

  1. MySQL queries.
  2. HTML content parsing.
  3. Search index updating.

Category 1 includes updates to various MySQL tables relating to a document's content.

Category 2 includes parsing of HTML content stored in MySQL LONGTEXT fields to perform some automatic anchor tag transformations. I suspect that a great deal of computation time is spent in this task.

Category 3 includes updates to a simple MySQL-based search index using just a handful of fields corresponding to the document.

All of these tasks need to complete for the document submission to be considered complete.

The machine that hosts this application has dual quad-core Xeon processors (a total of 8 cores). However, whenever a document is submitted, all PHP code that executes is constrained to a single process running on one of the cores.

My question:

What schemes, if any, have you used to split up your PHP/MySQL web application processing load among multiple CPU cores? My ideal solution would basically spawn a few processes, let them execute in parallel on several cores, and then block until all of the processes are done.

Related question:

What is your favorite PHP performance profiling tool?

Answered by Baba

Introduction

PHP has full multi-threading support, which you can take advantage of in many ways. I have been able to demonstrate this multi-threading ability in different examples.

A quick search will turn up additional resources.

Categories

1: MySQL queries

MySQL is fully multi-threaded and will make use of multiple CPUs, provided that the operating system supports them. It will also make the most of system resources if properly configured for performance.

A typical setting in my.ini that affects thread performance is:

thread_cache_size = 8

thread_cache_size can be increased to improve performance if you have a lot of new connections. Normally, this does not provide a notable performance improvement if you have a good thread implementation. However, if your server sees hundreds of connections per second, you should normally set thread_cache_size high enough so that most new connections use cached threads.
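
One way to judge whether thread_cache_size is high enough is to compare the Threads_created and Connections status counters reported by SHOW GLOBAL STATUS. The following is a minimal sketch that assumes you have already read the two counters; threadCacheHitRate is a hypothetical helper name, not part of MySQL or PHP, and the sample numbers are made up for illustration.

```php
<?php
// Hypothetical helper: fraction of connections that reused a cached thread.
// $threadsCreated and $connections are the Threads_created and Connections
// counters from SHOW GLOBAL STATUS.
function threadCacheHitRate(int $threadsCreated, int $connections): float
{
    if ($connections <= 0) {
        return 1.0; // no connections yet, so nothing has missed the cache
    }
    return 1.0 - $threadsCreated / $connections;
}

// Made-up counters: 50 threads created while serving 1,000 connections.
printf("thread cache hit rate: %.1f%%\n", 100 * threadCacheHitRate(50, 1000));
```

If the rate stays low while the server accepts many connections per second, that suggests raising thread_cache_size is worth trying.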

If you are using Solaris, then you can use

thread_concurrency = 8 

thread_concurrency enables applications to give the threads system a hint about the desired number of threads that should be run at the same time.

This variable is deprecated as of MySQL 5.6.1 and is removed in MySQL 5.7. You should remove this from MySQL configuration files whenever you see it unless they are for Solaris 8 or earlier.

InnoDB:

You don't have such limitations if you are using InnoDB as the storage engine, because it fully supports thread concurrency

innodb_thread_concurrency //  Recommended 2 * CPUs + number of disks
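
As a worked example of the rule of thumb quoted above, here is a small sketch. The function name is hypothetical, the 8 cores come from the question, and the disk count of 2 is an assumed value; treat the result as a starting point for tuning rather than a definitive setting.

```php
<?php
// Rule of thumb from the answer: 2 * CPUs + number of disks.
// Hypothetical helper, not a MySQL or PHP API.
function suggestedInnodbThreadConcurrency(int $cpus, int $disks): int
{
    return 2 * $cpus + $disks;
}

// 8 cores as in the question; 2 disks is an assumption for illustration.
echo suggestedInnodbThreadConcurrency(8, 2), "\n"; // prints 18
```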

You can also look at innodb_read_io_threads and innodb_write_io_threads, where the default is 4 and it can be increased to as high as 64 depending on the hardware

Others:

Other configurations to look at include key_buffer_size, table_open_cache, sort_buffer_size, etc., which can all result in better performance

PHP:

In pure PHP you can create a MySQL worker where each query is executed in a separate PHP thread

$sql = new SQLWorker($host, $user, $pass, $db);
$sql->start();

$sql->stack($q1 = new SQLQuery("One long Query")); 
$sql->stack($q2 = new SQLQuery("Another long Query"));

$q1->wait(); 
$q2->wait(); 

// Do Something Useful

Here is a Full Working Example of SQLWorker

2: HTML content parsing

I suspect that a great deal of computation time is spent in this task.

If you already know the problem, then it is easier to solve via event loops, a job queue, or threads.

Working on one document at a time can be a very, very slow, painful process. @ka once hacked his way out using Ajax to call multiple requests. Some creative minds would just fork the process using pcntl_fork, but if you are using Windows then you cannot take advantage of pcntl.
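
The pcntl_fork approach mentioned above can be sketched as follows: fork one child per task, then block in the parent until every child has exited. This is only a sketch under stated assumptions: it needs the pcntl extension (CLI on Unix-like systems), runInParallel is a hypothetical helper name, and each task reports back only through its exit status (0 to 255).

```php
<?php
// Hypothetical helper: run each callable in its own forked child process
// and block until all children have exited. Requires the pcntl extension.
function runInParallel(array $tasks): array
{
    $pids = [];
    foreach ($tasks as $name => $task) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException("fork failed for task $name");
        }
        if ($pid === 0) {
            // Child: do the work, then exit with the task's return value
            // as the process exit status.
            exit((int) $task());
        }
        $pids[$name] = $pid; // Parent: remember the child's pid.
    }

    $statuses = [];
    foreach ($pids as $name => $pid) {
        pcntl_waitpid($pid, $status); // blocks until this child finishes
        $statuses[$name] = pcntl_wexitstatus($status);
    }
    return $statuses;
}
```

A call such as runInParallel(['parse' => $parseHtml, 'index' => $updateIndex]) (both callables are placeholders) returns once both children are done. Database connections opened before the fork should not be shared between processes, so each child would normally open its own.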

With pThreads supporting both Windows and Unix systems, you don't have such a limitation. It is as easy as this: need to parse 100 documents? Spawn 100 threads. Simple.

HTML Scanning

// Scan my System ($dir must already hold the path of the root directory to scan)
$dir = new RecursiveDirectoryIterator($dir, RecursiveDirectoryIterator::SKIP_DOTS);
$dir = new RecursiveIteratorIterator($dir);

// Allowed Extension
$ext = array(
        "html",
        "htm"
);

// Threads Array
$ts = array();

// Simple Storage
$s = new Sink();

// Start Timer
$time = microtime(true);

$count = 0;
// Parse All HTML
foreach($dir as $html) {
    if ($html->isFile() && in_array($html->getExtension(), $ext)) {
        $count ++;
        $ts[] = new LinkParser("$html", $s);
    }
}

// Wait for all Threads to finish
foreach($ts as $t) {
    $t->join();
}

// Put The Output
printf("Total Files:\t\t%s \n", number_format($count, 0));
printf("Total Links:\t\t%s \n", number_format($t = count($s), 0));
printf("Finished:\t\t%0.4f sec \n", $tm = microtime(true) - $time);
// $t holds the link count at this point, so this average is per link
printf("AvgSpeed:\t\t%0.4f sec per link\n", $tm / $t);
printf("File P/S:\t\t%d file per sec\n", $count / $tm);
printf("Link P/S:\t\t%d links per sec\n", $t / $tm);

Output

Total Files:            8,714
Total Links:            105,109
Finished:               108.3460 sec
AvgSpeed:               0.0010 sec per link
File P/S:               80 file per sec
Link P/S:               907 links per sec

Class Used

Sink

class Sink extends Stackable {
    public function run() {
    }
}

LinkParser

class LinkParser extends Thread {

    public function __construct($file, $sink) {
        $this->file = $file;
        $this->sink = $sink;
        $this->start();
    }

    public function run() {
        $dom = new DOMDocument();
        @$dom->loadHTML(file_get_contents($this->file));
        foreach($dom->getElementsByTagName('a') as $links) {
            $this->sink[] = $links->getAttribute('href');
        }
    }
}

Experiment

实验

Try parsing 8,714 files that have 105,109 links without threads and see how long it takes.
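
For that comparison, a single-threaded baseline might look like the sketch below. It assumes the DOM extension is enabled; extractLinks and timeSequentialParse are hypothetical helper names, and the code does the same href extraction as LinkParser but in one process with no threads.

```php
<?php
// Single-threaded baseline: collect every href from one HTML string.
function extractLinks(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects empty input on PHP 8
    }
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Parse the given files one after another and report the elapsed time.
function timeSequentialParse(array $files): array
{
    $start = microtime(true);
    $total = 0;
    foreach ($files as $file) {
        $total += count(extractLinks((string) file_get_contents($file)));
    }
    return ['links' => $total, 'seconds' => microtime(true) - $start];
}
```

Running timeSequentialParse over the same file list gives a number to set against the threaded run.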

Better Architecture

Spawning too many threads is not a clever thing to do in production. A better approach is to use pooling: have a pool of defined Workers, then stack Tasks onto it.
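
The idea of capping concurrency can be illustrated without any threading extension. The sketch below is not the pthreads pool itself; it is a simpler stand-in that deals the work out to a fixed number of buckets, one per worker, so that 100 documents never mean 100 threads. partitionForWorkers is a hypothetical helper name.

```php
<?php
// Deal $items out round-robin into at most $workers buckets, so that a
// fixed number of workers can each process one bucket.
function partitionForWorkers(array $items, int $workers): array
{
    if ($workers < 1) {
        throw new InvalidArgumentException('need at least one worker');
    }
    $buckets = [];
    foreach (array_values($items) as $i => $item) {
        $buckets[$i % $workers][] = $item;
    }
    return $buckets;
}
```

Each bucket could then be handed to one worker thread or child process. A real pool goes further and lets idle workers pull the next task, which balances uneven workloads better than a static split.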

Performance Improvement

Fine, the example above can still be improved. Instead of waiting for the system to scan all files in a single thread, you can use multiple threads to scan the system for files, then stack the data to Workers for processing

3: Search index updating

This has been pretty much answered by the first answer, but there are so many ways for performance improvement. Have you ever considered an Event based approach?

Introducing Event

@rdlowrey Quote 1:

Well, think of it like this. Imagine you need to serve 10,000 simultaneously connected clients in your web application. Traditional thread-per-request or process-per-request servers aren't an option because no matter how lightweight your threads are, you still can't hold 10,000 of them open at a time.

@rdlowrey Quote 2:

On the other hand, if you keep all the sockets in a single process and listen for those sockets to become readable or writable you can put your entire server inside a single event loop and operate on each socket only when there's something to read/write.

Why don't you experiment with an event-driven, non-blocking I/O approach to your problem? PHP has libevent to supercharge your application.
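
The core of the event-loop idea from the quotes can be sketched with PHP's built-in stream_select() rather than libevent: keep several streams in one process and touch a stream only when it is reported readable. drainReadable is a hypothetical helper name, and this is a minimal readiness loop, not a full event library.

```php
<?php
// Minimal readiness loop: wait on several streams at once and read only
// from those that stream_select() reports as readable.
function drainReadable(array $streams, float $timeoutSec = 1.0): array
{
    $received = [];
    $open = $streams;
    while ($open) {
        $read = array_values($open);
        $write = null;
        $except = null;
        $sec = (int) $timeoutSec;
        $usec = (int) (($timeoutSec - $sec) * 1e6);
        $ready = stream_select($read, $write, $except, $sec, $usec);
        if ($ready === false || $ready === 0) {
            break; // error or timeout: stop waiting
        }
        foreach ($read as $stream) {
            $key = array_search($stream, $open, true);
            $chunk = fread($stream, 8192);
            if ($chunk === '' || $chunk === false) {
                unset($open[$key]); // peer closed the stream
                continue;
            }
            $received[$key] = ($received[$key] ?? '') . $chunk;
        }
    }
    return $received;
}
```

The function returns the data collected per stream key once every stream has reached end of file or the timeout elapses with nothing to read.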

I know this question is all about multi-threading, but if you have some time you can look at this Nuclear Reactor written in PHP by @igorw

Finally

Consideration

I think you should consider using a cache and a job queue for some of your tasks. You can easily have a message saying

Document uploaded for processing ..... 5% - Done   

Then do all the time-consuming tasks in the background. Please look at "Making a large processing job smaller" for a similar case study.

Profiling

Profiling tool? There is no single profiling tool for a web application; everything from Xdebug to YSlow is very useful. E.g., Xdebug is not useful when it comes to threads because threads are not supported

I don't have a favorite

Answered by Pascal MARTIN

PHP is not quite oriented towards multi-threading: as you already noticed, each page is served by one PHP process -- that does one thing at a time, including just "waiting" while an SQL query is executed on the database server.

There is not much you can do about that, unfortunately: it's the way PHP works.


Still, here are a couple of thoughts:

  • First of all, you'll probably have more than one user at a time on your server, which means you'll serve several pages at the same time, which, in turn, means you'll have several PHP processes and SQL queries running at the same time... which means several cores of your server will be used.
    • Each PHP process will run on one core, in response to the request of one user, but there are several sub-processes of Apache running in parallel (one for each request, up to a couple of dozen or hundreds, depending on your configuration)
    • The MySQL server is multi-threaded, which means it can use several distinct cores to answer several concurrent requests -- even if each request cannot be served by more than one core.

So, in fact, your server's 8 cores will end up being used ;-)


And, if you think your pages are taking too long to generate, a possible solution is to separate your calculations into two groups:

  • On one hand, the things that have to be done to generate the page: for those, there is not much you can do
  • On the other hand, the things that have to be run sometimes, but not necessarily immediately
    • For instance, I am thinking about some statistics calculations: you want them to be quite up to date, but if they lag a couple of minutes behind, that's generally quite OK.
    • Same for e-mail sending: anyway, several minutes will pass before your users receive/read their mail, so there is no need to send them immediately.

For the kind of situations in my second point, as you don't need those things done immediately... Well, just don't do them immediately ;-)
A solution that I often use is some queuing mechanism:

  • The web application stores things in a "todo-list"
  • And that "todo-list" is de-queued by some batches that are run frequently via a cronjob

And for some other manipulations, you just want them run every X minutes -- and, here too, a cronjob is the perfect tool.
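
The "todo-list" plus cronjob idea can be sketched with nothing more than a file. This is an illustration under stated assumptions, not the answerer's actual setup: enqueueJob and drainJobs are hypothetical helper names, the queue is a plain file of JSON lines, and a real application might well use a MySQL table instead.

```php
<?php
// Called from the web request: append one job to the todo-list and return
// immediately. LOCK_EX keeps concurrent requests from interleaving lines.
function enqueueJob(string $queueFile, array $job): void
{
    file_put_contents($queueFile, json_encode($job) . "\n", FILE_APPEND | LOCK_EX);
}

// Called from the cron-driven batch script: take every pending job and
// empty the file while holding an exclusive lock.
function drainJobs(string $queueFile): array
{
    $fh = fopen($queueFile, 'c+'); // read/write, create if missing, no truncate
    flock($fh, LOCK_EX);
    $raw = (string) stream_get_contents($fh);
    ftruncate($fh, 0);
    flock($fh, LOCK_UN);
    fclose($fh);

    $jobs = [];
    foreach (explode("\n", trim($raw)) as $line) {
        if ($line !== '') {
            $jobs[] = json_decode($line, true);
        }
    }
    return $jobs;
}
```

The page handler would call enqueueJob() for things like e-mail or statistics, and a script scheduled by cron would call drainJobs() and process whatever it gets back.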

Answered by RolandoMySQLDBA

Scaling out Web Servers is not going to make MySQL budge one inch when it comes to accessing Multicore CPUs. Why? First consider the two main Storage Engines of MySQL

MyISAM

This storage engine does not access multiple cores. It never has and never will. It does full table locking for each INSERT, UPDATE, and DELETE. Sending queries from multiple Web Servers to do anything with a MyISAM table just gets bottlenecked.

InnoDB

Prior to MySQL 5.1.38, this storage engine accessed only one CPU. You had to do strange things like run MySQL multiple times on one machine to coerce the cores to handle different instances of MySQL. Then, have the Web Servers' DB connections load balanced among the multiple instances. That's old school (especially if you are using versions of MySQL before MySQL 5.1.38).

Starting with MySQL 5.1.38, you install the new InnoDB Plugin. It has features that you have to tune for getting InnoDB to access multiple CPUs. I have written about this in the DBA StackExchange

Those new features are fully available in MySQL 5.5/5.6 and Percona Server as well.

CAVEAT

If your custom CMS uses FULLTEXT indexing/searching, you should upgrade to MySQL 5.6 because InnoDB now supports FULLTEXT indexing/searching.

Upgrading to MySQL 5.6 is not going to automatically make the CPUs get going. You will have to tune it because, LEFT UNCONFIGURED, it is possible for older versions of MySQL to outrun and outgun newer versions.

Answered by Anthony Forloney

This might not be an answer to the question you are looking for, but the solution you seek deals with threading. Threading is necessary for multicore programming, and threading is not implemented in PHP.

But, in a sense, you could fake threading in PHP by relying on the operating system's multitasking abilities. I suggest taking a quick look at "Multi-threading strategies in PHP" to develop a strategy to achieve what you need.
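
One way to lean on the operating system's multitasking is to start several independent processes with proc_open() and then wait for all of them. The sketch below assumes PHP 7.4 or later (for the array form of the command) and the CLI binary; spawnAndWait is a hypothetical helper name, and the commands in the usage note are placeholders.

```php
<?php
// Start every command as its own OS process, then collect each one's
// standard output. The children run concurrently; the parent blocks
// until all of them have finished.
function spawnAndWait(array $commands): array
{
    $procs = [];
    foreach ($commands as $key => $cmd) {
        $pipes = [];
        $proc = proc_open($cmd, [1 => ['pipe', 'w']], $pipes);
        if (!is_resource($proc)) {
            throw new RuntimeException("could not start process $key");
        }
        $procs[$key] = [$proc, $pipes[1]];
    }

    $output = [];
    foreach ($procs as $key => [$proc, $stdout]) {
        $output[$key] = stream_get_contents($stdout); // waits for this child to close stdout
        fclose($stdout);
        proc_close($proc);
    }
    return $output;
}
```

For example, spawnAndWait(['parse' => [PHP_BINARY, 'parse.php'], 'index' => [PHP_BINARY, 'index.php']]) would run two worker scripts side by side; both script names are hypothetical.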

Dead link: Multi-threading strategies in PHP

Answered by Toskan

Just letting you guys know when you think: "poor PHP does not have multithreading"

Well... Python doesn't have real multithreading either. Nor does NodeJS have multi-threading support. Java has some sort of multithreading, but even there, some code halts the whole machine afaik.

But: unless you do heavy processing of one single thing, it's irrelevant. Many requests hit your page, and all your cores will be used nonetheless, as each request spawns its own process with its own single thread.