MySQL vs MongoDB 1000 reads

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/9702643/

MySQL vs MongoDB 1000 reads

Tags: mysql, performance, mongodb

Asked by Imran Omar Bukhsh

I have been very excited about MongoDb and have been testing it lately. I had a table called posts in MySQL with about 20 million records indexed only on a field called 'id'.

I wanted to compare speed with MongoDB and I ran a test which would get and print 15 records randomly from our huge databases. I ran the query about 1,000 times each for MySQL and MongoDB, and I am surprised that I do not notice much difference in speed. Maybe MongoDB is 1.1 times faster. That's very disappointing. Is there something I am doing wrong? I know that my tests are not perfect, but is MySQL on par with MongoDB when it comes to read-intensive chores?

Note:


  • I have a dual-core (+ 2 threads) i7 CPU and 4 GB of RAM
  • The MySQL table has 20 partitions of 1 million records each

Sample Code Used For Testing MongoDB

<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$time_taken = 0;
$tries = 100;
// connect
$time_start = microtime_float();

for($i=1;$i<=$tries;$i++)
{
    $m = new Mongo();
    $db = $m->swalif;
    $cursor = $db->posts->find(array('id' => array('$in' => get_15_random_numbers())));
    foreach ($cursor as $obj)
    {
        //echo $obj["thread_title"] . "<br><Br>";
    }
}

$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000) ;

    }
    return $numbers;
}

?>


Sample Code For Testing MySQL


<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$BASE_PATH = "../src/";
include_once($BASE_PATH  . "classes/forumdb.php");

$time_taken = 0;
$tries = 100;
$time_start = microtime_float();
for($i=1;$i<=$tries;$i++)
{
    $db = new AQLDatabase();
    $sql = "select * from posts_really_big where id in (".implode(',',get_15_random_numbers()).")";
    $result = $db->executeSQL($sql);
    while ($row = mysql_fetch_array($result) )
    {
        //echo $row["thread_title"] . "<br><Br>";
    }
}
$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000);

    }
    return $numbers;
}
?>

Answered by Sean Reilly

MongoDB is not magically faster. If you store the same data, organised in basically the same fashion, and access it exactly the same way, then you really shouldn't expect your results to be wildly different. After all, MySQL and MongoDB are both GPL, so if Mongo had some magically better IO code in it, then the MySQL team could just incorporate it into their codebase.

People are seeing real-world MongoDB performance gains largely because MongoDB allows you to query in a different manner that is more sensible for your workload.

For example, consider a design that persisted a lot of information about a complicated entity in a normalised fashion. This could easily use dozens of tables in MySQL (or any relational db) to store the data in normal form, with many indexes needed to ensure relational integrity between tables.

Now consider the same design with a document store. If all of those related tables are subordinate to the main table (and they often are), then you might be able to model the data such that the entire entity is stored in a single document. In MongoDB you can store this as a single document, in a single collection. This is where MongoDB starts enabling superior performance.

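To make this concrete, here is a minimal sketch using the same legacy PHP driver as the question's code; the collection name posts_embedded and all field names and values are hypothetical. A forum post with its tags and comments embedded is written, and then read back as a single document:

<?php
// Hypothetical "post" entity with tags and comments embedded as subdocuments,
// stored in one collection instead of several normalised tables.
$m     = new Mongo();
$posts = $m->swalif->posts_embedded; // assumed collection name

$posts->insert(array(
    'id'           => 12345,
    'thread_title' => 'Example thread',
    'tags'         => array('mysql', 'mongodb'),             // a join table in a relational schema
    'comments'     => array(                                  // a child table in a relational schema
        array('user' => 'alice', 'text' => 'First comment'),
        array('user' => 'bob',   'text' => 'Second comment'),
    ),
));

// The whole entity comes back with one indexed lookup and one document read.
$post = $posts->findOne(array('id' => 12345));
?>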

In MongoDB, to retrieve the whole entity, you have to perform:

  • One index lookup on the collection (assuming the entity is fetched by id)
  • Retrieve the contents of one database page (the actual binary json document)

So a b-tree lookup, and a binary page read. Log(n) + 1 IOs. If the indexes can reside entirely in memory, then 1 IO.

In MySQL with 20 tables, you have to perform:

  • One index lookup on the root table (again, assuming the entity is fetched by id)
  • With a clustered index, we can assume that the values for the root row are in the index
  • 20+ range lookups (hopefully on an index) for the entity's pk value
  • These probably aren't clustered indexes, so the same 20+ data lookups once we figure out what the appropriate child rows are.

So the total for MySQL, even assuming that all indexes are in memory (which is harder since there are 20 times more of them), is about 20 range lookups.

These range lookups are likely comprised of random IO — different tables will definitely reside in different spots on disk, and it's possible that different rows in the same range in the same table for an entity might not be contiguous (depending on how the entity has been updated, etc).

So for this example, the final tally is about 20 times more IO with MySQL per logical access, compared to MongoDB.

This is how MongoDB can boost performance in some use cases.

Answered by theAndroid

Do you have concurrency, i.e. simultaneous users? If you just run the query 1,000 times in a row, with just one thread, there will be almost no difference. Too easy for these engines :)

BUT I strongly suggest that you build a true load testing session, which means using an injector such as JMeter with 10, 20 or 50 users AT THE SAME TIME so you can really see a difference (try to embed this code inside a web page JMeter could query).

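One way to set that up, sketched with the question's own code: save a small PHP page that runs a single 15-id lookup per request and point JMeter at its URL with 10, 20 or 50 concurrent threads. The file name read_test.php and the echoed count are purely illustrative.

<?php
// Hypothetical read_test.php: one 15-id lookup per HTTP request, so a load
// injector such as JMeter can measure the database work under concurrency.
function get_15_random_numbers()
{
    $numbers = array();
    for ($i = 1; $i <= 15; $i++) {
        $numbers[] = mt_rand(1, 20000000);
    }
    return $numbers;
}

$m      = new Mongo();
$cursor = $m->swalif->posts->find(array('id' => array('$in' => get_15_random_numbers())));

$count = 0;
foreach ($cursor as $obj) {
    $count++; // touch every document so the fetch actually happens
}
echo $count;  // keep the response tiny; JMeter records the elapsed time per request
?>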

I just did it today on a single server (and a simple collection/table) and the results are quite interesting and surprising (MongoDB was really faster on writes & reads, compared to the MyISAM and InnoDB engines).

This really should be part of your test: concurrency & the MySQL engine. Then, data/schema design & application needs are of course huge requirements, beyond response times. Let me know when you get results; I'm also in need of input on this!

Answered by user2081518

Source: https://github.com/webcaetano/mongo-mysql

10 rows

mysql insert: 1702ms
mysql select: 11ms

mongo insert: 47ms
mongo select: 12ms

100 rows

mysql insert: 8171ms
mysql select: 10ms

mongo insert: 167ms
mongo select: 60ms

1000 rows

mysql insert: 94813ms (1.58 minutes)
mysql select: 13ms

mongo insert: 1013ms
mongo select: 677ms

10,000 rows

mysql insert: 924695ms (15.41 minutes)
mysql select: 144ms

mongo insert: 9956ms (9.95 seconds)
mongo select: 4539ms (4.539 seconds)

Answered by Gabe Rainbow

Man, the answer is that you're basically testing PHP and not a database.

Never mind whether the print is commented out or not; just iterating the results takes a chunk of time:

   foreach ($cursor as $obj)
    {
        //echo $obj["thread_title"] . "<br><Br>";
    }

while another chunk is spent churning out a bunch of random numbers:

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000) ;

    }
    return $numbers;
}

Then there's a major difference between imploding the numbers into a SQL string and passing them as an array to $in.

And finally, what is going on here? It looks like a connection is being created every time, so the test measures connection time plus query time.

$m = new Mongo();

vs

$db = new AQLDatabase();

So your 101% faster might turn out to be 1000% faster for the underlying query once it's stripped of all that jazz.

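Here is a sketch of the MongoDB loop with those overheads pulled out of the timed section, assuming the same swalif.posts collection as the question: connect once, pre-generate the random ids, and time only the query plus the fetch.

<?php
// Same benchmark, but the connection is opened once and the random ids are
// generated before the clock starts, so only query + fetch time is measured.
$m     = new Mongo();           // one connection for the whole run
$posts = $m->swalif->posts;

$tries = 100;
$ids   = array();
for ($i = 0; $i < $tries; $i++) {
    $batch = array();
    for ($j = 0; $j < 15; $j++) {
        $batch[] = mt_rand(1, 20000000);
    }
    $ids[] = $batch;
}

$time_start = microtime(true);
foreach ($ids as $batch) {
    $cursor = $posts->find(array('id' => array('$in' => $batch)));
    iterator_to_array($cursor); // force the documents to be fetched
}
echo microtime(true) - $time_start;
?>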

urghhh.

Answered by Reoxey

https://github.com/reoxey/benchmark

benchmark

Speed comparison of MySQL & MongoDB in GOLANG 1.6 & PHP5

System used for benchmark: DELL, CPU i5 4th gen 1.70GHz x 4, RAM 4GB, GPU RAM 2GB

Speed comparison of RDBMS vs NoSQL for INSERT, SELECT, UPDATE, DELETE, executing different numbers of rows: 10, 100, 1000, 10000, 100000, 1000000

Languages used to execute: PHP5 & Google's GO 1.6

________________________________________________
GOLANG with MySQL (engine = MyISAM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      1.195444ms
100                     6.075053ms
1000                    47.439699ms
10000                   483.999809ms
100000                  4.707089053s
1000000                 49.067407174s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 872.709μs


        SELECT & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 20.717354746s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 2.309209968s
100000                  257.411502ms
10000                   26.73954ms
1000                    3.483926ms
100                     915.17μs
10                      650.166μs


            DELETE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 6.065949ms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


________________________________________________
GOLANG with MongoDB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      2.067094ms
100                     8.841597ms
1000                    106.491732ms
10000                   998.225023ms
100000                  8.98172825s
1000000                 1m 29.63203158s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 5.251337439s


        FIND & DISPLAY (with index declared)
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 21.540603252s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1                       1.330954ms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

________________________________________________
PHP5 with MySQL (engine = MyISAM)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
 10                     0.0040680000000001s
 100                    0.011595s
 1000                   0.049718s
 10000                  0.457164s
 100000                 4s
 1000000                42s


            SELECT
------------------------------------------------
num of rows             time taken
------------------------------------------------
 1000000                <1s


            SELECT & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
  1000000               20s
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

________________________________________________
PHP5 with MongoDB 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            INSERT
------------------------------------------------
num of rows             time taken
------------------------------------------------
10                      0.065744s
100                     0.190966s
1000                    0.2163s
10000                   1s
100000                  8s
1000000                 78s


            FIND
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 <1s


            FIND & DISPLAY
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 7s


            UPDATE
------------------------------------------------
num of rows             time taken
------------------------------------------------
1000000                 9s

Answered by Jason Hitchings

Here is a little research that explored RDBMS vs NoSQL using MySQL vs Mongo; the conclusions were in line with @Sean Reilly's response. In short, the benefit comes from the design, not from some raw speed difference. Conclusion from pages 35-36:

RDBMS vs NoSQL: Performance and Scaling Comparison

The project tested, analysed and compared the performance and scalability of the two database types. The experiments included running different numbers and types of queries, some more complex than others, in order to analyse how the databases scaled with increased load. The most important factor in this case was the query type used, as MongoDB could handle more complex queries faster, mainly due to its simpler schema at the sacrifice of data duplication, meaning that a NoSQL database may contain large amounts of data duplicates. Although a schema directly migrated from the RDBMS could be used, this would eliminate the advantage of MongoDB's underlying data representation of subdocuments, which allowed the use of fewer queries against the database as tables were combined. Despite the performance gain which MongoDB had over MySQL in these complex queries, when the benchmark modelled the MySQL query similarly to the MongoDB complex query by using nested SELECTs, MySQL performed best, although at higher numbers of connections the two behaved similarly. The last type of query benchmarked, the complex query containing two JOINs and a subquery, showed the advantage MongoDB has over MySQL due to its use of subdocuments. This advantage comes at the cost of data duplication, which causes an increase in the database size. If such queries are typical in an application, then it is important to consider NoSQL databases as alternatives, while taking into account the cost in storage and memory size resulting from the larger database size.

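As a hedged illustration of the kind of query the study describes (all table, collection, column and field names below are invented), compare a relational query with two JOINs and a subquery to the equivalent lookup against posts that embed their comments and tags:

<?php
// Hypothetical schema: posts, comments, users and post_tags tables on the MySQL
// side; a posts_embedded collection with comments and tags embedded on the
// MongoDB side. Connection credentials are placeholders.
$link = mysql_connect('localhost', 'user', 'password');
mysql_select_db('forum', $link);

$sql = "SELECT p.id, p.thread_title, c.text, u.name
        FROM posts p
        JOIN comments c ON c.post_id = p.id
        JOIN users u    ON u.id = c.user_id
        WHERE p.id IN (SELECT post_id FROM post_tags WHERE tag = 'mongodb')";
$result = mysql_query($sql, $link);

// With comments and tags embedded in each document, the same data is a single
// indexed query -- at the cost of duplicating user names and tag values.
$m      = new Mongo();
$cursor = $m->swalif->posts_embedded->find(array('tags' => 'mongodb'));
?>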

Answered by zhuomin chen

On a single server, MongoDB would not be any faster than MySQL MyISAM on either reads or writes, given table/document sizes are small (1 GB to 20 GB).
MongoDB will be faster with parallel reduce on multi-node clusters, where MySQL cannot scale horizontally.
