MySQL MyISAM 与 InnoDB
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/20148/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
MyISAM versus InnoDB
提问by user2013
I'm working on a projects which involves a lot of database writes, I'd say (70% inserts and 30% reads). This ratio would also include updates which I consider to be one read and one write. The reads can be dirty (e.g. I don't need 100% accurate information at the time of read).
The task in question will be doing over 1 million database transactions an hour.
我正在处理一个涉及大量数据库写入的项目,我想说(70% 插入和 30% 读取)。这个比率还包括我认为是一次读取和一次写入的更新。读取可能是脏的(例如,我在读取时不需要 100% 准确的信息)。
有问题的任务将每小时处理超过 100 万次数据库事务。
I've read a bunch of stuff on the web about the differences between MyISAM and InnoDB, and MyISAM seems like the obvious choice to me for the particular database/tables that I'll be using for this task. From what I seem to be reading, InnoDB is good if transactions are needed since row level locking is supported.
我在网上阅读了很多关于 MyISAM 和 InnoDB 之间差异的内容,对于我将用于此任务的特定数据库/表,MyISAM 似乎是我的明显选择。从我似乎正在阅读的内容来看,如果需要事务,InnoDB 是很好的,因为支持行级锁定。
Does anybody have any experience with this type of load (or higher)? Is MyISAM the way to go?
有没有人对这种类型的负载(或更高)有任何经验?MyISAM 是要走的路吗?
回答by developer99
I have briefly discussedthis question in a table so you can conclude whether to go with InnoDBor MyISAM.
我在表格中简要讨论了这个问题,因此您可以得出结论是使用InnoDB还是MyISAM。
Here is a small overview of which db storage engine you should use in which situation:
以下是您应该在哪种情况下使用哪个数据库存储引擎的小概述:
MyISAM InnoDB ---------------------------------------------------------------- Required full-text search Yes 5.6.4 ---------------------------------------------------------------- Require transactions Yes ---------------------------------------------------------------- Frequent select queries Yes ---------------------------------------------------------------- Frequent insert, update, delete Yes ---------------------------------------------------------------- Row locking (multi processing on single table) Yes ---------------------------------------------------------------- Relational base design Yes
Summary
概括
- In almost all circumstances, InnoDBis the best way to go
- But, frequent reading, almost no writing, use MyISAM
- Full-text search in MySQL <= 5.5, use MyISAM
- 在几乎所有情况下,InnoDB都是最好的方法
- 但是,经常阅读,几乎不写,使用MyISAM
- MySQL中全文搜索<= 5.5,使用MyISAM
回答by rix0rrr
I'm not a database expert, and I do not speak from experience. However:
我不是数据库专家,也不是凭经验说话。然而:
MyISAM tables use table-level locking. Based on your traffic estimates, you have close to 200 writes per second. With MyISAM, only one of these could be in progress at any time. You have to make sure that your hardware can keep up with these transaction to avoid being overrun, i.e., a single query can take no more than 5ms.
MyISAM 表使用表级锁定。根据您的流量估计,您每秒有接近 200 次写入。使用 MyISAM,任何时候都只能进行其中之一。您必须确保您的硬件能够跟上这些事务以避免超限,即单个查询的时间不能超过 5 毫秒。
That suggests to me you would need a storage engine which supports row-level locking, i.e., InnoDB.
这向我表明您需要一个支持行级锁定的存储引擎,即 InnoDB。
On the other hand, it should be fairly trivial to write a few simple scripts to simulate the load with each storage engine, then compare the results.
另一方面,编写几个简单的脚本来模拟每个存储引擎的负载,然后比较结果应该是相当简单的。
回答by Bill Karwin
People often talk about performance, reads vs. writes, foreign keys, etc. but there's one other must-have feature for a storage engine in my opinion: atomic updates.
人们经常谈论性能、读取与写入、外键等,但在我看来,存储引擎还有另一个必备功能: 原子更新。
Try this:
尝试这个:
- Issue an UPDATE against your MyISAM table that takes 5 seconds.
- While the UPDATE is in progress, say 2.5 seconds in, hit Ctrl-C to interrupt it.
- Observe the effects on the table. How many rows were updated? How many were not updated? Is the table even readable, or was it corrupted when you hit Ctrl-C?
- Try the same experiment with UPDATE against an InnoDB table, interrupting the query in progress.
- Observe the InnoDB table. Zerorows were updated. InnoDB has assured you have atomic updates, and if the full update could not be committed, it rolls back the whole change. Also, the table is not corrupt. This works even if you use
killall -9 mysqld
to simulate a crash.
- 针对需要 5 秒的 MyISAM 表发出 UPDATE。
- 当 UPDATE 正在进行时,比如说 2.5 秒后,按 Ctrl-C 来中断它。
- 观察对表的影响。更新了多少行?有多少没更新?该表是否可读,或者当您按 Ctrl-C 时它是否已损坏?
- 对 InnoDB 表使用 UPDATE 尝试相同的实验,中断正在进行的查询。
- 观察 InnoDB 表。 更新了零行。InnoDB 保证你有原子更新,如果无法提交完整更新,它会回滚整个更改。此外,该表没有损坏。即使您
killall -9 mysqld
用来模拟崩溃,这也有效。
Performance is desirable of course, but not losing datashould trump that.
性能当然是可取的,但不丢失数据应该胜过这一点。
回答by alanc10n
I've worked on a high-volume system using MySQL and I've tried both MyISAM and InnoDB.
我曾在使用 MySQL 的大容量系统上工作过,并且尝试过 MyISAM 和 InnoDB。
I found that the table-level locking in MyISAM caused serious performance problems for our workload which sounds similar to yours. Unfortunately I also found that performance under InnoDB was also worse than I'd hoped.
我发现 MyISAM 中的表级锁定对我们的工作负载造成了严重的性能问题,这听起来与您的类似。不幸的是,我还发现 InnoDB 下的性能也比我希望的要差。
In the end I resolved the contention issue by fragmenting the data such that inserts went into a "hot" table and selects never queried the hot table.
最后,我通过分割数据解决了争用问题,这样插入进入“热”表并且选择从不查询热表。
This also allowed deletes (the data was time-sensitive and we only retained X days worth) to occur on "stale" tables that again weren't touched by select queries. InnoDB seems to have poor performance on bulk deletes so if you're planning on purging data you might want to structure it in such a way that the old data is in a stale table which can simply be dropped instead of running deletes on it.
这也允许删除(数据是时间敏感的,我们只保留 X 天的价值)发生在再次没有被选择查询触及的“陈旧”表上。InnoDB 似乎在批量删除方面的性能很差,因此如果您计划清除数据,您可能希望以这样一种方式构建它,即旧数据位于陈旧表中,可以简单地删除该表,而不是对其运行删除。
Of course I have no idea what your application is but hopefully this gives you some insight into some of the issues with MyISAM and InnoDB.
当然,我不知道您的应用程序是什么,但希望这能让您深入了解 MyISAM 和 InnoDB 的一些问题。
回答by d4nyll
A bit late to the game...but here's a quite comprehensive post I wrote a few months back, detailing the major differences between MYISAM and InnoDB. Grab a cuppa (and maybe a biscuit), and enjoy.
游戏有点晚了……但这是我几个月前写的一篇非常全面的文章,详细介绍了 MYISAM 和 InnoDB 之间的主要区别。拿一杯咖啡(也许还有饼干),享受吧。
The major difference between MyISAM and InnoDB is in referential integrity and transactions. There are also other difference such as locking, rollbacks, and full-text searches.
MyISAM 和 InnoDB 之间的主要区别在于参照完整性和事务。还有其他区别,例如锁定、回滚和全文搜索。
Referential Integrity
参照完整性
Referential integrity ensures that relationships between tables remains consistent. More specifically, this means when a table (e.g. Listings) has a foreign key (e.g. Product ID) pointing to a different table (e.g. Products), when updates or deletes occur to the pointed-to table, these changes are cascaded to the linking table. In our example, if a product is renamed, the linking table's foreign keys will also update; if a product is deleted from the ‘Products' table, any listings which point to the deleted entry will also be deleted. Furthermore, any new listing must have that foreign key pointing to a valid, existing entry.
参照完整性确保表之间的关系保持一致。更具体地说,这意味着当一个表(例如 Listings)有一个指向不同表(例如 Products)的外键(例如 Product ID)时,当指向的表发生更新或删除时,这些更改将级联到链接桌子。在我们的例子中,如果一个产品被重命名,链接表的外键也将更新;如果产品从“产品”表中删除,则指向已删除条目的任何列表也将被删除。此外,任何新列表都必须具有指向有效的现有条目的外键。
InnoDB is a relational DBMS (RDBMS) and thus has referential integrity, while MyISAM does not.
InnoDB 是关系型 DBMS (RDBMS),因此具有参照完整性,而 MyISAM 则没有。
Transactions & Atomicity
事务和原子性
Data in a table is managed using Data Manipulation Language (DML) statements, such as SELECT, INSERT, UPDATE and DELETE. A transaction group two or more DML statements together into a single unit of work, so either the entire unit is applied, or none of it is.
表中的数据使用数据操作语言 (DML) 语句进行管理,例如 SELECT、INSERT、UPDATE 和 DELETE。事务将两个或多个 DML 语句组合到一个工作单元中,因此要么应用整个单元,要么不应用任何一个单元。
MyISAM do not support transactions whereas InnoDB does.
MyISAM 不支持事务,而 InnoDB 支持。
If an operation is interrupted while using a MyISAM table, the operation is aborted immediately, and the rows (or even data within each row) that are affected remains affected, even if the operation did not go to completion.
如果在使用 MyISAM 表时操作被中断,操作会立即中止,受影响的行(甚至每行中的数据)仍然受到影响,即使操作没有完成。
If an operation is interrupted while using an InnoDB table, because it using transactions, which has atomicity, any transaction which did not go to completion will not take effect, since no commit is made.
如果操作在使用 InnoDB 表时被中断,因为它使用具有原子性的事务,任何没有完成的事务都不会生效,因为没有进行提交。
Table-locking vs Row-locking
表锁定与行锁定
When a query runs against a MyISAM table, the entire table in which it is querying will be locked. This means subsequent queries will only be executed after the current one is finished. If you are reading a large table, and/or there are frequent read and write operations, this can mean a huge backlog of queries.
当对 MyISAM 表运行查询时,它正在查询的整个表将被锁定。这意味着后续查询只会在当前查询完成后执行。如果您正在读取一个大表,和/或频繁的读写操作,这可能意味着大量的查询积压。
When a query runs against an InnoDB table, only the row(s) which are involved are locked, the rest of the table remains available for CRUD operations. This means queries can run simultaneously on the same table, provided they do not use the same row.
当对 InnoDB 表运行查询时,只有涉及的行被锁定,表的其余部分仍可用于 CRUD 操作。这意味着查询可以在同一个表上同时运行,前提是它们不使用同一行。
This feature in InnoDB is known as concurrency. As great as concurrency is, there is a major drawback that applies to a select range of tables, in that there is an overhead in switching between kernel threads, and you should set a limit on the kernel threads to prevent the server coming to a halt.
InnoDB 中的此功能称为并发。与并发性一样,有一个主要缺点适用于选定范围的表,因为在内核线程之间切换会产生开销,您应该对内核线程设置限制以防止服务器停止运行.
Transactions & Rollbacks
事务和回滚
When you run an operation in MyISAM, the changes are set; in InnoDB, those changes can be rolled back. The most common commands used to control transactions are COMMIT, ROLLBACK and SAVEPOINT. 1. COMMIT - you can write multiple DML operations, but the changes will only be saved when a COMMIT is made 2. ROLLBACK - you can discard any operations that have not yet been committed yet 3. SAVEPOINT - sets a point in the list of operations to which a ROLLBACK operation can rollback to
当您在 MyISAM 中运行操作时,会设置更改;在 InnoDB 中,这些更改可以回滚。用于控制事务的最常用命令是 COMMIT、ROLLBACK 和 SAVEPOINT。1. COMMIT - 您可以编写多个 DML 操作,但只有在进行 COMMIT 时才会保存更改 2. ROLLBACK - 您可以放弃尚未提交的任何操作 3. SAVEPOINT - 在列表中设置一个点ROLLBACK 操作可以回滚到的操作
Reliability
可靠性
MyISAM offers no data integrity - Hardware failures, unclean shutdowns and canceled operations can cause the data to become corrupt. This would require full repair or rebuilds of the indexes and tables.
MyISAM 不提供数据完整性 - 硬件故障、不正常关机和取消的操作可能导致数据损坏。这将需要完全修复或重建索引和表。
InnoDB, on the other hand, uses a transactional log, a double-write buffer and automatic checksumming and validation to prevent corruption. Before InnoDB makes any changes, it records the data before the transactions into a system tablespace file called ibdata1. If there is a crash, InnoDB would autorecover through the replay of those logs.
另一方面,InnoDB 使用事务日志、双写缓冲区和自动校验和验证来防止损坏。在 InnoDB 进行任何更改之前,它将事务之前的数据记录到名为 ibdata1 的系统表空间文件中。如果发生崩溃,InnoDB 将通过重播这些日志自动恢复。
FULLTEXT Indexing
全文索引
InnoDB does not support FULLTEXT indexing until MySQL version 5.6.4. As of the writing of this post, many shared hosting providers' MySQL version is still below 5.6.4, which means FULLTEXT indexing is not supported for InnoDB tables.
InnoDB 在 MySQL 5.6.4 版本之前不支持 FULLTEXT 索引。在撰写本文时,许多共享主机提供商的 MySQL 版本仍然低于 5.6.4,这意味着 InnoDB 表不支持 FULLTEXT 索引。
However, this is not a valid reason to use MyISAM. It's best to change to a hosting provider that supports up-to-date versions of MySQL. Not that a MyISAM table that uses FULLTEXT indexing cannot be converted to an InnoDB table.
但是,这不是使用 MyISAM 的正当理由。最好更改为支持 MySQL 最新版本的托管服务提供商。并不是说使用 FULLTEXT 索引的 MyISAM 表不能转换为 InnoDB 表。
Conclusion
结论
In conclusion, InnoDB should be your default storage engine of choice. Choose MyISAM or other data types when they serve a specific need.
总之,InnoDB 应该是您选择的默认存储引擎。在满足特定需求时选择 MyISAM 或其他数据类型。
回答by staticsan
For a load with more writes and reads, you will benefit from InnoDB. Because InnoDB provides row-locking rather than table-locking, your SELECT
s can be concurrent, not just with each other but also with many INSERT
s. However, unless you are intending to use SQL transactions, set the InnoDB commit flush to 2 (innodb_flush_log_at_trx_commit). This gives you back a lot of raw performance that you would otherwise lose when moving tables from MyISAM to InnoDB.
对于具有更多写入和读取的负载,您将受益于 InnoDB。因为 InnoDB 提供行锁定而不是表锁定,所以您的SELECT
s 可以并发,不仅可以相互并发,也可以与许多INSERT
s并发。但是,除非您打算使用 SQL 事务,否则请将 InnoDB 提交刷新设置为 2 ( innodb_flush_log_at_trx_commit)。这为您提供了许多原始性能,否则您将在将表从 MyISAM 移动到 InnoDB 时会失去这些性能。
Also, consider adding replication. This gives you some read scaling and since you stated your reads don't have to be up-to-date, you can let the replication fall behind a little. Just be sure that it can catch up under anything but the heaviest traffic or it will always be behind and will never catch up. If you go this way, however, I stronglyrecommend you isolate reading from the slaves and replication lag management to your database handler. It is so much simpler if the application code does not know about this.
另外,请考虑添加复制。这为您提供了一些读取扩展,并且由于您声明您的读取不必是最新的,您可以让复制稍微落后一点。只要确保它可以在除最繁忙的交通之外的任何情况下赶上,否则它将永远落后并且永远赶不上。但是,如果您采用这种方式,我强烈建议您将读取与从站和复制延迟管理隔离到您的数据库处理程序。如果应用程序代码不知道这一点,那就简单多了。
Finally, be aware of different table loads. You will not have the same read/write ratio on all tables. Some smaller tables with near 100% reads could afford to stay MyISAM. Likewise, if you have some tables that are near 100% write, you may benefit from INSERT DELAYED
, but that is only supported in MyISAM (the DELAYED
clause is ignored for an InnoDB table).
最后,注意不同的表负载。您不会对所有表具有相同的读/写比率。一些接近 100% 读取的较小表可以负担得起保持 MyISAM。同样,如果您有一些接近 100% 写入的表,您可能会受益于INSERT DELAYED
,但这仅在 MyISAM 中受支持(DELAYED
InnoDB 表忽略该子句)。
But benchmark to be sure.
但要确定基准。
回答by StackG
To add to the wide selection of responses here covering the mechanical differences between the two engines, I present an empirical speed comparison study.
为了增加这里涵盖两种发动机之间机械差异的广泛响应选择,我提出了一项经验速度比较研究。
In terms of pure speed, it is not always the case that MyISAM is faster than InnoDB but in my experience it tends to be faster for PURE READ working environments by a factor of about 2.0-2.5 times. Clearly this isn't appropriate for all environments - as others have written, MyISAM lacks such things as transactions and foreign keys.
在纯速度方面,MyISAM 并不总是比 InnoDB 快,但根据我的经验,它在 PURE READ 工作环境中往往快 2.0-2.5 倍。显然这并不适用于所有环境——正如其他人所写的那样,MyISAM 缺少诸如事务和外键之类的东西。
I've done a bit of benchmarking below - I've used python for looping and the timeit library for timing comparisons. For interest I've also included the memory engine, this gives the best performance across the board although it is only suitable for smaller tables (you continually encounter The table 'tbl' is full
when you exceed the MySQL memory limit). The four types of select I look at are:
我在下面做了一些基准测试 - 我使用 python 进行循环,使用 timeit 库进行时间比较。出于兴趣,我还包括了内存引擎,尽管它仅适用于较小的表(The table 'tbl' is full
当您超过 MySQL 内存限制时会不断遇到),但它提供了全面的最佳性能。我看到的四种选择类型是:
- vanilla SELECTs
- counts
- conditional SELECTs
- indexed and non-indexed sub-selects
- 香草选择
- 计数
- 条件选择
- 索引和非索引子选择
Firstly, I created three tables using the following SQL
首先,我使用以下 SQL 创建了三个表
CREATE TABLE
data_interrogation.test_table_myisam
(
index_col BIGINT NOT NULL AUTO_INCREMENT,
value1 DOUBLE,
value2 DOUBLE,
value3 DOUBLE,
value4 DOUBLE,
PRIMARY KEY (index_col)
)
ENGINE=MyISAM DEFAULT CHARSET=utf8
with 'MyISAM' substituted for 'InnoDB' and 'memory' in the second and third tables.
在第二个和第三个表中用“MyISAM”代替“InnoDB”和“memory”。
1) Vanilla selects
1)香草选择
Query: SELECT * FROM tbl WHERE index_col = xx
询问: SELECT * FROM tbl WHERE index_col = xx
Result: draw
结果:平局
The speed of these is all broadly the same, and as expected is linear in the number of columns to be selected. InnoDB seems slightlyfaster than MyISAM but this is really marginal.
这些的速度大致相同,并且正如预期的那样与要选择的列数呈线性关系。InnoDB 似乎比 MyISAM稍快,但这确实是微不足道的。
Code:
代码:
import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint
db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()
lengthOfTable = 100000
# Fill up the tables with random data
for x in xrange(lengthOfTable):
rand1 = random.random()
rand2 = random.random()
rand3 = random.random()
rand4 = random.random()
insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
cur.execute(insertString)
cur.execute(insertString2)
cur.execute(insertString3)
db.commit()
# Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):
for x in xrange(numberOfRecords):
rand1 = randint(0,lengthOfTable)
selectString = "SELECT * FROM " + testTable + " WHERE index_col = " + str(rand1)
cur.execute(selectString)
setupString = "from __main__ import selectRandomRecords"
# Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []
for theLength in [3,10,30,100,300,1000,3000,10000]:
innodb_times.append( timeit.timeit('selectRandomRecords("test_table_innodb",' + str(theLength) + ')', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('selectRandomRecords("test_table_myisam",' + str(theLength) + ')', number=100, setup=setupString) )
memory_times.append( timeit.timeit('selectRandomRecords("test_table_memory",' + str(theLength) + ')', number=100, setup=setupString) )
2) Counts
2) 计数
Query: SELECT count(*) FROM tbl
询问: SELECT count(*) FROM tbl
Result: MyISAM wins
结果:MyISAM 获胜
This one demonstrates a big difference between MyISAM and InnoDB - MyISAM (and memory) keeps track of the number of records in the table, so this transaction is fast and O(1). The amount of time required for InnoDB to count increases super-linearly with table size in the range I investigated. I suspect many of the speed-ups from MyISAM queries that are observed in practice are due to similar effects.
这个展示了 MyISAM 和 InnoDB 之间的一个很大的区别——MyISAM(和内存)跟踪表中的记录数,所以这个事务很快并且 O(1)。InnoDB 计算所需的时间量随我调查的范围内的表大小超线性增加。我怀疑在实践中观察到的 MyISAM 查询的许多加速是由于类似的效果。
Code:
代码:
myisam_times = []
innodb_times = []
memory_times = []
# Define a function to count the records
def countRecords(testTable):
selectString = "SELECT count(*) FROM " + testTable
cur.execute(selectString)
setupString = "from __main__ import countRecords"
# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"
truncateString3 = "TRUNCATE test_table_memory"
cur.execute(truncateString)
cur.execute(truncateString2)
cur.execute(truncateString3)
for x in xrange(theLength):
rand1 = random.random()
rand2 = random.random()
rand3 = random.random()
rand4 = random.random()
insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
cur.execute(insertString)
cur.execute(insertString2)
cur.execute(insertString3)
db.commit()
# Count and time the query
innodb_times.append( timeit.timeit('countRecords("test_table_innodb")', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('countRecords("test_table_myisam")', number=100, setup=setupString) )
memory_times.append( timeit.timeit('countRecords("test_table_memory")', number=100, setup=setupString) )
3) Conditional selects
3) 条件选择
Query: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5
询问: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5
Result: MyISAM wins
结果:MyISAM 获胜
Here, MyISAM and memory perform approximately the same, and beat InnoDB by about 50% for larger tables. This is the sort of query for which the benefits of MyISAM seem to be maximised.
在这里,MyISAM 和内存的性能大致相同,对于较大的表,其性能比 InnoDB 高出约 50%。这种查询似乎可以最大限度地发挥 MyISAM 的优势。
Code:
代码:
myisam_times = []
innodb_times = []
memory_times = []
# Define a function to perform conditional selects
def conditionalSelect(testTable):
selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
cur.execute(selectString)
setupString = "from __main__ import conditionalSelect"
# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"
truncateString3 = "TRUNCATE test_table_memory"
cur.execute(truncateString)
cur.execute(truncateString2)
cur.execute(truncateString3)
for x in xrange(theLength):
rand1 = random.random()
rand2 = random.random()
rand3 = random.random()
rand4 = random.random()
insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
cur.execute(insertString)
cur.execute(insertString2)
cur.execute(insertString3)
db.commit()
# Count and time the query
innodb_times.append( timeit.timeit('conditionalSelect("test_table_innodb")', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('conditionalSelect("test_table_myisam")', number=100, setup=setupString) )
memory_times.append( timeit.timeit('conditionalSelect("test_table_memory")', number=100, setup=setupString) )
4) Sub-selects
4) 子选择
Result: InnoDB wins
结果:InnoDB 获胜
For this query, I created an additional set of tables for the sub-select. Each is simply two columns of BIGINTs, one with a primary key index and one without any index. Due to the large table size, I didn't test the memory engine. The SQL table creation command was
对于这个查询,我为子选择创建了一组额外的表。每个都是简单的两列 BIGINT,一列有主键索引,另一列没有任何索引。由于表很大,我没有测试内存引擎。SQL 表创建命令是
CREATE TABLE
subselect_myisam
(
index_col bigint NOT NULL,
non_index_col bigint,
PRIMARY KEY (index_col)
)
ENGINE=MyISAM DEFAULT CHARSET=utf8;
where once again, 'MyISAM' is substituted for 'InnoDB' in the second table.
再次,'MyISAM' 替换为第二个表中的 'InnoDB'。
In this query, I leave the size of the selection table at 1000000 and instead vary the size of the sub-selected columns.
在此查询中,我将选择表的大小保留为 1000000,而是改变子选定列的大小。
Here the InnoDB wins easily. After we get to a reasonable size table both engines scale linearly with the size of the sub-select. The index speeds up the MyISAM command but interestingly has little effect on the InnoDB speed. subSelect.png
InnoDB 在这里轻松获胜。在我们得到一个合理的大小表后,两个引擎都随着子选择的大小线性扩展。索引加速了 MyISAM 命令,但有趣的是对 InnoDB 速度影响不大。子选择.png
Code:
代码:
myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []
def subSelectRecordsIndexed(testTable,testSubSelect):
selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
cur.execute(selectString)
setupString = "from __main__ import subSelectRecordsIndexed"
def subSelectRecordsNotIndexed(testTable,testSubSelect):
selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
cur.execute(selectString)
setupString2 = "from __main__ import subSelectRecordsNotIndexed"
# Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"
cur.execute(truncateString)
cur.execute(truncateString2)
lengthOfTable = 1000000
# Fill up the tables with random data
for x in xrange(lengthOfTable):
rand1 = random.random()
rand2 = random.random()
rand3 = random.random()
rand4 = random.random()
insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
cur.execute(insertString)
cur.execute(insertString2)
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = "TRUNCATE subselect_innodb"
truncateString2 = "TRUNCATE subselect_myisam"
cur.execute(truncateString)
cur.execute(truncateString2)
# For each length, empty the table and re-fill it with random data
rand_sample = sorted(random.sample(xrange(lengthOfTable), theLength))
rand_sample_2 = random.sample(xrange(lengthOfTable), theLength)
for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
insertString = "INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
insertString2 = "INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
cur.execute(insertString)
cur.execute(insertString2)
db.commit()
# Finally, time the queries
innodb_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString) )
innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString2) )
myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString2) )
I think the take-home message of all of this is that if you are reallyconcerned about speed, you need to benchmark the queries that you're doing rather than make any assumptions about which engine will be more suitable.
我认为所有这些的实际信息是,如果您真的关心速度,则需要对正在执行的查询进行基准测试,而不是对哪种引擎更合适做出任何假设。
回答by Patrick Savalle
Slightly off-topic, but for documentation purposes and completeness, I would like to add the following.
稍微偏离主题,但出于文档目的和完整性,我想添加以下内容。
In general using InnoDB will result in a much LESS complex application, probably also more bug-free. Because you can put all referential integrity (Foreign Key-constraints) into the datamodel, you don't need anywhere near as much application code as you will need with MyISAM.
一般来说,使用 InnoDB 会导致应用程序复杂得多,也可能更没有错误。因为您可以将所有参照完整性(外键约束)放入数据模型中,所以您不需要像使用 MyISAM 那样多的应用程序代码。
Every time you insert, delete or replace a record, you will HAVE to check and maintain the relationships. E.g. if you delete a parent, all children should be deleted too. For instance, even in a simple blogging system, if you delete a blogposting record, you will have to delete the comment records, the likes, etc. In InnoDB this is done automatically by the database engine (if you specified the contraints in the model) and requires no application code. In MyISAM this will have to be coded into the application, which is very difficult in web-servers. Web-servers are by nature very concurrent / parallel and because these actions should be atomical and MyISAM supports no real transactions, using MyISAM for web-servers is risky / error-prone.
每次插入、删除或替换记录时,您都必须检查和维护关系。例如,如果您删除父项,则所有子项也应被删除。例如,即使在一个简单的博客系统中,如果您删除了一条博客记录,您将不得不删除评论记录、点赞等。在 InnoDB 中,这是由数据库引擎自动完成的(如果您在模型中指定了约束) ) 并且不需要应用程序代码。在 MyISAM 中,这必须编码到应用程序中,这在 Web 服务器中非常困难。Web 服务器本质上是非常并发/并行的,并且因为这些操作应该是原子的并且 MyISAM 不支持真正的事务,所以将 MyISAM 用于 Web 服务器是有风险的/容易出错的。
Also in most general cases, InnoDB will perform much better, for a multiple of reasons, one them being able to use record level locking as opposed to table-level locking. Not only in a situation where writes are more frequent than reads, also in situations with complex joins on large datasets. We noticed a 3 fold performance increase just by using InnoDB tables over MyISAM tables for very large joins (taking several minutes).
同样在大多数一般情况下,InnoDB 会表现得更好,原因有很多,其中一个原因是能够使用记录级锁定而不是表级锁定。不仅在写入比读取更频繁的情况下,在大型数据集上具有复杂连接的情况下也是如此。我们注意到,对于非常大的连接(需要几分钟),仅使用 InnoDB 表而不是 MyISAM 表,性能就提高了 3 倍。
I would say that in general InnoDB (using a 3NF datamodel complete with referential integrity) should be the default choice when using MySQL. MyISAM should only be used in very specific cases. It will most likely perform less, result in a bigger and more buggy application.
我会说一般来说 InnoDB(使用具有参照完整性的 3NF 数据模型)应该是使用 MySQL 时的默认选择。MyISAM 应该只在非常特殊的情况下使用。它很可能执行得更少,导致应用程序更大、错误更多。
Having said this. Datamodelling is an art seldom found among webdesigners / -programmers. No offence, but it does explain MyISAM being used so much.
说到这。数据建模是一门在网页设计师/程序员中很少见到的艺术。无意冒犯,但它确实解释了 MyISAM 被如此频繁地使用。
回答by Pankaj Khurana
InnoDB offers:
InnoDB 提供:
ACID transactions
row-level locking
foreign key constraints
automatic crash recovery
table compression (read/write)
spatial data types (no spatial indexes)
In InnoDB all data in a row except for TEXT and BLOB can occupy 8,000 bytes at most. No full text indexing is available for InnoDB. In InnoDB the COUNT(*)s (when WHERE, GROUP BY, or JOIN is not used) execute slower than in MyISAM because the row count is not stored internally. InnoDB stores both data and indexes in one file. InnoDB uses a buffer pool to cache both data and indexes.
在 InnoDB 中,一行中除 TEXT 和 BLOB 之外的所有数据最多可占用 8,000 字节。InnoDB 没有全文索引可用。在 InnoDB 中,COUNT(*)s(当不使用 WHERE、GROUP BY 或 JOIN 时)执行速度比在 MyISAM 中慢,因为行计数不在内部存储。InnoDB 将数据和索引存储在一个文件中。InnoDB 使用缓冲池来缓存数据和索引。
MyISAM offers:
MyISAM 提供:
fast COUNT(*)s (when WHERE, GROUP BY, or JOIN is not used)
full text indexing
smaller disk footprint
very high table compression (read only)
spatial data types and indexes (R-tree)
MyISAM has table-level locking, but no row-level locking. No transactions. No automatic crash recovery, but it does offer repair table functionality. No foreign key constraints. MyISAM tables are generally more compact in size on disk when compared to InnoDB tables. MyISAM tables could be further highly reduced in size by compressing with myisampack if needed, but become read-only. MyISAM stores indexes in one file and data in another. MyISAM uses key buffers for caching indexes and leaves the data caching management to the operating system.
MyISAM 有表级锁,但没有行级锁。没有交易。没有自动崩溃恢复,但它确实提供了修复表功能。没有外键约束。与 InnoDB 表相比,MyISAM 表通常在磁盘上的大小更紧凑。如果需要,可以通过使用 myisampack 进行压缩,进一步大大减小 MyISAM 表的大小,但变为只读。MyISAM 将索引存储在一个文件中,将数据存储在另一个文件中。MyISAM 使用关键缓冲区来缓存索引,并将数据缓存管理留给操作系统。
Overall I would recommend InnoDB for most purposes and MyISAM for specialized uses only. InnoDB is now the default engine in new MySQL versions.
总的来说,我会推荐 InnoDB 用于大多数用途,而 MyISAM 仅用于特殊用途。InnoDB 现在是新 MySQL 版本中的默认引擎。
回答by MarkR
If you use MyISAM, you won't be doing anytransactions per hour, unless you consider each DML statement to be a transaction (which in any case, won't be durable or atomic in the event of a crash).
如果您使用 MyISAM,您将不会每小时执行任何事务,除非您将每个 DML 语句视为一个事务(在任何情况下,在发生崩溃时都不是持久或原子的)。
Therefore I think you have to use InnoDB.
因此我认为您必须使用 InnoDB。
300 transactions per second sounds like quite a lot. If you absolutely need these transactions to be durable across power failure make sure your I/O subsystem can handle this many writes per second easily. You will need at least a RAID controller with battery backed cache.
每秒 300 次交易听起来很多。如果您绝对需要这些事务在断电时保持持久性,请确保您的 I/O 子系统可以轻松处理每秒如此多的写入。您至少需要一个带有电池备份缓存的 RAID 控制器。
If you can take a small durability hit, you could use InnoDB with innodb_flush_log_at_trx_commit set to 0 or 2 (see docs for details), you can improve performance.
如果您可以降低持久性,则可以使用 InnoDB 并将 innodb_flush_log_at_trx_commit 设置为 0 或 2(有关详细信息,请参阅文档),您可以提高性能。
There are a number of patches which can increase concurrency from Google and others - these may be of interest if you still can't get enough performance without them.
有许多补丁可以提高 Google 和其他公司的并发性 - 如果没有它们您仍然无法获得足够的性能,这些补丁可能会很有趣。