Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/7232704/

Time: 2020-08-31 20:57:34  Source: igfitidea

How to check if given data exists in multiple tables (all of which have the same column)?

mysql sql

Asked by Ted

I have 3 tables, each consisting of a column called username. On the registration part, I need to check that the requested username is new and unique.

I need a single SQL statement that will tell me whether that user exists in any of these tables before I proceed. I tried:

SELECT tbl1.username, tbl2.username, tbl3.username
FROM tbl1,tbl2,tbl3
WHERE tbl1.username = {$username}
   OR tbl2.username = {$username}
   OR tbl3.username ={$username}

Is that the way to go?

Answer by RedFilter

-- Returns one row per match across the three tables; an empty result means the username is unused.
select 1
from (
    select username from tbl1
    union all
    select username from tbl2
    union all
    select username from tbl3
) a
where username = 'someuser'
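
As a minimal sketch of running the query above from application code, the following uses Python's built-in sqlite3 module as a stand-in for a MySQL driver (an assumption; the question itself appears to use PHP-style string interpolation). The table names tbl1, tbl2 and tbl3 come from the question; the helper names make_demo_db and username_taken, the sample row, and the added LIMIT 1 are hypothetical. The username is bound as a parameter instead of being pasted into the SQL text, which avoids quoting problems and SQL injection.

```python
import sqlite3

TABLES = ("tbl1", "tbl2", "tbl3")  # table names taken from the question


def make_demo_db():
    """Create an in-memory database with three tables sharing a username column."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(f"CREATE TABLE {table} (username TEXT)")
    # Hypothetical sample row so that one lookup succeeds.
    conn.execute("INSERT INTO tbl2 (username) VALUES ('alice')")
    return conn


def username_taken(conn, username):
    """Return True if the username appears in any of the three tables."""
    sql = """
        SELECT 1
        FROM (
            SELECT username FROM tbl1
            UNION ALL
            SELECT username FROM tbl2
            UNION ALL
            SELECT username FROM tbl3
        ) a
        WHERE username = ?
        LIMIT 1
    """
    # The username is passed separately from the SQL text (parameter binding).
    return conn.execute(sql, (username,)).fetchone() is not None
```

With sqlite3 the placeholder is ?; MySQL drivers for Python commonly use %s instead, so the placeholder style would need adjusting for a real MySQL connection.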

Answer by Michael Hays

In the event you honestly just want to know if a user exists:

The quickest approach is an existence query:

-- Returns 1 if the username is absent from all three tables, 0 otherwise.
select
NOT EXISTS (select username from a where username = {$username}) AND
NOT EXISTS (select username from b where username = {$username}) AND
NOT EXISTS (select username from c where username = {$username});

If your username column is marked as Unique in each table, this should be the most efficient query you can write for this operation. It will outperform a normalized username table in terms of memory usage and, well, in virtually any other query that cares about username and another column, as there are no excessive joins. If you've ever been called on to speed up an organization's database, I can assure you that over-normalization is a nightmare. Be wary of the advice you've received on normalization in this thread. Normalization is great for limiting space, or for limiting the number of places you have to update data, but you have to weigh that against the maintenance and speed overhead. Take the advice given to you on this page with a grain of salt.
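
The existence query described above might be called from application code roughly as follows. This is only a sketch: it uses Python's sqlite3 module in place of MySQL (an assumption), keeps the answer's table names a, b and c, and introduces the hypothetical helpers make_demo_db and username_available. Each username column is declared UNIQUE, matching the condition the answer relies on.

```python
import sqlite3


def make_demo_db():
    """Three tables with a UNIQUE username column, as the answer assumes."""
    conn = sqlite3.connect(":memory:")
    for table in ("a", "b", "c"):
        conn.execute(f"CREATE TABLE {table} (username TEXT UNIQUE)")
    conn.execute("INSERT INTO b (username) VALUES ('alice')")  # hypothetical sample row
    return conn


def username_available(conn, username):
    """Return True when the username is absent from all three tables."""
    sql = """
        SELECT
            NOT EXISTS (SELECT username FROM a WHERE username = :name) AND
            NOT EXISTS (SELECT username FROM b WHERE username = :name) AND
            NOT EXISTS (SELECT username FROM c WHERE username = :name)
    """
    # The query yields a single row holding 1 (free) or 0 (taken).
    (free,) = conn.execute(sql, {"name": username}).fetchone()
    return bool(free)
```

Because each subquery is an equality lookup on a uniquely indexed column, the database can typically stop at the first match in each table instead of scanning it.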

Get used to running a query analyzer on your queries, if for no other reason than to get in the habit of learning the ramifications of choices when writing queries -- at least until you get your sea legs.
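
As one illustration of that habit, MySQL has an EXPLAIN statement that reports how a query will be executed. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in (an assumption; the output format differs from MySQL's), and compares a lookup on a UNIQUE column with the same lookup on an unindexed copy. The table a_plain and the helper explain are hypothetical.

```python
import sqlite3


def explain(conn, sql, params=()):
    """Return the plan descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (username TEXT UNIQUE)")  # UNIQUE creates an index
conn.execute("CREATE TABLE a_plain (username TEXT)")   # same column, no index

indexed_plan = explain(conn, "SELECT username FROM a WHERE username = ?", ("someuser",))
scan_plan = explain(conn, "SELECT username FROM a_plain WHERE username = ?", ("someuser",))

# Print both plans: the first should mention an index search, the second a full scan.
for step in indexed_plan + scan_plan:
    print(step)
```

The exact wording of the plan varies between SQLite versions, so treat the printed text as something to read, not to parse.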

In the event you want to insert a user later:

If you are doing this for the purpose of eventually adding the user to the database, here is a better approach, and it's worth it to learn it. Attempt to insert the value immediately. Check afterwards to see if it was successful. This way there is no room for some other database call to insert a record in between the time you've checked and the time you inserted into the database. For instance, in MySQL you might do this:

INSERT INTO {$table} (`username`, ... )
  SELECT {$username} as `username`, ... FROM DUAL 
   WHERE 
     NOT EXISTS (select username from a where username = {$username}) AND 
     NOT EXISTS (select username from b where username = {$username}) AND 
     NOT EXISTS (select username from c where username = {$username});

All database APIs I've seen, as well as all SQL implementations, will provide you a way to discover how many rows were inserted. If it's 1, then the username didn't exist and the insertion was successful. In this case, I don't know your dialect, so I've chosen MySQL, which provides a DUAL table specifically for returning results that aren't bound to a table. Honestly, though, there are many ways to skin this cat, whether you put it in a transaction or a stored procedure, or strictly limit the processes and procedures that can access these tables.
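
The insert-then-check pattern could look roughly like this from application code. It is a sketch under stated assumptions: Python's sqlite3 module stands in for MySQL, so FROM DUAL is omitted (SQLite allows a SELECT with no FROM clause), the new user is always inserted into table a, and the helper names make_demo_db and try_register are hypothetical. The row count reported by the cursor tells the caller whether the insert happened.

```python
import sqlite3


def make_demo_db():
    """Three tables with a UNIQUE username column; names a, b, c follow the answer."""
    conn = sqlite3.connect(":memory:")
    for table in ("a", "b", "c"):
        conn.execute(f"CREATE TABLE {table} (username TEXT UNIQUE)")
    conn.execute("INSERT INTO c (username) VALUES ('alice')")  # hypothetical sample row
    conn.commit()
    return conn


def try_register(conn, username):
    """Insert into table a only if the name is absent from a, b and c.

    Returns True when exactly one row was inserted.
    """
    sql = """
        INSERT INTO a (username)
        SELECT :name
        WHERE
            NOT EXISTS (SELECT username FROM a WHERE username = :name) AND
            NOT EXISTS (SELECT username FROM b WHERE username = :name) AND
            NOT EXISTS (SELECT username FROM c WHERE username = :name)
    """
    cur = conn.execute(sql, {"name": username})
    conn.commit()
    return cur.rowcount == 1  # 1 row inserted means the name was free
```

Calling try_register twice with the same new name should succeed the first time and report failure the second time, since the second statement finds the row inserted by the first.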

Update -- How to handle users who don't complete the sign up process

As @RedFilter points out, if registration is done in multiple steps (reserving a username, filling out details, perhaps answering an email confirmation), then you will want to at least add a column to flag this user (with a timestamp, not a boolean) so that you can periodically remove users after some time period. I recommend instead creating a ToBePurged table and adding new users to it, along with a timestamp. When the confirmation comes through, you remove the user from this table. Periodically you check this table for all entries older than some delta from the current time and simply delete them from whichever table they were originally added to. My philosophy behind this is to define the responsibility of each table more clearly and to keep the number of records you are working with very lean. We certainly don't want to over-engineer our solutions, but if you get into the habit of good architectural practices, these designs will flow out as naturally as their less efficient counterparts.
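
A minimal sketch of the ToBePurged idea follows, again using Python's sqlite3 module as a stand-in for MySQL. Only the ToBePurged table name comes from the answer; the users table, the reserved_at column, and the helpers reserve, confirm and purge_expired are hypothetical, and timestamps are passed in as plain numbers to keep the example deterministic.

```python
import sqlite3


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT UNIQUE)")
    # One row per unconfirmed registration, with the time the name was reserved.
    conn.execute("CREATE TABLE ToBePurged (username TEXT UNIQUE, reserved_at REAL)")
    return conn


def reserve(conn, username, now):
    """Reserve a username and record it as pending confirmation."""
    conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
    conn.execute(
        "INSERT INTO ToBePurged (username, reserved_at) VALUES (?, ?)",
        (username, now),
    )


def confirm(conn, username):
    """Confirmation arrived: the user no longer needs purging."""
    conn.execute("DELETE FROM ToBePurged WHERE username = ?", (username,))


def purge_expired(conn, now, max_age_seconds):
    """Delete unconfirmed users older than max_age_seconds; return how many were purged."""
    cutoff = now - max_age_seconds
    conn.execute(
        "DELETE FROM users WHERE username IN "
        "(SELECT username FROM ToBePurged WHERE reserved_at < ?)",
        (cutoff,),
    )
    cur = conn.execute("DELETE FROM ToBePurged WHERE reserved_at < ?", (cutoff,))
    return cur.rowcount
```

In a real system the two deletes in purge_expired would normally run inside one transaction so that a failure cannot leave the tables out of step.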

Answer by Steve Wellens

No. Two processes could run your test at the same time and both would report no user and then both could insert the same user.

It sounds like you need a single table to hold ALL the users with a unique index to prevent duplicates. This master table could link to 'sub-tables' using a user ID, not user name.
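
To illustrate how a unique index closes the race described above, here is a small sketch using Python's sqlite3 module in place of MySQL (an assumption). The users table, the index name idx_users_username and the register helper are hypothetical. The second insert of the same name fails at the database level, regardless of what any earlier check reported.

```python
import sqlite3


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
    # The unique index is what actually prevents duplicate usernames.
    conn.execute("CREATE UNIQUE INDEX idx_users_username ON users (username)")
    return conn


def register(conn, username):
    """Try to insert the user; return the new user ID, or None if the name is taken."""
    try:
        cur = conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        return None  # the unique index rejected a duplicate
    return cur.lastrowid
```

The returned ID is the value the 'sub-tables' would store to link back to the master table.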

Answer by Johan

1- You need to normalize your tables

See: http://databases.about.com/od/specificproducts/a/normalization.htm

2- Don't use implicit SQL '89 joins.

Kick the habit and use explicit joins

SELECT a.field1, b.field2, c.field3
FROM a
INNER JOIN b ON (a.id = b.a_id)  -- JOIN criteria go here
INNER JOIN c ON (b.id = c.b_id)  -- and here, nice and explicit.
WHERE ... -- filter criteria go here.

Answer by James

Given the collation stuff, you could do this instead, if you don't want to deal with the collation mismatch:

select sum(usercount) as usercount
from (
    select count(*) as usercount from tbl1 where username = 'someuser'
    union all
    select count(*) as usercount from tbl2 where username = 'someuser'
    union all
    select count(*) as usercount from tbl3 where username = 'someuser'
) as usercounts

If you get 0, there isn't a user with that username; if you get something higher, there is.
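
For completeness, the counting query might be wrapped like this. The sketch uses Python's sqlite3 module as a stand-in for MySQL (an assumption) and a hypothetical helper count_username; the table names and the sample value 'someuser' come from the answer.

```python
import sqlite3


def count_username(conn, username):
    """Return how many rows across tbl1, tbl2 and tbl3 carry this username."""
    sql = """
        SELECT SUM(usercount) AS usercount
        FROM (
            SELECT COUNT(*) AS usercount FROM tbl1 WHERE username = :name
            UNION ALL
            SELECT COUNT(*) AS usercount FROM tbl2 WHERE username = :name
            UNION ALL
            SELECT COUNT(*) AS usercount FROM tbl3 WHERE username = :name
        ) AS usercounts
    """
    (total,) = conn.execute(sql, {"name": username}).fetchone()
    return total


conn = sqlite3.connect(":memory:")
for table in ("tbl1", "tbl2", "tbl3"):
    conn.execute(f"CREATE TABLE {table} (username TEXT)")
# Hypothetical sample data: the same name stored in two of the three tables.
conn.execute("INSERT INTO tbl1 (username) VALUES ('someuser')")
conn.execute("INSERT INTO tbl3 (username) VALUES ('someuser')")
```

Each inner COUNT(*) always yields exactly one row, so the outer SUM is a number even when no table contains the name.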

Note: Depending on how you do the insert, you could in theory get more than one user with the same username due to race conditions (see other comments about normalisation and unique keys).

Answer by John Humphreys - w00te

With your current setup, RedFilter's answer should work fine. I thought it would be worth noting, though, that you shouldn't have redundant or dispersed data in your database to begin with.

You should have one and only one place to store any specific data - so in your case, instead of having a username in 3 different tables, you should have one table with username and a primary key identifier for those usernames. Your other 3 tables should then foreign-key reference the username table. You'll be able to construct much simpler and more efficient queries with this layout. You're opening a can of worms by replicating data in various locations.
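
One possible shape for that layout is sketched below with Python's sqlite3 module standing in for MySQL (an assumption). The users table, the user_id and detail columns, and the helpers make_normalized_db and username_exists are hypothetical; tbl1, tbl2 and tbl3 are the question's tables, now holding a foreign key instead of a copy of the username.

```python
import sqlite3


def make_normalized_db():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL UNIQUE)"
    )
    for table in ("tbl1", "tbl2", "tbl3"):
        conn.execute(
            f"CREATE TABLE {table} ("
            "user_id INTEGER NOT NULL REFERENCES users (id), "
            "detail TEXT)"
        )
    return conn


def username_exists(conn, username):
    """With one username table, the existence check is a single indexed lookup."""
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row is not None
```

With this layout the check no longer needs a UNION or three subqueries, and the foreign keys stop the other tables from referring to users that do not exist.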
