MySQL SQL:选择一个表中不存在的键

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/6731797/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 20:31:25  来源:igfitidea点击:

SQL: Select Keys that doesn't exist in one table

mysqlsqlselect

提问by Thomas

I got a table with a normal setup of auto inc. ids. Some of the rows have been deleted so the ID list could look something like this:

我得到了一张具有正常自动设置的桌子。身。一些行已被删除,因此 ID 列表可能如下所示:

(1, 2, 3, 5, 8, ...)

(1, 2, 3, 5, 8, ...)

Then, from another source (Edit: Another source = NOT in a database) I have this array:

然后,从另一个来源(编辑:另一个来源 = 不在数据库中)我有这个数组:

(1, 3, 4, 5, 7, 8)

(1, 3, 4, 5, 7, 8)

I'm looking for a query I can use on the database to get the list of ID:s NOT in the table from the array I have. Which would be:

我正在寻找一个查询,我可以在数据库上使用它来从我拥有的数组中获取不在表中的 ID:s 列表。这将是:

(4, 7)

(4, 7)

Does such exist? My solution right now is either creating a temporary table so the command "WHERE table.id IS NULL" works, or probably worse, using the PHP function array_diff to see what's missing after having retrieved all the ids from table.

这样的存在吗?我现在的解决方案是创建一个临时表,以便命令“WHERE table.id IS NULL”起作用,或者可能更糟,使用 PHP 函数 array_diff 查看从表中检索所有 id 后丢失的内容。

Since the list of ids are closing in on millions or rows I'm eager to find the best solution.

由于 id 列表接近数百万或行,我渴望找到最佳解决方案。

Thank you! /Thomas

谢谢!/托马斯

Edit 2:

编辑2:

My main application is a rather easy table which is populated by a lot of rows. This application is administrated using a browser and I'm using PHP as the intepreter for the code.

我的主要应用程序是一个相当简单的表,它由很多行填充。这个应用程序是使用浏览器管理的,我使用 PHP 作为代码的解释器。

Everything in this table is to be exported to another system (which is 3rd party product) and there's yet no way of doing this besides manually using the import function in that program. There's also possible to insert new rows in the other system, although the agreed routing is to never ever do this.

此表中的所有内容都将导出到另一个系统(这是第 3 方产品),除了手动使用该程序中的导入功能之外,还没有办法做到这一点。也可以在另一个系统中插入新行,尽管商定的路由永远不会这样做。

The problem is then that my system cannot be 100 % sure that the user did everything correct from when he/she pressed the "export" key. Or, that no rows has ever been created in the other system.

问题是我的系统不能 100% 确定用户从他/她按下“导出”键开始做的所有事情都是正确的。或者,其他系统中从未创建过任何行。

From the other system I can get a CSV-file out where all the rows that system has. So, by comparing the CSV file and my table I can see if: * There are any rows missing in the other system that should have been imported * If someone has created rows in the other system

从另一个系统,我可以得到一个 CSV 文件,其中包含该系统的所有行。因此,通过比较 CSV 文件和我的表,我可以看到: * 其他系统中缺少应该导入的任何行 * 如果有人在其他系统中创建了行

The problem isn't "solving it". It's making the best solution to is since there are so much data in the rows.

问题不是“解决它”。这是最好的解决方案,因为行中有如此多的数据。

Thanks again!

再次感谢!

/Thomas

/托马斯

回答by Talha Ahmed Khan

We can use MYSQL not in option.

我们可以不使用 MYSQL。

SELECT id
FROM table_one
WHERE id NOT IN ( SELECT id FROM table_two )


Edited

已编辑

If you are getting the source from a csv filethen you can simply have to put these values directly like:

如果您从csv 文件获取源代码,那么您只需直接输入这些值,例如:

I am assuming that the CSV are like 1,2,3,...,n

我假设 CSV 就像 1,2,3,...,n

SELECT id
FROM table_one
WHERE id NOT IN ( 1,2,3,...,n );


EDIT 2

编辑 2

Or If you want to select the other way around then you can use mysqlimportto import data in temporary table in MySQL Database and retrieve the result and delete the table.

或者,如果您想选择相反的方式,则可以使用mysqlimport将数据导入 MySQL 数据库中的临时表中并检索结果并删除该表。

Like:

喜欢:

Create table

创建表

CREATE TABLE my_temp_table(
   ids INT,
);

load .csv file

加载 .csv 文件

LOAD DATA LOCAL INFILE 'yourIDs.csv' INTO TABLE my_temp_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(ids);

Selecting records

选择记录

SELECT ids FROM my_temp_table
WHERE ids NOT IN ( SELECT id FROM table_one )

dropping table

丢弃表

DROP TABLE IF EXISTS my_temp_table

回答by Pascal MARTIN

What about using a left join; something like this :

怎么样使用left join; 像这样:

select second_table.id
from second_table
    left join first_table on first_table.id = second_table.id
where first_table.is is null


You could also go with a sub-query ; depending on the situation, it might, or might not, be faster, though :


您也可以使用子查询;根据情况,它可能会或可能不会更快,但是:

select second_table.id
from second_table
where second_table.id not in (
    select first_table.id
    from first_table
)

Or with a not exists:

或者用一个not exists

select second_table.id
from second_table
where not exists (
    select 1
    from first_table
    where first_table.id = second_table.id
)

回答by Edgar Velasquez Lim

The function you are looking for is NOT IN (an alias for <> ALL)

您要查找的函数不在(<> ALL 的别名)

The MYSQL documentation:

MYSQL 文档:

http://dev.mysql.com/doc/refman/5.0/en/all-subqueries.html

http://dev.mysql.com/doc/refman/5.0/en/all-subqueries.html

An Example of its use:

其使用示例:

http://www.roseindia.net/sql/mysql-example/not-in.shtml

http://www.roseindia.net/sql/mysql-example/not-in.shtml

Enjoy!

享受!

回答by Tim

The problem is that T1 could have a million rows or ten million rows, and that number could change, so you don't know how many rows your comparison table, T2, the one that has no gaps, should have, for doing a WHERE NOT EXISTS or a LEFT JOIN testing for NULL.

问题是 T1 可能有 100 万行或 1000 万行,而且这个数字可能会改变,所以你不知道你的比较表 T2 应该有多少行,没有间隙的表,应该有多少行,用于执行 WHERE NOT EXISTS 或 LEFT JOIN 测试是否为 NULL。

But the question is, why do you care if there are missing values? I submit that, when an application is properly architected, it should not matter if there are gaps in an autoincrementing key sequence. Even an application where gaps do matter, such as a check-register, should not be using an autoincrenting primary key as a synonym for the check number.

但问题是,你为什么要关心是否有缺失值?我认为,当应用程序的架构正确时,自动递增的键序列中是否存在间隙应该无关紧要。即使是存在差距的应用程序,例如支票登记,也不应该使用自动递增的主键作为支票号码的同义词。

Care to elaborate on your application requirement?

想详细说明您的申请要求吗?

回答by Tim

OK, I've read your edits/elaboration. Syncrhonizing two databases where the second is not supposed to insert any new rows, but might do so, sounds like a problem waiting to happen.

好的,我已经阅读了您的编辑/阐述。同步两个数据库,其中第二个不应该插入任何新行,但可能会这样做,听起来像是一个等待发生的问题。

Neither approach suggested above (WHERE NOT EXISTS or LEFT JOIN) is air-tight and neither is a way to guarantee logical integrity between the two systems. They will not let you know which system created a row in situations where both tables contain a row with the same id. You're focusing on gaps now, but another problem is duplicate ids.

上面建议的方法(WHERE NOT EXISTS 或 LEFT JOIN)都不是密封的,也不是保证两个系统之间逻辑完整性的方法。在两个表都包含具有相同 ID 的行的情况下,它们不会让您知道哪个系统创建了一行。您现在专注于差距,但另一个问题是重复的 ID。

For example, if both tables have a row with id 13887, you cannot assume that database1 created the row. It could have been inserted into database2, and then database1 could insert a new row using that same id. You would have to compare all column values to ascertain that the rows are the same or not.

例如,如果两个表都有一个 ID 为 13887 的行,则不能假设 database1 创建了该行。它可以插入到 database2,然后 database1 可以使用相同的 id 插入一个新行。您必须比较所有列值以确定行是否相同。

I'd suggest therefore that you also explore GUID as a replacement for autoincrementing integers. You cannot prevent database2 from inserting rows, but at least with GUIDs you won't run into a problem where the second database has inserted a row and assigned it a primary key value that your first database might also use, resulting in two different rows with the same id. CreationDateTime and LastUpdateDateTime columns would also be useful.

因此,我建议您也探索 GUID 作为自动递增整数的替代品。您无法阻止 database2 插入行,但至少使用 GUID,您不会遇到第二个数据库插入一行并为其分配一个您的第一个数据库也可能使用的主键值的问题,从而导致两个不同的行同一个身。CreationDateTime 和 LastUpdateDateTime 列也很有用。

However, a proper solution, if it is available to you, is to maintain just one database and give users remote access to it, for example, via a web interface. That would eliminate the mess and complication of replication/synchronization issues.

但是,一个合适的解决方案(如果您可以使用)是仅维护一个数据库并让用户远程访问它,例如,通过 Web 界面。这将消除复制/同步问题的混乱和复杂性。

If a remote-access web-interface is not feasible, perhaps you could make one of the databases read-only? Or does database2 have to make updates to the rows? Perhaps you could deny insert privilege? What database engine are you using?

如果远程访问 Web 界面不可行,也许您可​​以将其中一个数据库设为只读?或者 database2 是否必须更新行?也许您可以拒绝插入特权?你用的是什么数据库引擎?

回答by noelbk

I have the same problem: I have a list of values from the user, and I want to find the subset that does notexist in anther table. I did it in oracle by building a pseudo-table in the select statement Here's a way to do it in Oracle. Try it in MySQL without the "from dual":

我有同样的问题:我有一个来自用户的值列表,我想找到花药表中存在的子集。我在oracle 中通过在select 语句中构建一个伪表来实现的这是在Oracle 中实现的一种方法。在没有“from dual”的情况下在MySQL中尝试:

-- find ids from user (1,2,3) that *don't* exist in my person table
-- build a pseudo table and join it with my person table
select pseudo.id from (
  select '1' as id from dual
  union select '2' as id from dual
  union select '3' as id from dual
) pseudo
left join person
  on person.person_id = pseudo.id
where person.person_id is null