SQL: How to avoid a Cartesian product in an INNER JOIN query?

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2872278/

Date: 2020-09-01 06:17:34  Source: igfitidea

How to avoid Cartesian product in an INNER JOIN query?

sql inner-join cartesian-product

Asked by flhe

I have 6 tables; let's call them a, b, c, d, e, f. Now I want to search all the columns (except the ID columns) of all tables for a certain word, let's say 'Joe'. What I did was make INNER JOINs over all the tables and then use a pattern match to search the columns.

INNER JOIN
...
ON
INNER JOIN
...
ON.......etc.
WHERE a.firstname ~* 'Joe'
OR a.lastname ~* 'Joe'
OR b.favorite_food ~* 'Joe'
OR c.job ~* 'Joe'.......etc.

The results are correct: I get all the columns I was looking for. But I also get some kind of Cartesian product: 2 or more rows with almost the same results.

How can I avoid this? I want to have each row only once, since the results should appear in a web search.

UPDATE

I first tried to figure out whether the SELECT DISTINCT thing would work by using this statement: pastie.org/970959 But it still gives me a Cartesian product. What's wrong with this?

Answered by chris

Try SELECT DISTINCT?
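
A minimal sketch of why SELECT DISTINCT may or may not help, assuming a hypothetical schema where table b has several rows per row of a (the table contents, the a_id column, and the use of SQLite's LIKE in place of the ~* operator are all assumptions for illustration). DISTINCT compares whole result rows, so it only collapses duplicates when the selected columns are identical:

```python
import sqlite3

# Hypothetical schema: one person in a, two related rows in b (one-to-many).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id),
                    favorite_food TEXT);
    INSERT INTO a VALUES (1, 'Joe', 'Doe');
    INSERT INTO b VALUES (1, 1, 'pizza'), (2, 1, 'pasta');
""")

# Selecting a column from b: the two rows differ in favorite_food,
# so DISTINCT keeps both of them.
wide_rows = conn.execute("""
    SELECT DISTINCT a.id, a.firstname, b.favorite_food
    FROM a INNER JOIN b ON b.a_id = a.id
    WHERE a.firstname LIKE '%Joe%' OR b.favorite_food LIKE '%Joe%'
""").fetchall()

# Selecting only columns from a: the rows are identical,
# so DISTINCT collapses them into one.
narrow_rows = conn.execute("""
    SELECT DISTINCT a.id, a.firstname
    FROM a INNER JOIN b ON b.a_id = a.id
    WHERE a.firstname LIKE '%Joe%' OR b.favorite_food LIKE '%Joe%'
""").fetchall()

print(len(wide_rows), len(narrow_rows))
```

If the real query selects columns from the "many" side of a join, this would explain rows that look almost the same but survive DISTINCT.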

Answered by hgulyan

On what condition do you JOIN these tables? Do you have foreign keys or something?

Maybe you should find that word on each table separately?

Answered by molnarm

What kind of server are you using? Microsoft SQL Server has a full-text index feature (I think others have something like this too) which lets you search for keywords in a much less resource-intensive way.

Also consider using UNION instead of joining the tables.
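
A rough sketch of the UNION idea, assuming two hypothetical tables a and c with the columns named in the question (the sample data, the source label column, and SQLite's LIKE standing in for ~* are assumptions). Each table is searched on its own and the hits are stacked, so no join can multiply rows:

```python
import sqlite3

# Hypothetical tables and data, loosely following the column names in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE c (id INTEGER PRIMARY KEY, job TEXT);
    INSERT INTO a VALUES (1, 'Joe', 'Doe'), (2, 'Ann', 'Lee');
    INSERT INTO c VALUES (1, 'Barista at Joe''s'), (2, 'Plumber');
""")

# One SELECT per table, combined with UNION ALL. The 'source' column
# records which table each hit came from.
hits = conn.execute("""
    SELECT 'a' AS source, id, firstname || ' ' || lastname AS matched
    FROM a
    WHERE firstname LIKE :p OR lastname LIKE :p
    UNION ALL
    SELECT 'c' AS source, id, job AS matched
    FROM c
    WHERE job LIKE :p
    ORDER BY source, id
""", {"p": "%Joe%"}).fetchall()

for row in hits:
    print(row)
```

UNION ALL is used here because the source label already keeps rows from different tables apart; plain UNION would additionally remove identical rows.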

Answered by lc.

Without seeing your tables, I can only really assume what's going on here is you have a one-to-many relationship somewhere. You probably want to do everything in a subquery, select out the distinct IDs, then get the data you want to display by ID. Something like:

SELECT a.*, b.*
FROM (SELECT DISTINCT a.ID
      FROM ...
      INNER JOIN ...
      INNER JOIN ...
      WHERE ...) x
INNER JOIN a ON x.ID = a.ID
INNER JOIN b ON x.ID = b.ID
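
To make that pattern concrete, here is a filled-in sketch under assumed tables (a one-to-many link from b to a through a hypothetical a_id column, invented sample rows, and SQLite's LIKE instead of ~*). The inner query finds the matching IDs once; the outer query fetches the display columns by ID:

```python
import sqlite3

# Hypothetical data: person 1 matches through several rows of b, person 2 not at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id),
                    favorite_food TEXT);
    INSERT INTO a VALUES (1, 'Joe', 'Doe'), (2, 'Ann', 'Lee');
    INSERT INTO b VALUES (1, 1, 'Sloppy Joe'), (2, 1, 'Joe''s pizza'), (3, 2, 'pasta');
""")

# Inner query: join and filter, but keep only the distinct IDs of a.
# Outer query: look up the columns to display, one row per ID.
rows = conn.execute("""
    SELECT a.id, a.firstname, a.lastname
    FROM (SELECT DISTINCT a.id AS id
          FROM a
          INNER JOIN b ON b.a_id = a.id
          WHERE a.firstname LIKE :p
             OR a.lastname LIKE :p
             OR b.favorite_food LIKE :p) x
    INNER JOIN a ON x.id = a.id
    ORDER BY a.id
""", {"p": "%Joe%"}).fetchall()

print(rows)
```

Note that the outer query in this sketch joins back only to a. Joining a one-to-many table again in the outer query would bring the repeated rows back.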

A couple of things to note, however:

  • This is going to be sloooow, and you probably want to use full-text search instead (if your RDBMS supports it).

  • It may be faster to search each table separately rather than to join everything in a Cartesian product first and then filter with ORs.

Answered by pyrocumulus

If your tables are entity-type tables, for example a being persons and b being companies, I don't think you can avoid a Cartesian product if you search for the results this way (in a single query).

You say you want to search all the tables for a certain word, but you probably want to separate the results into the corresponding types, right? Otherwise a web search would not make much sense. So if you search for 'Joe', you want to see persons whose name contains 'Joe' and, for example, the company named 'Joe's gym'. Since you are searching for different entities, you should split the search into different queries.

If you really want to do this in one query, you will have to change your database structure to accommodate. You will need some form of 'search table' containing an entity ID (PK) and entity type, and a list of keywords you want that entity to be found with. For example:

EntityType, EntityID, Keywords
------------------------------
Person,     4,        'Joe', 'Doe'
Company,    12,       'Joe''s Gym', 'Gym'

Something like that?
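
One possible way to lay out such a search table, sketched with assumed names (search_index, entity_type, entity_id, keyword are all hypothetical). Unlike the listing above, this sketch stores one keyword per row rather than a list in a single column, which is a design choice of the example, and it uses SQLite's LIKE for the match:

```python
import sqlite3

# Hypothetical search table: one row per (entity, keyword) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE search_index (
        entity_type TEXT    NOT NULL,
        entity_id   INTEGER NOT NULL,
        keyword     TEXT    NOT NULL,
        PRIMARY KEY (entity_type, entity_id, keyword)
    );
    INSERT INTO search_index VALUES
        ('Person',  4,  'Joe'),
        ('Person',  4,  'Doe'),
        ('Company', 12, 'Joe''s Gym'),
        ('Company', 12, 'Gym');
""")

# A single query over one table: each entity appears at most once,
# however many of its keywords match.
hits = conn.execute("""
    SELECT DISTINCT entity_type, entity_id
    FROM search_index
    WHERE keyword LIKE :p
    ORDER BY entity_type, entity_id
""", {"p": "%Joe%"}).fetchall()

print(hits)
```

The application would then load each hit from the table that matches its entity type.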

However, it's different when your search returns only one type of entity, say a Person, and you want to return the Persons for which you get a hit on that keyword (in any table related to that Person). Then you will need to select all the fields you want to show and group by them, leaving out the fields in which you are searching. Including them inevitably leads to a Cartesian product.
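
A small sketch of that grouping approach, assuming a is the Person table and b is a related one-to-many table (the a_id column, the sample data, and LIKE in place of ~* are assumptions). Only the display columns of a are selected and grouped; the searched column from b appears in the WHERE clause but not in the output:

```python
import sqlite3

# Hypothetical data: person 1 matches by name (twice, once per row of b),
# person 2 matches only through one favorite_food row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id),
                    favorite_food TEXT);
    INSERT INTO a VALUES (1, 'Joe', 'Doe'), (2, 'Ann', 'Lee');
    INSERT INTO b VALUES (1, 1, 'pizza'), (2, 1, 'pasta'),
                         (3, 2, 'Sloppy Joe'), (4, 2, 'salad');
""")

# Group by the displayed columns only; b.favorite_food is searched
# but left out of the SELECT list, so each person comes back once.
rows = conn.execute("""
    SELECT a.id, a.firstname, a.lastname
    FROM a
    INNER JOIN b ON b.a_id = a.id
    WHERE a.firstname LIKE :p
       OR a.lastname LIKE :p
       OR b.favorite_food LIKE :p
    GROUP BY a.id, a.firstname, a.lastname
    ORDER BY a.id
""", {"p": "%Joe%"}).fetchall()

print(rows)
```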

I'm just brainstorming here, by the way. I hope it's helpful.
