Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/13028925/

Remove duplicates when joining tables

Tags: sql, database, postgresql

Asked by samach

I have a news table as follows:

News:

| id  | title                       | description |
| 1   | Breaking news               | bla bla bla |
| 2   | Heavy snowfall in london    | bla bla bla |

A Type table as follows:

| id  | type_name   | type_code |
| 1   | weather     | 0567      |
| 2   | city        | 0653      |

and a NewsType table as follows:

| id | news_id | type_id | created_by |
| 1  | 2       | 1       | "John"     |
| 2  | 2       | 2       | "Alex"     |

As you can see from the NewsType table, a single news item can fall into two or more types.

I need to display news corresponding to types. A user might say, "Give me all the news about cities and weather." To display this I am doing something like:

      select distinct n.* , nt.created_at
      from news n, newstype nt, type t where
      n.id = nt.news_id and
      t.id = nt.type_id 
      order by nt.created_at
      limit 25

The problem is that this query returns the same news item twice (I think it's because of the inner join I am doing). What should I change in the query so that if a news item is classified under two types, and the user has requested both of those types, I get a single news item instead of two?
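
To see where the duplicate comes from, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for PostgreSQL here, and the created_at column and its timestamps are assumptions, since the sample NewsType table only lists created_by.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    -- created_at and its values are assumed; the sample table only shows created_by
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT, created_at TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John', '2012-10-23 10:00:00'),
                                (2, 2, 2, 'Alex', '2012-10-23 11:00:00');
""")

join_sql = ("FROM news n, newstype nt, type t "
            "WHERE n.id = nt.news_id AND t.id = nt.type_id")

# DISTINCT applies to the whole row, so two different created_at values
# keep both copies of the same news item.
with_created_at = conn.execute(
    f"SELECT DISTINCT n.*, nt.created_at {join_sql} "
    "ORDER BY nt.created_at LIMIT 25"
).fetchall()

# Without the NewsType column, the two joined rows are identical and collapse.
news_only = conn.execute(f"SELECT DISTINCT n.* {join_sql} LIMIT 25").fetchall()

print(len(with_created_at), len(news_only))
```

With this data the first query yields two rows for news item 2 and the second yields one, which suggests the extra nt.created_at column in the select list, rather than the join alone, is what defeats DISTINCT.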

Accepted answer by Diego

Simple solution:

select * from news where id in (
    select news_id
    from NewsType
    where type_id in (the types you want)
)

Most people would say that you should add a DISTINCT on the news_id in the inner query. You can try that, but I'm quite sure it will decrease performance.
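
A minimal sketch of the subquery approach, again using an in-memory SQLite database from Python as a stand-in for PostgreSQL (an assumption). The wanted_type_ids list is a hypothetical stand-in for "the types you want". It only checks that the result is the same with or without DISTINCT in the inner query; it says nothing about performance.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

# Hypothetical user selection: type ids 1 (weather) and 2 (city).
wanted_type_ids = [1, 2]
placeholders = ", ".join("?" for _ in wanted_type_ids)

# IN only tests membership, so a news_id that appears twice in the
# subquery still matches its news row once.
rows = conn.execute(
    f"SELECT * FROM news WHERE id IN "
    f"(SELECT news_id FROM newstype WHERE type_id IN ({placeholders}))",
    wanted_type_ids,
).fetchall()

# Same query with DISTINCT in the inner query, for comparison.
rows_distinct = conn.execute(
    f"SELECT * FROM news WHERE id IN "
    f"(SELECT DISTINCT news_id FROM newstype WHERE type_id IN ({placeholders}))",
    wanted_type_ids,
).fetchall()

print(rows)
print(rows == rows_distinct)
```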

Overall, if you think this solution doesn't perform well, you can make the inner query a CTE, which usually behaves better:

with my_CTE as (
    select news_id
    from NewsType
    where type_id in (the types you want)
)
select *
from news
where id in (select news_id from my_CTE)
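
The CTE form can be tried the same way. This is a sketch under the same assumptions as before (in-memory SQLite standing in for PostgreSQL, type ids 1 and 2 as a hypothetical selection); it only confirms that the CTE version returns the same rows as the plain subquery.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

# The inner query moved into a CTE; type ids 1 and 2 are a hypothetical choice.
cte_rows = conn.execute("""
    WITH my_cte AS (
        SELECT news_id FROM newstype WHERE type_id IN (1, 2)
    )
    SELECT * FROM news WHERE id IN (SELECT news_id FROM my_cte)
""").fetchall()

# The plain subquery version, for comparison.
subquery_rows = conn.execute("""
    SELECT * FROM news
    WHERE id IN (SELECT news_id FROM newstype WHERE type_id IN (1, 2))
""").fetchall()

print(cte_rows == subquery_rows, len(cte_rows))
```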

Answer by dan1111

A group by is another approach to this:

select n.id, n.title, n.description, max(nt.created_at)
      from news n, newstype nt, type t where
      n.id = nt.news_id and
      t.id = nt.type_id
      group by n.id, n.title, n.description
      -- order by the aggregate: a bare nt.created_at is not valid once rows are grouped
      order by max(nt.created_at)
      limit 25
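
A small sketch of the group by approach, run against in-memory SQLite from Python as a stand-in for PostgreSQL (an assumption). The created_at column, its timestamps, and the third NewsType row (created by a made-up user "Sam") are illustrative additions that are not in the question's data; they are there only so the ordering has something to sort.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    -- created_at and its values are assumed; the sample table only shows created_by
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT, created_at TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John', '2012-10-23 10:00:00'),
                                (2, 2, 2, 'Alex', '2012-10-23 11:00:00'),
                                -- hypothetical extra row, not in the question
                                (3, 1, 2, 'Sam',  '2012-10-22 09:00:00');
""")

# One output row per news item; max() picks the latest link timestamp,
# and the same aggregate is used for ordering.
grouped = conn.execute("""
    SELECT n.id, n.title, n.description, max(nt.created_at)
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    GROUP BY n.id, n.title, n.description
    ORDER BY max(nt.created_at)
    LIMIT 25
""").fetchall()

for row in grouped:
    print(row)
```

Each news item appears once, carrying the latest created_at among its NewsType rows.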

Answer by Teejay

Try

select distinct n.id, n.title, n.description

but, as @Jan Dvorak stated,

select distinct n.*

shouldn't select the same news twice

Answer by Colin 't Hart

You need to decide what to do with the "duplicate" types: Do you want to display just one type for a news item associated with multiple types, or do you want to list them all?

If the latter, you could investigate using the string_agg function, see http://www.postgresql.org/docs/9.2/static/functions-aggregate.html

select n.id, n.title, n.description, string_agg(t.type_name, ',')
from news n, newstype nt, type t where
n.id = nt.news_id and
t.id = nt.type_id
group by n.id, n.title, n.description
limit 25
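
A sketch of the aggregation idea using in-memory SQLite from Python. Note the swap: SQLite's group_concat is used here as a stand-in for PostgreSQL's string_agg, which is an assumption made only so the example runs without a PostgreSQL server. The order of names inside the aggregated string is not guaranteed, so the example sorts them before printing.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

# group_concat (SQLite) plays the role of string_agg (PostgreSQL):
# all type names of a news item are folded into one comma-separated value.
aggregated = conn.execute("""
    SELECT n.id, n.title, n.description, group_concat(t.type_name, ',')
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    GROUP BY n.id, n.title, n.description
    LIMIT 25
""").fetchall()

for news_id, title, description, type_names in aggregated:
    print(news_id, title, sorted(type_names.split(",")))
```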

Answer by Tobsey

You want to select all of the stories that have an entry in the NewsType table for a particular type. Therefore you want to select the news items where a relationship to the type exists:

SELECT
    News.ID,
    News.Title,
    News.Description
FROM
    News
WHERE
    EXISTS
    (SELECT
        NULL
    FROM
        NewsType
        INNER JOIN Type ON NewsType.Type_ID = Type.ID
    WHERE
        News.ID = NewsType.News_ID
    AND Type.Type_Code = @typeCode)

The last line of the where clause may need to be changed to Type.Type_Name = @typeName if you are using the type name as the parameter.
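
A minimal sketch of the EXISTS query with a bound parameter, using in-memory SQLite from Python as a stand-in for PostgreSQL (an assumption). The helper name news_for_type_code is hypothetical, and the :type_code placeholder is sqlite3's named-parameter style, used here in place of @typeCode.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")


def news_for_type_code(connection, type_code):
    """Return news rows that have at least one NewsType link to the given type code.

    Hypothetical helper for illustration only.
    """
    sql = """
        SELECT n.id, n.title, n.description
        FROM news n
        WHERE EXISTS (
            SELECT NULL
            FROM newstype nt
            INNER JOIN type t ON nt.type_id = t.id
            WHERE n.id = nt.news_id
              AND t.type_code = :type_code
        )
    """
    return connection.execute(sql, {"type_code": type_code}).fetchall()


weather_news = news_for_type_code(conn, "0567")
unknown_news = news_for_type_code(conn, "9999")  # no such type code in the data
print(weather_news)
print(unknown_news)
```

Because EXISTS only asks whether a matching NewsType row is present, each news row is returned at most once no matter how many links it has.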