Original source: http://stackoverflow.com/questions/13028925/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Remove duplicates when joining tables
Asked by samach
I have a news table as follows.
News:
| id | title                    | description |
| 1  | Breaking news            | bla bla bla |
| 2  | Heavy snowfall in london | bla bla bla |
a Type table as follows:
| id | type_name | type_code |
| 1  | weather   | 0567      |
| 2  | city      | 0653      |
and a NewsType table as follows:
| id | news_id | type_id | created_by |
| 1  | 2       | 1       | "John"     |
| 2  | 2       | 2       | "Alex"     |
As you can see from the NewsType table, a single news item can fall into two or more types.
I need to display news corresponding to types. A user might say "give me all the news about cities and weather". To display this I am doing something like:
select distinct n.* , nt.created_at
from news n, newstype nt, type t where
n.id = nt.news_id and
t.id = nt.type_id
order by nt.created_at
limit 25
The problem is that this query returns the same news item twice (I think it's because of the inner join I am doing). What should I change in the query so that, if a news item is classified under two types and the user has requested both of those types, I get only a single news item instead of two?
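To make the symptom concrete, here is a minimal sketch that reproduces it with Python's built-in sqlite3 module instead of the asker's database. The created_at column and its values are assumptions made for this sketch (the NewsType table shown in the question lists created_by only). It suggests that DISTINCT compares the whole selected row, so adding nt.created_at to the select list brings the second row back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER, type_id INTEGER,
                           created_by TEXT, created_at TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    -- created_at values are invented for this sketch
    INSERT INTO newstype VALUES (1, 2, 1, 'John', '2012-10-01'),
                                (2, 2, 2, 'Alex', '2012-10-02');
""")

# The query from the question: DISTINCT covers n.* AND nt.created_at.
with_created_at = conn.execute("""
    SELECT DISTINCT n.*, nt.created_at
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    ORDER BY nt.created_at
    LIMIT 25
""").fetchall()

# Same join, but only news columns in the select list.
news_only = conn.execute("""
    SELECT DISTINCT n.*
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    LIMIT 25
""").fetchall()

print(len(with_created_at))  # 2: the two rows differ in created_at
print(len(news_only))        # 1: identical news rows collapse
```

In this setup the duplicate comes from the extra NewsType column in the select list rather than from the join alone.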
Accepted answer by Diego
Simple solution:
select * from news where id in (
select news_id
from NewsType
where type_id in (the types you want)
)
Most people would say that you should add a DISTINCT on news_id in the inner query. You can try that, but I'm quite sure it will decrease performance.
Overall, if you think this solution doesn't perform well, you can make the inner query a CTE, which may behave better depending on the database and version (compare the query plans to be sure):
with my_CTE as(
select news_id
from NewsType
where type_id in (the types you want)
)
select *
from news
where id in (select news_id from my_CTE)
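As a rough check of both forms, the sketch below runs the IN subquery and the CTE variant against the sample data using Python's sqlite3 module. It assumes the news table's key column is id (as in the question's table) and that the wanted types are 1 and 2; it makes no claim about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

wanted_types = (1, 2)  # assumed request: weather and city

# IN subquery: a news row is either in the set or not, so it cannot repeat.
in_rows = conn.execute("""
    SELECT * FROM news
    WHERE id IN (SELECT news_id FROM newstype WHERE type_id IN (?, ?))
""", wanted_types).fetchall()

# Same filter with the inner query moved into a CTE.
cte_rows = conn.execute("""
    WITH my_cte AS (
        SELECT news_id FROM newstype WHERE type_id IN (?, ?)
    )
    SELECT * FROM news
    WHERE id IN (SELECT news_id FROM my_cte)
""", wanted_types).fetchall()

print(in_rows)
print(cte_rows)
```

Both queries return the snowfall item once, even though it matches two types, because IN tests membership instead of joining rows.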
Answer by dan1111
A group by is another approach to this:
select n.id, n.title, n.description, max(nt.created_at) as latest_created_at
from news n, newstype nt, type t where
n.id = nt.news_id and
t.id = nt.type_id
group by n.id, n.title, n.description
order by latest_created_at
limit 25
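A small sketch of the group by approach, again using Python's sqlite3 module. The created_at column and its values are assumptions for illustration, and the alias latest_created_at is a hypothetical name. Ordering by the aggregate's alias avoids referring to an ungrouped column, which stricter databases such as PostgreSQL reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER, type_id INTEGER,
                           created_by TEXT, created_at TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    -- created_at values are invented for this sketch
    INSERT INTO newstype VALUES (1, 2, 1, 'John', '2012-10-01'),
                                (2, 2, 2, 'Alex', '2012-10-02');
""")

# One output row per news item; MAX picks the most recent classification date.
grouped = conn.execute("""
    SELECT n.id, n.title, n.description,
           MAX(nt.created_at) AS latest_created_at
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    GROUP BY n.id, n.title, n.description
    ORDER BY latest_created_at
    LIMIT 25
""").fetchall()

print(grouped)
```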
Answer by Teejay
Try
select distinct n.id, n.title, n.description
but, as @Jan Dvorak stated,
select distinct n.*
shouldn't select the same news twice
Answer by Colin 't Hart
You need to decide what to do with the "duplicate" types: Do you want to display just one type for a news item associated with multiple types, or do you want to list them all?
If the latter, you could investigate using the string_agg function; see http://www.postgresql.org/docs/9.2/static/functions-aggregate.html
select n.id, n.title, n.description, string_agg(t.type_name, ',')
from news n, newstype nt, type t where
n.id = nt.news_id and
t.id = nt.type_id
group by n.id, n.title, n.description
limit 25
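To illustrate the aggregation idea without a PostgreSQL server, this sketch uses Python's sqlite3 module and SQLite's group_concat as a stand-in for string_agg. That substitution is an assumption of the sketch, not part of the answer, and the order of the concatenated names is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

# group_concat plays the role of PostgreSQL's string_agg in this sketch:
# one row per news item, with all of its type names joined by commas.
aggregated = conn.execute("""
    SELECT n.id, n.title, n.description,
           group_concat(t.type_name, ',') AS type_names
    FROM news n, newstype nt, type t
    WHERE n.id = nt.news_id AND t.id = nt.type_id
    GROUP BY n.id, n.title, n.description
    LIMIT 25
""").fetchall()

for news_id, title, description, type_names in aggregated:
    print(news_id, title, sorted(type_names.split(",")))
```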
Answer by Tobsey
You want to select all of the stories that have an entry in the NewsType table for a particular type. Therefore you want to select the news items where a relationship to the type exists:
SELECT
News.ID,
News.Title,
News.Description
FROM
News
WHERE
EXISTS
(SELECT
NULL
FROM
NewsType
INNER JOIN Type ON NewsType.Type_ID = Type.ID
WHERE
News.ID = NewsType.News_ID
AND Type.Type_Code = @typeCode)
The last line of the where clause may need to be changed to Type.Type_Name = @typeName if you are using the type name as the parameter.
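The EXISTS query can be tried out with Python's sqlite3 module, as in the sketch below. The named placeholder :type_code stands in for the @typeCode parameter (an adaptation to sqlite3's documented placeholder style), and the helper name news_for_type_code is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT, type_code TEXT);
    CREATE TABLE newstype (id INTEGER PRIMARY KEY, news_id INTEGER,
                           type_id INTEGER, created_by TEXT);
    INSERT INTO news VALUES (1, 'Breaking news', 'bla bla bla'),
                            (2, 'Heavy snowfall in london', 'bla bla bla');
    INSERT INTO type VALUES (1, 'weather', '0567'), (2, 'city', '0653');
    INSERT INTO newstype VALUES (1, 2, 1, 'John'), (2, 2, 2, 'Alex');
""")

def news_for_type_code(connection, type_code):
    """Return each news row that has at least one NewsType link to the given type code."""
    return connection.execute("""
        SELECT news.id, news.title, news.description
        FROM news
        WHERE EXISTS (
            SELECT NULL
            FROM newstype
            INNER JOIN type ON newstype.type_id = type.id
            WHERE news.id = newstype.news_id
              AND type.type_code = :type_code
        )
    """, {"type_code": type_code}).fetchall()

weather_news = news_for_type_code(conn, "0567")  # type code for 'weather'
unknown_news = news_for_type_code(conn, "9999")  # no such type code in the sample data

print(weather_news)
print(unknown_news)
```

Because EXISTS only asks whether a matching link exists, a news item is returned at most once no matter how many NewsType rows point to it.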