SQL Help: Counting Rows in a Single Query With a Nested SELECT

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/421949/

Time: 2020-09-01 00:40:10  Source: igfitidea

sql

Asked by Greg

I'm looking for a better way to do the following query. I have a table that looks like this:

game_id | home_team_id | away_team_id
1       | 100          | 200
2       | 200          | 300
3       | 200          | 400
4       | 300          | 100
5       | 100          | 400

And I want to write a query that counts the number of home games and away games for each team and outputs the following:

team_id | home_games | away_games
100     | 2          | 1
200     | 2          | 1
300     | 1          | 1
400     | 0          | 2

Right now I have this monstrosity, which works but is slow (I know it pulls all 2,800 rows from the table twice).

SELECT 
  home_team_id as team_id,
  (SELECT count(*) FROM `game` WHERE home_team_id = temp_game.home_team_id) as home_games,
  (SELECT count(*) FROM `game` WHERE away_team_id = temp_game.home_team_id) as away_games
  FROM (SELECT * FROM `game`) as temp_game
  GROUP BY home_team_id

Can a SQL guru help me knock out a better way? I think my problem is that I don't understand how to get a distinct list of the team IDs to throw at the count queries. I bet there's a better way with a better placed, nested SELECT. Thanks in advance!

Answered by Frank Flynn

It's cleaner if you have a separate table, team, with columns team_id and team_name.

SELECT team_id, team_name, 
     sum(team_id = home_team_id) as home_games, 
     sum(team_id = away_team_id) as away_games
 FROM game, team
 GROUP BY team_id

What's going on: with no WHERE clause, the query forms a Cartesian product of the two tables; grouping by team_id gets us back to one row per team. Each team_id is now paired with every row of the game table, so we need to count only the matching ones, and the SQL count function isn't quite right for that (it would count all the rows, or all the distinct values). Instead, team_id = home_team_id evaluates to 1 or 0 (as it does in MySQL), and sum adds up the 1s.
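
To see the Cartesian-product-plus-sum idea run end to end, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the asker's database, since SQLite also evaluates a comparison to 1 or 0. The team names and the extra team 500, which has no games, are made up for illustration.

```python
import sqlite3

# Assumptions: an in-memory SQLite database stands in for the real one, the
# team names are invented, and team 500 (no games at all) is added to show
# the zero-games case.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (game_id INTEGER, home_team_id INTEGER, away_team_id INTEGER);
    CREATE TABLE team (team_id INTEGER, team_name TEXT);
    INSERT INTO game VALUES
        (1, 100, 200), (2, 200, 300), (3, 200, 400), (4, 300, 100), (5, 100, 400);
    INSERT INTO team VALUES
        (100, 'Team 100'), (200, 'Team 200'), (300, 'Team 300'),
        (400, 'Team 400'), (500, 'Team 500');
""")

# Every team is paired with every game: 5 teams x 5 games.
pair_count = conn.execute("SELECT count(*) FROM game CROSS JOIN team").fetchone()[0]

# Each comparison evaluates to 1 or 0, so SUM counts the matching pairs.
rows = conn.execute("""
    SELECT team_id,
           team_name,
           SUM(team_id = home_team_id) AS home_games,
           SUM(team_id = away_team_id) AS away_games
    FROM game CROSS JOIN team
    GROUP BY team_id, team_name
    ORDER BY team_id
""").fetchall()

print(pair_count)
for row in rows:
    print(row)
```

The intermediate result grows as teams times games, so this form is easy to read but may not suit large tables. Databases with a strict boolean type would likely need a CASE expression inside the sum instead of the bare comparison.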

The team_name is just because it's geeky to say that 'team 200 had 20 home games' when we ought to say that 'Mud City Stranglers had 20 home games'.

P.S. This works even for a team with no games. That is often a problem in SQL: a team with 0 games drops out of the result because the join finds no matching row.

Answered by casperOne

If you want the distinct list of teams, you have to select from the game table twice, unioning the home and the away teams. (Theoretically, one team could play all its games on the road or at home; if you have logic that prevents that, you could adjust this query.)

select home_team_id as team_id from game union
select away_team_id as team_id from game

The union operator makes sure you only get distinct elements in the result set (unless you use union all).
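
The difference between union and union all is easy to check on the question's five sample games. This is a small sketch, again assuming SQLite via Python's sqlite3 module; the helper name team_ids is made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (game_id INTEGER, home_team_id INTEGER, away_team_id INTEGER);
    INSERT INTO game VALUES
        (1, 100, 200), (2, 200, 300), (3, 200, 400), (4, 300, 100), (5, 100, 400);
""")

def team_ids(set_operator):
    """Return the team ids produced by UNION or UNION ALL (hypothetical helper)."""
    query = f"""
        SELECT home_team_id AS team_id FROM game
        {set_operator}
        SELECT away_team_id AS team_id FROM game
        ORDER BY team_id
    """
    return [row[0] for row in conn.execute(query)]

distinct_ids = team_ids("UNION")      # duplicates removed
all_ids = team_ids("UNION ALL")       # one entry per home or away appearance

print(distinct_ids)
print(all_ids)
```

With union each team appears once, which is what the outer query needs; union all returns ten ids here, one per appearance.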

From there, you can use left outer joins to aggregate your data:

select
    u.team_id, count(distinct h.game_id) as home_games, count(distinct a.game_id) as away_games
from
    (
        select home_team_id as team_id from game union
        select away_team_id as team_id from game
    ) as u
        left outer join game as h on h.home_team_id = u.team_id
        left outer join game as a on a.away_team_id = u.team_id
group by
    u.team_id
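
One thing to watch with a double join like this: each of a team's home rows is paired with each of its away rows, so a plain count(...) counts pairs rather than games. The sketch below, which assumes SQLite via Python's sqlite3 module and a made-up helper named counts, compares count(x) with count(DISTINCT x) on the question's data.

```python
import sqlite3

# Assumption: an in-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (game_id INTEGER, home_team_id INTEGER, away_team_id INTEGER);
    INSERT INTO game VALUES
        (1, 100, 200), (2, 200, 300), (3, 200, 400), (4, 300, 100), (5, 100, 400);
""")

def counts(distinct):
    """Run the double left join with or without DISTINCT (hypothetical helper)."""
    keyword = "DISTINCT " if distinct else ""
    query = f"""
        SELECT u.team_id,
               count({keyword}h.game_id) AS home_games,
               count({keyword}a.game_id) AS away_games
        FROM (
            SELECT home_team_id AS team_id FROM game
            UNION
            SELECT away_team_id AS team_id FROM game
        ) AS u
        LEFT OUTER JOIN game AS h ON h.home_team_id = u.team_id
        LEFT OUTER JOIN game AS a ON a.away_team_id = u.team_id
        GROUP BY u.team_id
        ORDER BY u.team_id
    """
    return conn.execute(query).fetchall()

plain_counts = counts(distinct=False)     # counts home/away pairs
distinct_counts = counts(distinct=True)   # counts games

print(plain_counts)
print(distinct_counts)
```

Team 100 has two home games and one away game, so the join yields two rows for it and the plain count reports 2 for both columns; counting distinct game_id values gives the expected 2 and 1.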

If you want to reduce your table scans even further (the above will produce four), you can, at the cost of a more complicated query. First, build a list of rows, each holding a team_id and a flag for whether that team played the game at home or away:

select
    case ha.home when 0 then g.away_team_id else g.home_team_id end as team_id,
    case ha.home when 0 then 0 else 1 end as home_games,
    case ha.home when 0 then 1 else 0 end as away_games
from
    game as g, (select 0 as home union select 1 as home) as ha

From there, you can simply sum up the games at home and away for each team:

select
    t.team_id, sum(t.home_games) as home_games, sum(t.away_games) as away_games
from
    (
        select
            case ha.home when 0 then g.away_team_id else g.home_team_id end as team_id,
            case ha.home when 0 then 0 else 1 end as home_games,
            case ha.home when 0 then 1 else 0 end as away_games
        from
            game as g, (select 0 as home union select 1 as home) as ha
    ) as t
group by
    t.team_id

This will result in a single table scan.
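
To make the two steps concrete, the following sketch runs the inner query and the aggregated query on the question's sample data. It assumes SQLite via Python's sqlite3 module; the ORDER BY clauses are added only to make the output predictable, and the sketch does not attempt to verify how many scans a given database actually performs.

```python
import sqlite3

# Assumption: an in-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (game_id INTEGER, home_team_id INTEGER, away_team_id INTEGER);
    INSERT INTO game VALUES
        (1, 100, 200), (2, 200, 300), (3, 200, 400), (4, 300, 100), (5, 100, 400);
""")

# Pair every game with a two-row helper table (0 = away side, 1 = home side),
# so that each game yields one row per participating team.
inner = """
    SELECT
        CASE ha.home WHEN 0 THEN g.away_team_id ELSE g.home_team_id END AS team_id,
        CASE ha.home WHEN 0 THEN 0 ELSE 1 END AS home_games,
        CASE ha.home WHEN 0 THEN 1 ELSE 0 END AS away_games
    FROM game AS g, (SELECT 0 AS home UNION SELECT 1 AS home) AS ha
"""

unpivoted = conn.execute(inner + " ORDER BY g.game_id, ha.home").fetchall()

totals = conn.execute(f"""
    SELECT t.team_id, sum(t.home_games) AS home_games, sum(t.away_games) AS away_games
    FROM ({inner}) AS t
    GROUP BY t.team_id
    ORDER BY t.team_id
""").fetchall()

for row in unpivoted:
    print(row)
print(totals)
```

Game 1 (home 100, away 200) becomes the two rows (200, 0, 1) and (100, 1, 0), and summing the flags per team gives the table the question asks for.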

Answered by Stew S

Greg,

I think your ultimate solution will be specific to your SQL dialect. But if you were doing this in Oracle, you could query the table only once with the following:

SELECT game.home_team_id AS team_id,
       SUM(CASE WHEN game.home_team_id = game.away_team_id
                THEN 1
                ELSE 0 END) AS home_games,
       SUM(CASE WHEN game.home_team_id <> game.away_team_id
                THEN 1
                ELSE 0 END) AS away_games
  FROM game
GROUP BY game.home_team_id
ORDER BY game.home_team_id;

You don't say what flavor of SQL you're using so this is the best I can do.

Best of luck,

Stew

p.s. It looks like I've given the same solution as MarlonRibunal. I just didn't have a handy link so had to create the code by hand. :-/

Answered by Charles Bretana

Try this:

Select Z.teamId, 
    Count(Distinct H.Game_Id) HomeGames, 
    Count(Distinct A.Game_Id) AwayGames
From (Select Distinct home_team_id TeamId From Game
        Union 
      Select Distinct away_team_id TeamId From Game) Z
   Left Join Game H On H.home_team_id = Z.TeamId
   Left Join Game A On A.away_team_id = Z.TeamId
Group By Z.TeamId

Answered by Austin Salonen

declare @ts table
(
    team_id int
)

declare @t table
(
    id int,
    h int,
    a int
)

insert into @ts values (100)
insert into @ts values (200)
insert into @ts values (300)
insert into @ts values (400)

insert into @t values (1, 100, 200)
insert into @t values (2, 200, 300)
insert into @t values (3, 200, 400)
insert into @t values (4, 300, 100)
insert into @t values (5, 100, 400)

select s.team_id, coalesce(t0.home, 0) as home_games, coalesce(t1.away, 0) as away_games
from @ts s
    left outer join (select team_id, count(h) as [home] from @ts inner join @t on h = team_id group by team_id) t0 on t0.team_id = s.team_id
    left outer join (select team_id, count(a) as away from @ts inner join @t on a = team_id group by team_id) t1 on t1.team_id = s.team_id

Answered by Nathan Feger

Here is another example.

I would point out, though, that you should start your FROM clause with the teams table, so that you are sure to include every team, even one that hasn't played a game yet.

This query runs your two counts as joined subqueries instead of correlated subselects in the SELECT list, which should perform better.

-- note: coalesce is like ifnull in case you are using mysql.

SELECT
    teams.team_id,
    coalesce(home_game_counts.games, 0) AS home_games,
    coalesce(away_game_counts.games, 0) AS away_games
FROM teams
LEFT JOIN (SELECT home_team_id, count(*) AS games FROM games GROUP BY home_team_id) AS home_game_counts
    ON home_game_counts.home_team_id = teams.team_id
LEFT JOIN (SELECT away_team_id, count(*) AS games FROM games GROUP BY away_team_id) AS away_game_counts
    ON away_game_counts.away_team_id = teams.team_id
GROUP BY teams.team_id, home_game_counts.games, away_game_counts.games
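
The effect of starting from the teams table, and of coalesce, shows up once a team has no games on one side or on both. Below is a minimal sketch, assuming SQLite via Python's sqlite3 module; the teams table contents and team 500 (no games) are invented for the example, and the raw_home and raw_away columns are included only to show what the left joins return before coalesce is applied.

```python
import sqlite3

# Assumptions: an in-memory SQLite database, a hypothetical teams table, and an
# extra team 500 that has not played any game.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (team_id INTEGER);
    CREATE TABLE games (game_id INTEGER, home_team_id INTEGER, away_team_id INTEGER);
    INSERT INTO teams VALUES (100), (200), (300), (400), (500);
    INSERT INTO games VALUES
        (1, 100, 200), (2, 200, 300), (3, 200, 400), (4, 300, 100), (5, 100, 400);
""")

# Count home and away games separately, then attach each count to its team.
rows = conn.execute("""
    SELECT teams.team_id,
           h.games              AS raw_home,
           coalesce(h.games, 0) AS home_games,
           a.games              AS raw_away,
           coalesce(a.games, 0) AS away_games
    FROM teams
    LEFT JOIN (SELECT home_team_id, count(*) AS games
               FROM games GROUP BY home_team_id) AS h
        ON h.home_team_id = teams.team_id
    LEFT JOIN (SELECT away_team_id, count(*) AS games
               FROM games GROUP BY away_team_id) AS a
        ON a.away_team_id = teams.team_id
    ORDER BY teams.team_id
""").fetchall()

for row in rows:
    print(row)
```

Because each subquery is grouped before the join, it contributes at most one row per team, so the joins cannot multiply rows. Team 400 has no home games and team 500 has no games at all; their raw counts come back as NULL (None in Python) and coalesce turns them into 0.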

Answered by JosephStyons

This solution is rather ugly, but it should work quickly across large datasets:

select
  teams.team_id
 ,case when home.home_game_count is null
       then 0
       else home.home_game_count
  end home_game_count  
 ,case when away.away_game_count is null
       then 0
       else away.away_game_count
  end as away_game_count
from
  ( 
  select home_team_id as team_id from games
  union
  select away_team_id as team_id from games  
  ) teams
  left outer join
  (  
  select home_team_id as team_id, count(*) as home_game_count
  from games
  group by home_team_id
  ) home
  on teams.team_id = home.team_id
  left outer join
  (
  select away_team_id as team_id, count(*) as away_game_count
  from games
  group by away_team_id
  ) away  
  on teams.team_id = away.team_id  

Answered by Stew S

Sorry, my mistake in the away_games clause. I changed the comparison operator (to <>) instead of changing the resulting value. I had to create additional data to see the problem.

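SELECT teams.team_id,
       teams.team_name,
       -- a joined row counts as a home game when this team is the home side
       SUM(CASE
               WHEN game.home_team_id = teams.team_id THEN
                1
               ELSE
                0
           END) AS home_games,
       -- and as an away game when this team is the away side
       SUM(CASE
               WHEN game.away_team_id = teams.team_id THEN
                1
               ELSE
                0
           END) AS away_games
  FROM teams
  -- match every game the team took part in, on either side
  LEFT OUTER JOIN game ON teams.team_id IN (game.home_team_id, game.away_team_id)
 GROUP BY teams.team_id, teams.team_name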