Original question: http://stackoverflow.com/questions/3320863/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Get SUM in GROUP BY with JOIN using MySQL
Asked by ryanb
I have two tables in MySQL 5.1.38.
products
+----+------------+-------+------------+
| id | name       | price | department |
+----+------------+-------+------------+
|  1 | Fire Truck | 15.00 | Toys       |
|  2 | Bike       | 75.00 | Toys       |
|  3 | T-Shirt    | 18.00 | Clothes    |
|  4 | Skirt      | 18.00 | Clothes    |
|  5 | Pants      | 22.00 | Clothes    |
+----+------------+-------+------------+
ratings
+------------+--------+
| product_id | rating |
+------------+--------+
|          1 |      5 |
|          2 |      5 |
|          2 |      3 |
|          2 |      5 |
|          3 |      5 |
|          4 |      5 |
|          5 |      4 |
+------------+--------+
My goal is to get the total price of all products which have a 5 star rating in each department. Something like this.
+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes    |       36.00 | /* T-Shirt and Skirt */
| Toys       |       90.00 | /* Fire Truck and Bike */
+------------+-------------+
I would like to do this without a subquery if I can. At first I tried a join with a sum().
select department, sum(price) from products
join ratings on product_id=products.id
where rating=5 group by department;
+------------+------------+
| department | sum(price) |
+------------+------------+
| Clothes    |      36.00 |
| Toys       |     165.00 |
+------------+------------+
As you can see, the total for the Toys department is incorrect: the Bike has two 5-star ratings, so the join produces two Bike rows and its price is counted twice.
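To see where the extra 75.00 comes from, it can help to look at the joined rows before they are summed. The sketch below is only an illustration: it loads the sample data into an in-memory SQLite database through Python's sqlite3 module (an assumption made so the example runs anywhere; the question itself uses MySQL 5.1.38) and lists the rows that survive the rating=5 filter.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

# Same join and filter as the first attempt, but without the aggregation.
joined = conn.execute("""
    SELECT products.name, products.price
    FROM products
    JOIN ratings ON product_id = products.id
    WHERE rating = 5
    ORDER BY products.id
""").fetchall()

for name, price in joined:
    print(name, price)
```

The Bike shows up once per 5-star rating, so SUM(price) adds 75.00 twice for Toys.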
I then tried adding distinct to the sum.
select department, sum(distinct price) from products
join ratings on product_id=products.id where rating=5
group by department;
+------------+---------------------+
| department | sum(distinct price) |
+------------+---------------------+
| Clothes    |               18.00 |
| Toys       |               90.00 |
+------------+---------------------+
But then the Clothes total is wrong, because the T-Shirt and the Skirt share the same price and DISTINCT collapses them into one value.
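The two failure modes can be put side by side. This is a small sketch, again assuming an in-memory SQLite database (via Python's sqlite3) as a stand-in for MySQL: SUM(price) over-counts products with several 5-star ratings, while SUM(DISTINCT price) de-duplicates price values rather than products.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

# Both aggregates computed over the same joined rows.
comparison = conn.execute("""
    SELECT department, SUM(price), SUM(DISTINCT price)
    FROM products
    JOIN ratings ON product_id = products.id
    WHERE rating = 5
    GROUP BY department
    ORDER BY department
""").fetchall()

for department, plain, distinct in comparison:
    print(department, plain, distinct)
```

Neither column is right for both departments: the plain sum is wrong for Toys, the distinct sum is wrong for Clothes.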
Currently my work-around involves taking something unique about the product (the id) and using that to make the price unique.
select department, sum(distinct price + id * 100000) - sum(distinct id * 100000) as total_price
from products join ratings on product_id=products.id
where rating=5 group by department;
+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes    |       36.00 |
| Toys       |       90.00 |
+------------+-------------+
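The arithmetic behind the offset trick can be checked with a short sketch. It assumes an in-memory SQLite database (via Python's sqlite3) in place of MySQL, and it assumes the offsets must be subtracted with DISTINCT as well; the helper name totals is made up for this illustration. A variant with a plain second SUM is included for comparison.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

def totals(offset_sum):
    """Run the offset trick; offset_sum is the SQL that removes the id offsets."""
    sql = f"""
        SELECT department,
               SUM(DISTINCT price + id * 100000) - {offset_sum} AS total_price
        FROM products JOIN ratings ON product_id = products.id
        WHERE rating = 5
        GROUP BY department
        ORDER BY department
    """
    return dict(conn.execute(sql).fetchall())

# Offsets removed once per product (DISTINCT) vs. once per joined row.
with_distinct = totals("SUM(DISTINCT id * 100000)")
plain_sum = totals("SUM(id * 100000)")
print(with_distinct)
print(plain_sum)
```

With the plain second SUM, the Bike's offset of 200000 is subtracted once per joined row, so the Toys total goes negative; with DISTINCT on both sums the offsets cancel exactly. The trick also relies on every price staying below 100000.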
But this feels like such a silly hack. Is there a better way to do this without a subquery? Thanks!
Answer by OMG Ponies
Use:
SELECT p.department,
       SUM(p.price) AS total_price
  FROM PRODUCTS p
  JOIN (SELECT DISTINCT
               r.product_id,
               r.rating
          FROM RATINGS r) x ON x.product_id = p.id
                           AND x.rating = 5
GROUP BY p.department
Technically, this does not use a subquery - it uses a derived table/inline view.
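As a quick check of the derived-table approach, the sketch below runs that query against the sample data. It assumes an in-memory SQLite database (via Python's sqlite3) rather than MySQL; the SQL itself is the query from this answer with only an ORDER BY added to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

# The derived table x holds each (product_id, rating) pair once,
# so every product can match at most one 5-star row.
derived_totals = dict(conn.execute("""
    SELECT p.department, SUM(p.price) AS total_price
    FROM products p
    JOIN (SELECT DISTINCT r.product_id, r.rating
          FROM ratings r) x ON x.product_id = p.id
                           AND x.rating = 5
    GROUP BY p.department
    ORDER BY p.department
""").fetchall())

print(derived_totals)
```

Because duplicates are removed before the join, no product is counted more than once, and equal prices on different products are still summed separately.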
Answer by Thomas
The primary reason you are having trouble finding a solution is that the schema as presented is fundamentally flawed. You shouldn't allow a table to have two rows that are complete duplicates of each other. Every table should have a means to uniquely identify each row, even if it is the combination of all columns. Now, if we change the ratings table so that it has an AUTO_INCREMENT column called Id, the problem is easier:
Select products.department, Sum(price) As total_price
From products
    -- Inner join: products with no 5-star rating must drop out here.
    Join ratings As R1
        On R1.product_id = products.id
        And R1.rating = 5
    -- Anti-join: keep only the 5-star rating with the highest Id per product.
    Left Join ratings As R2
        On R2.product_id = R1.product_id
        And R2.rating = R1.rating
        And R2.Id > R1.Id
Where R2.Id Is Null
Group By products.department
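The self-join idea can be tried out with the sketch below. It assumes an in-memory SQLite database (via Python's sqlite3) with an auto-incrementing Id column on ratings, standing in for MySQL's AUTO_INCREMENT; the helper name totals is invented for the illustration. The first join is parameterized so that an inner join and a left join can be compared on the same data.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (Id INTEGER PRIMARY KEY AUTOINCREMENT,
                          product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings (product_id, rating)
        VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

def totals(first_join):
    """Sum prices per department; first_join is 'JOIN' or 'LEFT JOIN'."""
    sql = f"""
        SELECT products.department, SUM(price) AS total_price
        FROM products
        {first_join} ratings AS R1
            ON R1.product_id = products.id AND R1.rating = 5
        LEFT JOIN ratings AS R2
            ON R2.product_id = R1.product_id
           AND R2.rating = R1.rating
           AND R2.Id > R1.Id
        WHERE R2.Id IS NULL
        GROUP BY products.department
        ORDER BY products.department
    """
    return dict(conn.execute(sql).fetchall())

inner_first = totals("JOIN")
left_first = totals("LEFT JOIN")
print(inner_first)
print(left_first)
```

With an inner join for R1, only products that have a 5-star rating reach the anti-join, and each contributes one row. With a left join for R1, Pants (no 5-star rating) also passes the R2.Id IS NULL test, which inflates the Clothes total.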
Answer by Erick Robertson
You can do two queries. First query:
SELECT DISTINCT product_id FROM ratings WHERE rating = 5;
Then, take each of those IDs and manually put them in the second query:
SELECT department, SUM(price) AS total_price FROM products WHERE id IN (1,2,3,4) GROUP BY department;
This is the work-around for not being able to use subqueries. Without them, there is no way to eliminate the duplicate records caused by the join.
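In application code the two steps might look like the sketch below. It is only an illustration under stated assumptions: an in-memory SQLite database (via Python's sqlite3) stands in for MySQL, and the IN list is built from bound placeholders instead of being pasted in by hand.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);
""")

# Query 1: the distinct ids of products that have a 5-star rating.
five_star_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT product_id FROM ratings WHERE rating = 5 "
    "ORDER BY product_id")]

# Query 2: sum prices for exactly those ids, one placeholder per id.
# An empty id list is handled separately because MySQL rejects "IN ()".
two_step_totals = {}
if five_star_ids:
    placeholders = ",".join("?" * len(five_star_ids))
    two_step_totals = dict(conn.execute(
        f"SELECT department, SUM(price) FROM products "
        f"WHERE id IN ({placeholders}) "
        f"GROUP BY department ORDER BY department",
        five_star_ids).fetchall())

print(five_star_ids, two_step_totals)
```

The cost of this approach is a second round trip and an id list that grows with the data, so it tends to suit small result sets.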
Answer by Tom H
I can't think of any way to do it without a subquery somewhere in the query. You could perhaps use a view to mask the use of a subquery.
Barring that, your best bet is probably to find the minimum data set needed to make the calculation and do that in the front end. Whether or not that's possible depends on your specific data - how many rows, etc.
The other option (actually, maybe this is the best one...) would be to get a new ORM or do without it altogether ;)
This view would allow you to bypass the subquery:
CREATE VIEW Distinct_Product_Ratings
AS
SELECT DISTINCT
product_id,
rating
FROM
Ratings
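The answer stops at the view definition. One way the view could then be used is sketched below, assuming an in-memory SQLite database (via Python's sqlite3) in place of MySQL; the final SELECT is an assumption about how the view would be joined, not something stated in the answer.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           price REAL, department TEXT);
    CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES
        (1, 'Fire Truck', 15.00, 'Toys'),
        (2, 'Bike',       75.00, 'Toys'),
        (3, 'T-Shirt',    18.00, 'Clothes'),
        (4, 'Skirt',      18.00, 'Clothes'),
        (5, 'Pants',      22.00, 'Clothes');
    INSERT INTO ratings VALUES (1,5),(2,5),(2,3),(2,5),(3,5),(4,5),(5,4);

    -- The view from the answer: each (product_id, rating) pair once.
    CREATE VIEW Distinct_Product_Ratings AS
        SELECT DISTINCT product_id, rating FROM ratings;
""")

# The outer query contains no subquery text; the DISTINCT lives in the view.
view_totals = dict(conn.execute("""
    SELECT p.department, SUM(p.price) AS total_price
    FROM products p
    JOIN Distinct_Product_Ratings d
        ON d.product_id = p.id AND d.rating = 5
    GROUP BY p.department
    ORDER BY p.department
""").fetchall())

print(view_totals)
```

This keeps the query a plain join plus GROUP BY, which may be easier to express through an ORM, at the price of one extra schema object to maintain.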