MySQL MySQL子查询引用父查询中的字段

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/5781349/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 19:38:41  来源:igfitidea点击:

MySQL subquery to refer to field in parent query

mysqlsubquery

提问by radrat

I am building a query that performs some filtering on rating data.

我正在构建一个对评级数据执行一些过滤的查询。

Suppose I have a simple table called ratingslike the following, storing data from an online rating tool:

假设我有一个如下所示的简单表ratings,存储来自在线评级工具的数据:

+----------------+----------------+--------+
| page_title     | timestamp      | rating |
+----------------+----------------+--------+
| Abc            | 20110417092134 | 1      |
| Abc            | 20110418110831 | 2      |
| Def            | 20110417092205 | 3      |
+----------------+----------------+--------+

I need to extract pages with a high frequency of low values in the latest 10 ratings, and limit this query to pages that produced a volume of at least 20 ratings in the preceding week. This is the ridiculously longquery I came up with:

我需要提取最近 10 个评分中低值频率高的页面,并将此查询限制为在前一周产生至少 20 个评分的页面。这是我想出的可笑的长查询:

SELECT a1.page_title, COUNT(*) AS rvol, AVG(a1.rating) AS theavg, 
(
     SELECT COUNT(*) FROM
     (
         SELECT * FROM ratings a2 WHERE a2.page_title = a1.page_title 
         AND DATE(timestamp) <= '2011-04-24' ORDER BY timestamp DESC LIMIT 10
     ) 
     AS latest WHERE rating >=1 AND rating <=2 ORDER BY timestamp DESC
)
AS lowest FROM ratings a1
WHERE DATE(a1.timestamp) <= "2011-04-24" AND DATE(a1.timestamp) >= "2011-04-17" 
GROUP BY a1.page_title HAVING COUNT(*) > 20

the top level query looks for pages with more than 20 ratings in the week terminating on 2011-04-24, the subquery is supposed to retrieve the number of ratings with values between [1,2] from the latest 10 ratings of each article from the top level query.

顶级查询在 2011 年 4 月 24 日终止的一周内查找评分超过 20 的页面,子查询应该从每篇文章的最新 10 个评分中检索值在 [1,2] 之间的评分数量顶级查询。

MySQL complains that a1.page_title in the WHERE clause of the subsubquery is an unknown column, I suspect this is because a1 is not defined as an alias in the second-level query, but only in the top-level query, but I am clueless how to fix this.

MySQL抱怨子查询的WHERE子句中的a1.page_title是一个未知列,我怀疑这是因为a1在二级查询中没有定义为别名,而只是在顶级查询中,但我无能为力如何解决这个问题。

(edited)

(已编辑)

I am adding as an explanation of my suspect above regarding cross-level referencing another query which works absolutely fine, note that here a1 is not defined in the subquery but it is in the immediate parent:

我正在添加作为对我上面关于跨级别引用另一个绝对正常的查询的怀疑的解释,请注意,这里 a1 未在子查询中定义,但它在直接父查询中:

SELECT a1.page_title, COUNT(*) AS rvol, AVG(a1.rating) AS theavg, 
(
    SELECT COUNT(*) FROM ratings a2 WHERE DATE(timestamp) <= '2011-04-24'
    AND DATE(timestamp) >= '2011-04-17' AND rating >=1 
    AND rating <=2 AND a2.page_title = a1.page_title
) AS lowest FROM ratings a1 
WHERE DATE(a1.timestamp) <= '2011-04-17' AND DATE(a1.aa_timestamp) >= '2011-04-11' 
GROUP BY a1.page_title HAVING COUNT(*) > 20

采纳答案by Conrad Frix

I think you might consider joining two in line views it might make things eaiser.

我认为您可能会考虑加入两个行视图,这可能会使事情变得更容易。

SELECT * 
FROM   (SELECT COUNT(*), 
               a2.page_title 
        FROM   ratings a2 
        WHERE  DATE(timestamp) <= '2011-04-24' 
               AND DATE(timestamp) >= '2011-04-17' 
               AND rating >= 1 
               AND rating <= 2 

        GROUP  BY a2.page_title) current 
       JOIN 
        (SELECT a1.page_title, 
                    COUNT(*)       AS rvol, 
                    AVG(a1.rating) AS theavg 
             FROM   ratings a1 
             WHERE  DATE(a1.timestamp) <= '2011-04-17' 
                    AND DATE(a1.a_timestamp) >= '2011-04-11' 
             GROUP  BY a1.page_title 
             HAVING COUNT(*) > 20) morethan20 
         ON current .page_title = morethan20.page_title 

回答by CenterOrbit

If all you have is this one simple table, I have no idea where you are pulling all of these other table names from, such as: a1, a2, ratings. I feel like either your SQL is quite a bit off, or your leaving out information.

如果你只有这个简单的表格,我不知道你从哪里提取所有这些其他表格名称,例如:a1、a2、 ratings。我觉得要么你的 SQL 有点偏离,要么你遗漏了信息。

The reason your having the error you do is because in your sub-sub-query you do not include a1 in your "FROM" statement... as so that table is not included, it cannot be referenced in your WHERE clause in that sub query.

您遇到错误的原因是因为在您的子子查询中,您没有在“FROM”语句中包含 a1 ......因此该表不包括在内,它不能在您的 WHERE 子句中引用询问。

SELECT * 
FROM
    (SELECT *
        FROM a1
        WHERE a1.timestamp <= (NOW()-604800)
            AND a1.timestamp >= (NOW()-1209600)
        GROUP BY a1.page_title
        HAVING COUNT(a1.page_title)>20)
    AS priorWeekCount
WHERE
    rating <= 2
ORDER BY timestamp DESC
LIMIT 10

as I dont have a full table to test this... I THINK this is what your looking for.. but it is untested, and knowing my coding habits, very rarely is what I type 100% perfect first time ;)

因为我没有一个完整的表格来测试这个......我认为这就是你想要的......但它未经测试,并且知道我的编码习惯,很少是我第一次输入100%完美的;)

回答by Andrew Lazarus

Your analysis of the error is correct: lowestis known in the subquery, a1 is not.

您对错误的分析是正确的:lowest在子查询中是已知的,而 a1 不是。

I think the logic is inside-out. The following probably isn't the best but the optimizer may be smart enough to combine the two subqueries in the outermost SELECT. (If it isn't, at the risk of readability you can introduce another level of subquery.)

我认为逻辑是由内而外的。以下可能不是最好的,但优化器可能足够聪明,可以在最外面的 SELECT 中组合两个子查询。(如果不是,冒着可读性的风险,您可以引入另一个级别的子查询。)

SELECT r20plus.page_title,
 AVG((SELECT rating 
      FROM ratings r WHERE r.page_title=r20plus.page_title 
      ORDER BY timestamp DESC LIMIT 10) ) as av,
 SUM((SELECT CASE WHEN rating BETWEEN 1 AND 2 THEN 1 ELSE 0 END 
      FROM ratings r WHERE r.page_title=r20plus.page_title
      ORDER BY timestamp DESC LIMIT 10) ) as n_low,
FROM
(SELECT page_title FROM ratings  
WHERE DATE(a1.timestamp) <= "2011-04-24" AND DATE(a1.timestamp) >= "2011-04-17"
GROUP BY page_title
HAVING COUNT(rating) >= 20) AS r20plus;