Original question: http://stackoverflow.com/questions/3827382/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
PostgreSQL: how to combine multiple rows?
Asked by thinzar00
I have a table like this that stores the results of medical checkups together with the date each report was sent. The date sent is actually based on the clinic_visit date. A client can have one or more reports (the dates may vary).
----------------------------------
| client_id | date_sent | result |
----------------------------------
| 1         | 2001      | A      |
| 1         | 2002      | B      |
| 2         | 2002      | D      |
| 3         | 2001      | A      |
| 3         | 2003      | C      |
| 3         | 2005      | E      |
| 4         | 2002      | D      |
| 4         | 2004      | E      |
| 5         | 2004      | B      |
----------------------------------
I want to extract the following report from the above data.
-------------------------------------------
| client_id | result1 | result2 | result3 |
-------------------------------------------
| 1         | A       | B       |         |
| 2         | D       |         |         |
| 3         | A       | C       | E       |
| 4         | D       | E       |         |
| 5         | B       |         |         |
-------------------------------------------
I'm working with PostgreSQL. The "crosstab" function won't work here because "date_sent" is not consistent across clients.
Can anyone give me a rough idea of how this should be queried?
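For anyone who wants to reproduce the problem, the sample data can be loaded with a script along these lines. This is only a sketch: the table name labresults is taken from the queries in the answers, while the column types are assumptions, since the question does not state the real schema.

```sql
-- Column types are assumed; the question does not give the actual schema.
CREATE TABLE labresults (
    client_id integer NOT NULL,
    date_sent integer NOT NULL,  -- a year in the sample; possibly a date in the real table
    result    text    NOT NULL
);

-- The nine sample rows from the question.
INSERT INTO labresults (client_id, date_sent, result) VALUES
    (1, 2001, 'A'),
    (1, 2002, 'B'),
    (2, 2002, 'D'),
    (3, 2001, 'A'),
    (3, 2003, 'C'),
    (3, 2005, 'E'),
    (4, 2002, 'D'),
    (4, 2004, 'E'),
    (5, 2004, 'B');
```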
Answered by Peter Eisentraut
I suggest the following approach:
SELECT client_id, array_agg(result) AS results
FROM labresults
GROUP BY client_id;
It's not exactly the same output format, but it gives you the same information with a much faster and cleaner query.
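Note that without an explicit ordering, array_agg makes no promise about the order of the array elements. If the results should follow date_sent, as in the report the question asks for, an ORDER BY inside the aggregate call (available from PostgreSQL 9.0) is one way to get that. A minimal sketch, assuming the labresults table and the columns shown in the question:

```sql
-- Collect each client's results into an array, oldest report first.
SELECT client_id,
       array_agg(result ORDER BY date_sent) AS results
FROM labresults
GROUP BY client_id
ORDER BY client_id;
```

With the sample data this should give {A,B} for client 1 and {A,C,E} for client 3.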
If you want the results in separate columns, you can always do this:
SELECT client_id,
       results[1] AS result1,
       results[2] AS result2,
       results[3] AS result3
FROM
(
    SELECT client_id, array_agg(result) AS results
    FROM labresults
    GROUP BY client_id
) AS r
ORDER BY client_id;
although that will obviously introduce a hardcoded number of possible results.
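If the report does not strictly need one column per result, joining the values into a single text column sidesteps the hardcoded limit. A sketch, assuming PostgreSQL 9.0 or later (where string_agg is available) and the labresults columns from the question; the output column name results is just an illustrative choice:

```sql
-- One row per client, all results in a single comma-separated column.
SELECT client_id,
       string_agg(result, ', ' ORDER BY date_sent) AS results
FROM labresults
GROUP BY client_id
ORDER BY client_id;
```

This trades the separate result1, result2, ... columns for a query that works for any number of reports per client.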
Answered by thinzar00
While I was reading about "simulating row_number", I tried to figure out another way to do this.
SELECT client_id,
       MAX( CASE seq WHEN 1 THEN result ELSE '' END ) AS result1,
       MAX( CASE seq WHEN 2 THEN result ELSE '' END ) AS result2,
       MAX( CASE seq WHEN 3 THEN result ELSE '' END ) AS result3,
       MAX( CASE seq WHEN 4 THEN result ELSE '' END ) AS result4,
       MAX( CASE seq WHEN 5 THEN result ELSE '' END ) AS result5
FROM ( SELECT p1.client_id,
              p1.result,
              ( SELECT COUNT(*)
                FROM labresults p2
                WHERE p2.client_id = p1.client_id
                  AND p2.result <= p1.result )
       FROM labresults p1
     ) D ( client_id, result, seq )
GROUP BY client_id;
But the query took roughly 10 minutes (over 500,000 ms) for 30,000 records, which is far too long.
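The slowness most likely comes from the correlated subquery, which counts matching rows again for every row of the table. On PostgreSQL 8.4 or later, the row_number() window function can produce the same kind of sequence number without that per-row subquery. The following is a sketch rather than a tested replacement: it assumes the labresults columns from the question, and it numbers the rows by date_sent (the query above numbers them by result), which matches the report requested in the question.

```sql
-- Number each client's reports by date, then pivot the numbers into columns.
SELECT client_id,
       MAX(CASE seq WHEN 1 THEN result END) AS result1,
       MAX(CASE seq WHEN 2 THEN result END) AS result2,
       MAX(CASE seq WHEN 3 THEN result END) AS result3,
       MAX(CASE seq WHEN 4 THEN result END) AS result4,
       MAX(CASE seq WHEN 5 THEN result END) AS result5
FROM (
    SELECT client_id,
           result,
           row_number() OVER (PARTITION BY client_id
                              ORDER BY date_sent) AS seq
    FROM labresults
) AS numbered
GROUP BY client_id
ORDER BY client_id;
```

Missing results come back as NULL here instead of an empty string; wrapping each MAX(...) in COALESCE(..., '') would restore the original behaviour. Unlike the COUNT(*) approach, row_number() also assigns distinct numbers when a client has two identical result values.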