PostgreSQL: create a new column with values conditioned on other columns

Disclaimer: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/12184409/

Time: 2020-10-21 00:14:56  Source: igfitidea

PostgreSQL create a new column with values conditioned on other columns

postgresql insert-update conditional-statements

Asked by Zhubarb

I use PostgreSQL 9.1.2 and have a basic table, shown below, that records the survival status of each entry both as a boolean (Survival) and as a number of days (Survival(Days)).

I have manually added a new column named 1-yr Survival, and now I want to fill in its value for each entry in the table, conditioned on that entry's Survival and Survival(Days) column values. Once completed, the database table would look something like this:

Survival    Survival(Days)    1-yr Survival
----------  --------------    -------------
Dead            200                NO
Alive            -                 YES
Dead            1200               YES

The pseudo code to fill in the conditioned values of 1-yr Survival would be something like:

ALTER TABLE mytable ADD COLUMN "1-yr Survival" text
for each row
if ("Survival" = Dead & "Survival(Days)" < 365) then Update "1-yr Survival" = NO
else Update "1-yr Survival" = YES
end 

I believe this is a basic operation, but I failed to find the PostgreSQL syntax to execute it. Some search results suggest "adding a trigger", but I am not sure that is what I need. I think my situation is a lot simpler. Any help or advice would be greatly appreciated.

Accepted answer by Erwin Brandstetter

The one-time operation can be achieved with a plain UPDATE:


UPDATE tbl
SET    one_year_survival = (survival OR survival_days >= 365);
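To see how the expression behaves on the rows from the question, here is a minimal sketch. It assumes a table named tbl with a boolean survival column (true meaning alive) and a nullable integer survival_days column; these types are assumptions for illustration, since the question does not state them.

```sql
-- Hypothetical sample table mirroring the question's three rows.
CREATE TABLE tbl (
    survival          boolean,  -- assumed: true = alive, false = dead
    survival_days     integer,  -- assumed: NULL while the subject is alive
    one_year_survival boolean   -- the redundant column to fill in
);

INSERT INTO tbl (survival, survival_days) VALUES
    (false,  200),
    (true,  NULL),
    (false, 1200);

-- One pass over the whole table, no loop or trigger required.
UPDATE tbl
SET    one_year_survival = (survival OR survival_days >= 365);

SELECT * FROM tbl;
```

Because SQL uses three-valued logic, true OR NULL evaluates to true, so a living subject with no day count is still marked as a survivor. A dead subject with a NULL day count would get NULL rather than false.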

I would advise against using camel case, white space and parentheses in your names. While allowed between double quotes, they often lead to complications and confusion. Consider the chapter about identifiers and key words in the manual.
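A possible way to follow that advice is sketched below. It assumes the table is called mytable and that the columns are spelled exactly as in the question's pseudo code; the new snake_case names are only suggestions.

```sql
-- Quoted identifiers must match the stored name exactly, including case,
-- spaces and punctuation. Unquoted identifiers are folded to lower case.
ALTER TABLE mytable RENAME COLUMN "Survival"       TO survival;
ALTER TABLE mytable RENAME COLUMN "Survival(Days)" TO survival_days;
ALTER TABLE mytable RENAME COLUMN "1-yr Survival"  TO one_year_survival;

-- After the rename, no double quotes are needed any more.
SELECT survival, survival_days, one_year_survival FROM mytable;
```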

Are you aware that you can export the results of a query as CSV with COPY?
Example:

COPY (SELECT *, (survival OR survival_days >= 365) AS one_year_survival FROM tbl)
TO '/path/to/file.csv' (FORMAT csv);

You wouldn't need the redundant column this way to begin with.




Additional answer to comment


To avoid empty updates:


UPDATE tbl
SET    "Dead after 1-yr" = (dead AND my_survival_col < 365)
      ,"Dead after 2-yrs" = (dead AND my_survival_col < 730)
....
WHERE  "Dead after 1-yr" IS DISTINCT FROM (dead AND my_survival_col < 365)
   OR  "Dead after 2-yrs" IS DISTINCT FROM (dead AND my_survival_col < 730)
...
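The WHERE clause above relies on IS DISTINCT FROM instead of <> because a freshly added column holds NULL in every row, and an ordinary comparison with NULL yields NULL, which would filter the row out. A small sketch of the difference:

```sql
-- An ordinary inequality with NULL yields NULL, which WHERE treats as "not true".
-- IS DISTINCT FROM treats NULL as a comparable value and always yields true or false.
SELECT NULL::boolean <> true               AS plain_comparison,  -- NULL
       NULL::boolean IS DISTINCT FROM true AS null_safe;         -- true
```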

Personally, I would only add such redundant columns if I had a compelling reason. Normally I wouldn't. If it's about performance: are you aware of indexes on expressions and partial indexes?
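For readers unfamiliar with the two index types mentioned, here is a hedged sketch. It assumes the tbl table with survival and survival_days columns used earlier; the index names are made up, and whether either index actually helps depends on the queries and the data.

```sql
-- Index on an expression: usable by queries that filter on the same expression.
-- Note the extra pair of parentheses required around the expression.
CREATE INDEX tbl_one_year_survival_idx
    ON tbl ((survival OR survival_days >= 365));

-- Partial index: only rows satisfying the WHERE clause are indexed,
-- here the subjects who died within the first year.
CREATE INDEX tbl_died_within_year_idx
    ON tbl (survival_days)
    WHERE NOT survival AND survival_days < 365;
```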


Answer by Chris Travers

Honestly, I think you are better off not storing data in the database that can be calculated quickly and easily from data already stored. A better option would be to simulate a calculated field (with some gotchas, noted below). In this case you would do the following (changing spaces etc. to underscores for easier maintenance):

CREATE FUNCTION one_year_survival(mytable)
RETURNS BOOL
IMMUTABLE
LANGUAGE SQL AS $$
-- $1 is the mytable row passed to the function
SELECT $1.survival OR $1.survival_days >= 365;
$$;

then you can actually:


SELECT *, m.one_year_survival from mytable m;

and it will "just work." Note the following gotchas:


  • mytable.one_year_survival will not be returned by the default column list, and
  • you cannot omit the table identifier (m in the above example) because the parser converts this into one_year_survival(m).
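The two gotchas can be illustrated as follows, assuming the function has been created under the name one_year_survival (the name the query above uses) and takes a mytable row as its only argument:

```sql
-- The simulated field is not part of the default column list.
SELECT * FROM mytable;

-- Attribute notation: requires the table name or alias as a prefix.
SELECT m.*, m.one_year_survival FROM mytable m;

-- Equivalent function notation, which is what the parser turns the line above into.
SELECT m.*, one_year_survival(m) FROM mytable m;

-- Without the prefix the name is looked up as a real column and the query fails:
-- SELECT one_year_survival FROM mytable;
```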

The benefit, however, is that the value can be proven never to get out of sync with the other values. Otherwise you end up with a rat's nest of check constraints.

You can actually take this approach quite far. See http://ledgersmbdev.blogspot.com/2012/08/postgresql-or-modelling-part-2-intro-to.html
