Original question: http://stackoverflow.com/questions/2596670/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow
How do you find the row count for all your tables in Postgres
Asked by mmrobins
I'm looking for a way to find the row count for all my tables in Postgres. I know I can do this one table at a time with:
SELECT count(*) FROM table_name;
but I'd like to see the row count for all the tables and then order by that to get an idea of how big all my tables are.
Answered by Greg Smith
There are three ways to get this sort of count, each with its own tradeoffs.
If you want a true count, you have to execute a SELECT statement like the one you used against each table. This is because PostgreSQL keeps row visibility information in the row itself, not anywhere else, so any accurate count can only be relative to some transaction. You're getting a count of what that transaction sees at the point in time when it executes. You could automate this to run against every table in the database, but you probably don't need that level of accuracy or want to wait that long.
The second approach notes that the statistics collector tracks roughly how many rows are "live" (not deleted or obsoleted by later updates) at any time. This value can be off by a bit under heavy activity, but is generally a good estimate:
SELECT schemaname,relname,n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
That can also show you how many rows are dead, which is itself an interesting number to monitor.
The third way is to note that the system ANALYZE command, which is executed by the autovacuum process regularly as of PostgreSQL 8.3 to update table statistics, also computes a row estimate. You can grab that one like this:
SELECT
  nspname AS schemaname,relname,reltuples
FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
  nspname NOT IN ('pg_catalog', 'information_schema') AND
  relkind='r'
ORDER BY reltuples DESC;
Which of these queries is better to use is hard to say. Normally I make that decision based on whether there's more useful information I also want to use inside of pg_class or inside of pg_stat_user_tables. For basic counting purposes just to see how big things are in general, either should be accurate enough.
Answered by a_horse_with_no_name
Here is a solution that does not require functions to get an accurate count for each table:
select table_schema,
       table_name,
       (xpath('/row/cnt/text()', xml_count))[1]::text::int as row_count
from (
  select table_name, table_schema,
         query_to_xml(format('select count(*) as cnt from %I.%I', table_schema, table_name), false, true, '') as xml_count
  from information_schema.tables
  where table_schema = 'public' --<< change here for the schema you want
) t
query_to_xml will run the passed SQL query and return an XML with the result (the row count for that table). The outer xpath() will then extract the count information from that XML and convert it to a number.
The derived table is not really necessary, but makes the xpath() a bit easier to understand; otherwise the whole query_to_xml() would need to be passed to the xpath() function.
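To make the extraction step more concrete, here is a minimal Python sketch of what the xpath() expression does to one XML value. The sample XML is an assumption about the shape query_to_xml returns for this query (a single row element with a cnt child), and the extract_count helper is hypothetical, not part of the answer.

```python
import xml.etree.ElementTree as ET

# Assumed shape of one query_to_xml(...) result for
# "select count(*) as cnt from ...": a <row> element with a <cnt> child.
SAMPLE_XML = """<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cnt>1306</cnt>
</row>"""


def extract_count(xml_text: str) -> int:
    """Mimic (xpath('/row/cnt/text()', xml_count))[1]::text::int."""
    row = ET.fromstring(xml_text)   # the root element is <row>
    cnt_text = row.findtext("cnt")  # text of the first <cnt> child, if any
    if cnt_text is None:
        raise ValueError("no <cnt> element found in the row")
    return int(cnt_text)


print(extract_count(SAMPLE_XML))
```

The point is only that the count travels as text inside the XML and has to be cast back to a number, which is what the ::text::int casts do in the query.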
Answered by Daniel Vérité
To get estimates, see Greg Smith's answer.
To get exact counts, the other answers so far are plagued with some issues, some of them serious (see below). Here's a version that's hopefully better:
CREATE FUNCTION rowcount_all(schema_name text default 'public')
  RETURNS table(table_name text, cnt bigint) as
$$
declare
  table_name text;
begin
  for table_name in SELECT c.relname FROM pg_class c
    JOIN pg_namespace s ON (c.relnamespace=s.oid)
    WHERE c.relkind = 'r' AND s.nspname=schema_name
  LOOP
    RETURN QUERY EXECUTE format('select cast(%L as text),count(*) from %I.%I',
       table_name, schema_name, table_name);
  END LOOP;
end
$$ language plpgsql;
It takes a schema name as parameter, or public if no parameter is given.
To work with a specific list of schemas or a list coming from a query without modifying the function, it can be called from within a query like this:
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
This produces three-column output with the schema, the table, and the row count.
Now here are some issues in the other answers that this function avoids:
Table and schema names shouldn't be injected into executable SQL without being quoted, either with quote_ident or with the more modern format() function with its %I format string. Otherwise some malicious person may name their table tablename;DROP TABLE other_table, which is perfectly valid as a table name.
Even without the SQL injection and funny characters problems, table names may exist in variants differing by case. If a table is named ABCD and another one abcd, the SELECT count(*) FROM... must use a quoted name, otherwise it will skip ABCD and count abcd twice. The %I of format() does this automatically.
information_schema.tables lists custom composite types in addition to tables, even when table_type is 'BASE TABLE' (!). As a consequence, we can't iterate on information_schema.tables, otherwise we risk having select count(*) from name_of_composite_type and that would fail. OTOH pg_class where relkind='r' should always work fine.
The type of COUNT() is bigint, not int. Tables with more than 2.15 billion rows may exist (running a count(*) on them is a bad idea, though).
A permanent type need not be created for a function to return a resultset with several columns. RETURNS TABLE(definition...) is a better alternative.
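The quoting point can be illustrated outside the database. The following is a rough Python sketch of identifier quoting, not PostgreSQL's actual implementation: the quote_ident and count_query helpers here are hypothetical, and this version always adds quotes, whereas Postgres' quote_ident and %I only quote when needed.

```python
def quote_ident(name: str) -> str:
    """Always-quote sketch of SQL identifier quoting.

    Embedded double quotes are doubled and the whole name is wrapped in
    double quotes, so the name can only ever be read as one identifier.
    """
    return '"' + name.replace('"', '""') + '"'


def count_query(schema: str, table: str) -> str:
    """Build a count statement with both identifiers quoted."""
    return f"select count(*) from {quote_ident(schema)}.{quote_ident(table)}"


# The hostile table name stays inside the quotes instead of ending the statement.
print(count_query("public", "tablename;DROP TABLE other_table"))
# Quoting also preserves case, so ABCD and abcd stay distinct tables.
print(count_query("public", "ABCD"))
print(count_query("public", "abcd"))
```

Inside the database, format() with %I remains the better choice; the sketch only shows why a quoted name cannot break out of the statement or be case-folded.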
Answered by ig0774
If you don't mind potentially stale data, you can access the same statistics used by the query optimizer.
Something like:
SELECT relname, n_tup_ins - n_tup_del as rowcount FROM pg_stat_all_tables;
Answered by Aur Saraf
The hacky, practical answer for people trying to evaluate which Heroku plan they need and can't wait for Heroku's slow row counter to refresh:
Basically you want to run \dt in psql, copy the results to your favorite text editor (it will look like this:
public | auth_group | table | axrsosvelhutvw
public | auth_group_permissions | table | axrsosvelhutvw
public | auth_permission | table | axrsosvelhutvw
public | auth_user | table | axrsosvelhutvw
public | auth_user_groups | table | axrsosvelhutvw
public | auth_user_user_permissions | table | axrsosvelhutvw
public | background_task | table | axrsosvelhutvw
public | django_admin_log | table | axrsosvelhutvw
public | django_content_type | table | axrsosvelhutvw
public | django_migrations | table | axrsosvelhutvw
public | django_session | table | axrsosvelhutvw
public | exercises_assignment | table | axrsosvelhutvw
), then run a regex search and replace like this:
^[^|]*\|\s+([^|]*?)\s+\| table \|.*$
to:
select '\1', count(*) from \1 union/g
which will yield you something very similar to this:
select 'auth_group', count(*) from auth_group union
select 'auth_group_permissions', count(*) from auth_group_permissions union
select 'auth_permission', count(*) from auth_permission union
select 'auth_user', count(*) from auth_user union
select 'auth_user_groups', count(*) from auth_user_groups union
select 'auth_user_user_permissions', count(*) from auth_user_user_permissions union
select 'background_task', count(*) from background_task union
select 'django_admin_log', count(*) from django_admin_log union
select 'django_content_type', count(*) from django_content_type union
select 'django_migrations', count(*) from django_migrations union
select 'django_session', count(*) from django_session
;
(You'll need to remove the last union and add the semicolon at the end manually)
Run it in psql and you're done.
?column? | count
--------------------------------+-------
auth_group_permissions | 0
auth_user_user_permissions | 0
django_session | 1306
django_content_type | 17
auth_user_groups | 162
django_admin_log | 9106
django_migrations | 19
[..]
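If you'd rather script the search-and-replace step than do it in an editor, something like the following Python sketch should work. It assumes the replacement refers to the captured table name with \1 backreferences, which is what the generated statements above imply; the build_count_query helper and the two sample lines are illustrative only.

```python
import re

# Two sample lines in the same shape as the \dt output shown earlier.
DT_OUTPUT = """public | auth_group | table | axrsosvelhutvw
public | auth_user | table | axrsosvelhutvw"""

# Same pattern as in the answer; MULTILINE makes ^ and $ work per line.
PATTERN = re.compile(r"^[^|]*\|\s+([^|]*?)\s+\| table \|.*$", re.MULTILINE)


def build_count_query(dt_output: str) -> str:
    """Turn \\dt output into one 'select ... union select ...;' statement."""
    # \1 is the captured table name: once as a label, once as the target table.
    replaced = PATTERN.sub(r"select '\1', count(*) from \1 union", dt_output)
    # Drop the "union" left dangling on the last line, then terminate.
    suffix = " union"
    if replaced.endswith(suffix):
        replaced = replaced[: -len(suffix)]
    return replaced + ";"


print(build_count_query(DT_OUTPUT))
```

Note that the table names are pasted in unquoted, so this is only suitable for names you already trust, which matches the spirit of the hack.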
Answered by Stew-au
Not sure if an answer in bash is acceptable to you, but FWIW...
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT table_name
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='public'
\""
TABLENAMES=$(export PGPASSWORD=test; eval "$PGCOMMAND")
for TABLENAME in $TABLENAMES; do
PGCOMMAND=" psql -h localhost -U fred -d mydb -At -c \"
SELECT '$TABLENAME',
count(*)
FROM $TABLENAME
\""
eval "$PGCOMMAND"
done
Answered by Yuri Levinsky
I usually don't rely on statistics, especially in PostgreSQL.
SELECT table_name, dsql2('select count(*) from '||table_name) as rownum
FROM information_schema.tables
WHERE table_type='BASE TABLE'
AND table_schema='livescreen'
ORDER BY 2 DESC;
CREATE OR REPLACE FUNCTION dsql2(i_text text)
RETURNS int AS
$BODY$
Declare
v_val int;
BEGIN
execute i_text into v_val;
return v_val;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
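One caveat when a helper like this returns int: count(*) yields bigint in PostgreSQL, while a 4-byte integer tops out at 2,147,483,647. The boundary can be sketched in plain Python arithmetic (the fits_in_int4 helper is hypothetical, for illustration only):

```python
INT4_MIN = -2**31     # lower bound of a 4-byte signed integer
INT4_MAX = 2**31 - 1  # 2,147,483,647, the upper bound


def fits_in_int4(row_count: int) -> bool:
    """True when a row count can be stored in a 4-byte signed integer."""
    return INT4_MIN <= row_count <= INT4_MAX


print(fits_in_int4(9106))           # a small table is fine
print(fits_in_int4(3_000_000_000))  # a three-billion-row table is not
```

For most databases this never matters, but declaring the variable and return type as bigint removes the question entirely.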
Answered by Gnanam
I don't remember the URL where I found this, but I hope it helps:
CREATE TYPE table_count AS (table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT
c.relname
FROM
pg_catalog.pg_class c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE
c.relkind = ''r''
AND n.nspname = ''public''
ORDER BY 1
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.relname
LOOP
END LOOP;
r.table_name := t_name.relname;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
Executing select count_em_all(); should get you the row count of all your tables.
Answered by Raju Sah
Two simple steps:
(Note: no need to change anything, just copy and paste)
1. Create the function
create function
cnt_rows(schema text, tablename text) returns integer
as
$body$
declare
result integer;
query varchar;
begin
query := 'SELECT count(1) FROM ' || schema || '.' || tablename;
execute query into result;
return result;
end;
$body$
language plpgsql;
2. Run this query to get the total row count across all the tables
select sum(cnt_rows) as total_no_of_rows from (select
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE') as subq;
or, to get the row count table by table:
select
table_schema,
table_name,
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
order by 3 desc;
Answered by Paul
I made a small variation to include all tables, including those in non-public schemas.
CREATE TYPE table_count AS (table_schema TEXT,table_name TEXT, num_rows INTEGER);
CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count AS '
DECLARE
the_count RECORD;
t_name RECORD;
r table_count%ROWTYPE;
BEGIN
FOR t_name IN
SELECT table_schema,table_name
FROM information_schema.tables
where table_schema !=''pg_catalog''
and table_schema !=''information_schema''
ORDER BY 1,2
LOOP
FOR the_count IN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.table_schema||''.''||t_name.table_name
LOOP
END LOOP;
r.table_schema := t_name.table_schema;
r.table_name := t_name.table_name;
r.num_rows := the_count.count;
RETURN NEXT r;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;
Use select count_em_all(); to call it.
Hope you find this useful. Paul