postgresql 将多个 CSV 文件复制到 postgres
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/18533625/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Copy multiple CSV files into postgres
提问by More Than Five
I am writing a SQL script to copy multiple .CSV files into a postgres database like this:
我正在编写一个 SQL 脚本来将多个 .CSV 文件复制到一个 postgres 数据库中,如下所示:
COPY product(title, department) from 'ys.csv' CSV HEADER;
I have multiple files I want to copy in. I don't want:
我有多个文件要复制。我不想要:
COPY product(title, department) from 'ys1.csv' CSV HEADER;
COPY product(title, department) from 'ys2.csv' CSV HEADER;
COPY product(title, department) from 'ys3.csv' CSV HEADER;
COPY product(title, department) from 'ys4.csv' CSV HEADER;
COPY product(title, department) from 'ys5.csv' CSV HEADER;
I would like to use a for loop for this instead of multiple copy commands. Is this possible? Thanks
我想为此使用 for 循环而不是多个复制命令。这可能吗?谢谢
回答by Clodoaldo Neto
In linux pipe the output of the files listing to psql. Make copy
use the standard input:
在 linux 中,将文件列表的输出通过管道传送到 psql。请copy
使用标准输入:
cat /path_to/ys*.csv | psql -c 'COPY product(title, department) from stdin CSV HEADER'
Look for the equivalent in other OSs
在其他操作系统中寻找等价物
回答by Rooie3000
I tried the answer above but I got an error when working with more then one file. I think on the second file it didn't cut off the header.
我尝试了上面的答案,但是在处理多个文件时出现错误。我认为在第二个文件中它没有切断标题。
This worked fot me:
这对我有用:
# get filenames
IMPFILES=(path/FileNamepart.csv)
# import the files
for i in ${IMPFILES[@]}
do
psql -U user -d database -c "\copy TABLE_NAME from '$i' DELIMITER ';' CSV HEADER"
# move the imported file
mv $i /FilePath
done
In my case I move every file afer it is imported. If an error occours I know where to look. And I can run the script again if there are new files put in that location.
在我的情况下,我会在导入每个文件后移动它。如果发生错误,我知道该往哪里看。如果该位置有新文件,我可以再次运行该脚本。
回答by Jeff
You can loop through the filenames using pg_ls_dir.
您可以使用 pg_ls_dir 遍历文件名。
DO $$
DECLARE file_path TEXT; -- Path where your CSV files are
DECLARE fn_i TEXT; -- Variable to hold name of current CSV file being inserted
DECLARE mytable TEXT; -- Variable to hold name of table to insert data into
BEGIN
file_path := 'C:/Program Files/PostgreSQL/9.6/data/my_csvs/'; -- Declare the path to your CSV files. You probably need to put this in your PostgreSQL file path to avoid permission issues.
mytable := 'product(title,department)'; -- Declare table to insert data into. You can give columns too since it's just going into an execute statement.
CREATE TEMP TABLE files AS
SELECT file_path || pg_ls_dir AS fn -- get all of the files in the directory, prepending with file path
FROM pg_ls_dir(file_path);
LOOP
fn_i := (select fn from files limit 1); -- Pick the first file
raise notice 'fn: %', fn_i;
EXECUTE 'COPY ' || mytable || ' from ''' || fn_i || ''' with csv header';
DELETE FROM files WHERE fn = fn_i; -- Delete the file just inserted from the queue
EXIT WHEN (SELECT COUNT(*) FROM files) = 0;
END LOOP;
END $$;
回答by Ryan
Starting with Postgres 9.3, you can run a shell command using the PROGRAM
keyword within the COPY
command.
从 Postgres 9.3 开始,您可以使用命令中的PROGRAM
关键字运行 shellCOPY
命令。
COPY product(title, department) from PROGRAM 'cat ys*.csv' FORMAT CSV HEADER
回答by Alexey Chibisov
Just one option more, using pg_ls_dir and format(). Inserting all files from 'E:\Online_Monitoring\Processed\' folder into ONLMON_T_Online_Monitoring table.
还有一种选择,使用 pg_ls_dir 和 format()。将“E:\Online_Monitoring\Processed\”文件夹中的所有文件插入到 ONLMON_T_Online_Monitoring 表中。
DO $$
DECLARE
directory_path VARCHAR(500);
rec RECORD;
BEGIN
directory_path := 'E:\Online_Monitoring\Processed\';
FOR rec IN SELECT pg_ls_dir(directory_path) AS file_name
LOOP
EXECUTE format(
'
COPY ONLMON_T_Online_Monitoring
(
item
, storeCode
, data
)
FROM %L
WITH (FORMAT CSV, HEADER);
', directory_path || rec.file_name
);
END LOOP;
END; $$;
回答by Blamas
If you want to use the PROGRAM
(Postgres > 9.3) keyword but you have the header in each csv file you can use awk
:
如果您想使用PROGRAM
(Postgres > 9.3) 关键字,但您可以使用每个 csv 文件中的标题awk
:
COPY product(title, department) FROM PROGRAM 'awk FNR-1 ys*.csv | cat' DELIMITER ',' CSV HEADER;