Import Excel Data into PostgreSQL 9.3

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/20039856/

Date: 2020-08-31 23:05:03  Source: igfitidea

sql excel postgresql

Asked by green

I've built a huge table in Excel and am now having trouble transferring it into a PostgreSQL database. I've downloaded the ODBC software and I'm able to open a table created in PostgreSQL with Excel. However, I'm not able to do the reverse, which is to create a table in Excel and open it in PostgreSQL. So I would like to know whether it can be done this way, or whether there are any alternative ways to create a large table with pgAdmin III, because inserting the data row by row is quite tedious.

Thanks in advance for all the help!

Answered by Chris Travers

The typical answer is this:

  1. In Excel, choose File/Save As, select CSV, and save your current sheet.

  2. Transfer the file to a holding directory on the Pg server that the postgres user can access.

  3. In PostgreSQL:

    COPY mytable FROM '/path/to/csv/file' WITH CSV HEADER; -- must be superuser
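
COPY loads into a table that already exists, with columns matching the CSV. As a rough sketch of one way to prepare such a table, the Python below derives a staging table definition from the CSV header row. It assumes every column is first loaded as text and cast later; `quote_ident` and `create_table_sql` are illustrative names, not part of PostgreSQL or any library.

```python
import csv
import io


def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(csv_text, table):
    """Build a CREATE TABLE statement with one text column per CSV header field."""
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = ", ".join(quote_ident(col.strip()) + " text" for col in header)
    return "CREATE TABLE {} ({});".format(quote_ident(table), columns)


# Hypothetical sample data standing in for the sheet saved from Excel.
sample = "id,name,unit price\n1,widget,2.50\n"
print(create_table_sql(sample, "mytable"))
```

Running the generated statement before the COPY gives it a table to load into; the text columns can then be converted with ALTER TABLE ... TYPE ... USING once the data has been checked.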
    
But there are other ways to do this too. PostgreSQL is an amazingly programmable database. These include:

  1. Write a module in pl/javaU, pl/perlU, or another untrusted language to access the file, parse it, and manage the structure.

  2. Use CSV and file_fdw to access it as a pseudo-table.

  3. Use DBI-Link and DBD::Excel.

  4. Write your own foreign data wrapper for reading Excel files.

The possibilities are literally endless....

Answered by alexkovelsky

You can also run \copy from the psql console, which avoids having to send the file to the PostgreSQL server machine. The command is otherwise the same:

\copy mytable [ ( column_list ) ] FROM '/path/to/csv/file' WITH CSV HEADER
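
The same client-side load can also be scripted. The sketch below assumes psycopg2 is installed and that `conn` is an already-open psycopg2 connection; `copy_sql` and `load_csv` are hypothetical helper names. It relies on psycopg2's `cursor.copy_expert`, which streams a local file over the connection much as \copy does, so the file only has to be readable on the client.

```python
def copy_sql(table, columns=None):
    """Build a COPY ... FROM STDIN statement; names are used exactly as given."""
    column_list = " ({})".format(", ".join(columns)) if columns else ""
    return "COPY {}{} FROM STDIN WITH CSV HEADER".format(table, column_list)


def load_csv(conn, csv_path, table, columns=None):
    """Stream a local CSV file to the server over an open psycopg2 connection."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        with conn.cursor() as cur:
            cur.copy_expert(copy_sql(table, columns), f)
    conn.commit()


print(copy_sql("mytable", ["id", "name"]))
```

A call would look like `load_csv(psycopg2.connect("dbname=mydb"), "/path/to/csv/file", "mytable")`, where the connection string is a placeholder.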

Answered by sayth

For Python, you could use openpyxl, which handles the Excel 2010 and newer file formats (xlsx).

Al Sweigart has a full tutorial in Automate the Boring Stuff on working with Excel spreadsheets. It's very in-depth, and the whole book and the accompanying Udemy course are great resources.

From his example:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> wb.sheetnames  # get_sheet_names() in older openpyxl releases
['Sheet1', 'Sheet2', 'Sheet3']
>>> sheet = wb['Sheet3']  # get_sheet_by_name('Sheet3') in older releases
>>> sheet
<Worksheet "Sheet3">

Once you have this access, you can use psycopg to write the data to Postgres as you normally would.
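
A minimal sketch of that hand-off, under several assumptions: openpyxl and psycopg2 are installed, `conn` is an open psycopg2 connection, the target table already exists, and the first worksheet row holds column names that are valid SQL identifiers. `insert_sql` and `load_sheet` are hypothetical helper names.

```python
def insert_sql(table, columns):
    """Build a parameterized INSERT; psycopg2 fills in the %s placeholders."""
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(table, ", ".join(columns), placeholders)


def load_sheet(conn, xlsx_path, sheet_name, table):
    """Copy one worksheet into an existing table, using row 1 as the column names."""
    import openpyxl  # imported here so insert_sql can be used without openpyxl

    wb = openpyxl.load_workbook(xlsx_path, read_only=True)
    rows = wb[sheet_name].iter_rows(values_only=True)
    columns = [str(name) for name in next(rows)]
    with conn.cursor() as cur:
        cur.executemany(insert_sql(table, columns), list(rows))
    conn.commit()


print(insert_sql("mytable", ["id", "name"]))
```

Passing the cell values as parameters, rather than pasting them into the SQL text, leaves quoting and escaping to the driver. For very large sheets, a CSV export followed by COPY is likely to be faster than row-by-row inserts.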

There is also a list of Python resources at python-excel, and xlwings provides a large array of features for using Python in place of VBA in Excel.

Answered by Mike T

A method that I use is to load the table into R as a data.frame, then use dbWriteTable to push it to PostgreSQL. These two steps are shown below.

Load Excel data into R

R's data.frame objects are database-like, where named columns have explicit types, such as text or numbers. There are several ways to get a spreadsheet into R, such as XLConnect. However, a really simple method is to select the range of the Excel table (including the header), copy it (i.e. CTRL+C), then in R use this command to get it from the clipboard:

d <- read.table("clipboard", header=TRUE, sep="\t", quote="\"", na.strings="", as.is=TRUE)

If you have RStudio, you can easily view the d object to make sure it is as expected.

Push it to PostgreSQL

Ensure you have RPostgreSQL installed from CRAN, then make a connection and send the data.frame to the database:

library(RPostgreSQL)
conn <- dbConnect(PostgreSQL(), dbname="mydb")

dbWriteTable(conn, "some_table_name", d)

Now some_table_name should appear in the database.

Some common clean-up steps can be done from pgAdmin or psql:

ALTER TABLE some_table_name RENAME "row.names" TO id;
ALTER TABLE some_table_name ALTER COLUMN id TYPE integer USING id::integer;
ALTER TABLE some_table_name ADD PRIMARY KEY (id);

Answered by Ben

As explained here http://www.postgresonline.com/journal/categories/journal/archives/339-OGR-foreign-data-wrapper-on-Windows-first-taste.html

With the ogr_fdw module, it's possible to open the Excel sheet as a foreign table in pgsql and query it directly like any other regular table in pgsql. This is useful for reading data from the same regularly updated table.

To do this, the table header in your spreadsheet must be clean: the current ogr_fdw driver can't deal with wide characters, new lines, and the like. With these characters in a header, you will probably not be able to reference the column in pgsql due to encoding issues. (This is the major reason I can't use this wonderful extension.)
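
One way to get clean headers is to normalize the header row before pointing the wrapper at the file. The Python sketch below is not part of ogr_fdw; the naming rules (lowercase ASCII, underscores, a positional fallback such as `col_3`) are assumptions chosen for illustration, and `clean_header` is a hypothetical helper.

```python
import re
import unicodedata


def clean_header(name, position):
    """Reduce a spreadsheet header to a lowercase ASCII identifier."""
    # Decompose accented and full-width characters, then drop anything non-ASCII.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Collapse line breaks, spaces and punctuation into single underscores.
    ident = re.sub(r"[^0-9A-Za-z]+", "_", ascii_name).strip("_").lower()
    # Fall back to a positional name when nothing usable remains.
    if not ident:
        return "col_{}".format(position)
    # Avoid identifiers that start with a digit.
    return "c_" + ident if ident[0].isdigit() else ident


headers = ["Unit\nPrice", "Café name", "数量", "2020 total"]
print([clean_header(h, i) for i, h in enumerate(headers, start=1)])
# prints ['unit_price', 'cafe_name', 'col_3', 'c_2020_total']
```

The cleaned names would then be written back to the first row of the sheet (for example with openpyxl) before running ogr_fdw_info against the file.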

The ogr_fdw pre-built binaries for Windows are located here: http://winnie.postgis.net/download/windows/pg96/buildbot/extras/ (change the version number in the link to download the corresponding builds). Extract the files into the pgsql folder, overwriting the sub-folders of the same name, and restart pgsql. Before the test drive, the module needs to be installed by executing:

CREATE EXTENSION ogr_fdw;

Usage in brief:

  1. Use ogr_fdw_info.exe to probe the Excel file for its list of sheet names:

    ogr_fdw_info -s "C:/excel.xlsx"

  2. Use "ogr_fdw_info.exe -l" to probe an individual sheet and generate the table definition code:

    ogr_fdw_info -s "C:/excel.xlsx" -l "sheetname"

Execute the generated definition code in pgsql. A foreign table is created and mapped to your Excel file, and it can be queried like a regular table.

This is especially useful if you have many small files with the same table structure. Changing the path and name in the definition and then updating the definition is enough.

This plugin supports both XLSX and XLS files. According to the documentation, it is also possible to write data back into the spreadsheet file, but all the fancy formatting in your Excel file will be lost, because the file is recreated on write.

If the Excel file is huge, this will not work, because the extension loads the data all at once; that is another reason I didn't use it. However, the extension also supports the ODBC interface, so it should be possible to use the Windows ODBC Excel file driver to create an ODBC source for the Excel file, and then use ogr_fdw or any other ODBC foreign data wrapper for pgsql to query this intermediate ODBC source. This should be fairly stable.

The downside is that you can't change file location or name easily within pgsql like in the previous approach.

A friendly reminder: permission issues apply to this fdw extension. Since it is loaded into the pgsql service, pgsql must have access privileges to the Excel files.

Answered by d.a.yantis

I have used Excel/PowerPivot to create the PostgreSQL insert statement. It seems like overkill, except when you need to do it over and over again. Once the data is in the PowerPivot window, I add successive columns with concatenate statements to 'build' the insert statement. I create a flattened pivot table with that last and final column, then copy and paste the resulting insert statements into my EXISTING PostgreSQL table with pgAdmin.

Example two-column table (my table has 30 columns from which I import successive contents over and over with the same Excel/PowerPivot):

Column1 {a,b,...} Column2 {1,2,...}

In PowerPivot I add calculated columns with the following commands:

Calculated Column 1 has "insert into table_name values ('"

Calculated Column 2 has CONCATENATE([Calculated Column 1],CONCATENATE([Column1],"','"))

...until you get to the last column and you need to terminate the insert statement:

Calculated Column 3 has CONCATENATE([Calculated Column 2],CONCATENATE([Column2],"');"))

Then in PowerPivot I add a flattened pivot table, which gives me all of the insert statements; I just copy and paste them into pgAdmin.

Resulting insert statements:

insert into table_name values ('a','1');

insert into table_name values ('b','2');

insert into table_name values ('c','3');

NOTE: If you are familiar with the PowerPivot CONCATENATE function, you know that it can only handle 2 arguments (nuts). It would be nice if it allowed more.
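
The same statement-building idea can be sketched outside Excel. The Python below is an illustration rather than the author's method: it joins any number of values per row and, unlike the plain concatenation above, doubles embedded single quotes so that a value such as O'Brien does not break the statement. It assumes PostgreSQL's default standard_conforming_strings setting; `sql_literal` and `insert_statement` are hypothetical helper names.

```python
def sql_literal(value):
    """Render a value as a SQL string literal, doubling embedded single quotes."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def insert_statement(table, row):
    """Build one INSERT statement for a row of values."""
    return "insert into {} values ({});".format(table, ",".join(sql_literal(v) for v in row))


rows = [("a", 1), ("b", 2), ("O'Brien", 3)]
for row in rows:
    print(insert_statement("table_name", row))
```

The first two rows reproduce the statements shown above; the third comes out as insert into table_name values ('O''Brien','3');.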

Answered by Andrey

You can load the Excel file content by writing Java code that uses the Apache POI library (https://poi.apache.org/). The library is developed for working with MS Office application data, including Excel.

I recently created an application based on this technology that will help you load Excel files into a Postgres database. The application is available at http://www.abespalov.com/. It has been tested only on Windows, but should work on Linux as well.

The application automatically creates the necessary tables with the same columns as in the Excel files and populates the tables with content. You can export several files in parallel, and you can skip the step of converting the files into CSV format. The application handles the xls and xlsx formats.

The overall application stages are:

  1. Load the Excel file content. Here is the code, which depends on the file extension:

fileExtension = FilenameUtils.getExtension(inputSheetFile.getName());
// .xlsx files are opened as OPC packages; anything else goes through the legacy .xls path
if (fileExtension.equalsIgnoreCase("xlsx")) {
    workbook = createWorkbook(openOPCPackage(inputSheetFile));
} else {
    workbook = createWorkbook(openNPOIFSFileSystemPackage(inputSheetFile));
}

// only the first sheet of the workbook is read
sheet = workbook.getSheetAt(0);

  2. Establish a Postgres JDBC connection.
  3. Create a Postgres table.
  4. Iterate over the sheet and insert rows into the table. Here is a piece of the Java code:

Iterator<Row> rowIterator = InitInputFilesImpl.sheet.rowIterator();

// skip the header row
if (rowIterator.hasNext()) {
    rowIterator.next();
}
while (rowIterator.hasNext()) {
    Row row = rowIterator.next();
    // insert the row into the table
}

Here you can find all the Java code for the application created for exporting Excel to Postgres: https://github.com/palych-piter/Excel2DB

Answered by BERA

It is possible using ogr2ogr:

"C:\Program Files\PostgreSQL\bin\ogr2ogr.exe" -f "PostgreSQL" PG:"host=someip user=someuser dbname=somedb password=somepw" C:/folder/excelfile.xlsx -nln newtablenameinpostgres -oo AUTODETECT_TYPE=YES

(Not sure if ogr2ogr is included in the Postgres installation or if I got it with the PostGIS extension.)

Answered by Hoque MD Zahidul

You can do that easily with DataGrip.

  1. First save your Excel file in CSV format: open the Excel file, then use Save As and choose CSV.
  2. Go to DataGrip and create the table structure according to the CSV file. It is suggested to name the columns after the Excel column names.
  3. Right-click the table name in the list of tables in your database, then click the import data from file option and select the converted CSV file.
