Oracle PL/SQL 加载 XML
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/5298896/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle PL/SQL to Load XML
提问by diagonalbatman
I don't know the best way to ask this, but let me explain the issue and it may help.
我不知道问这个问题的最佳方式,但让我解释一下这个问题,它可能会有所帮助。
We currently have a feed of data that is distributed to us using an Oracle OAI hub. The data is fed to us using DBLinks.
我们目前有一个使用 Oracle OAI 中心分发给我们的数据源。数据是使用 DBLinks 提供给我们的。
The information provider is going to be upgrading to Oracle ODI, and for numerous reasons are mandating that all data transfer is conducted using encrypted XML files over SFTP.
信息提供者将升级到 Oracle ODI,并且出于多种原因要求所有数据传输都使用加密的 XML 文件通过 SFTP 进行。
Now this introduces a new issue for ourselves as data recipient, as we now need to amend our systems to load XML data into the tables that once were populated by DBLinks.
现在这给我们作为数据接收者带来了一个新问题,因为我们现在需要修改我们的系统以将 XML 数据加载到曾经由 DBLink 填充的表中。
The set-up we currently have is:
我们目前的设置是:
- Oracle 10g (10.2.0.4)
- Oracle is running on Unix (HP-UX)
- Numerous Win2k3 servers controlling interface / ETL flow.
- Oracle 10g (10.2.0.4)
- Oracle 在 Unix (HP-UX) 上运行
- 许多 Win2k3 服务器控制接口/ETL 流程。
So currently, the OAI hub will place data into a number of tables in our ETL layer. Say for example "PERSON". The structure of this table i dont believe is relevant...
所以目前,OAI 中心会将数据放入我们 ETL 层的多个表中。比如说“人”。我不认为这张表的结构是相关的...
When the data has been loaded into the PERSON table, the OAI delivery will transmit an End-of-file marker to the database, this is stored in table "EOF" - it holds a count of recieved records expected in "PERSON".
当数据已加载到 PERSON 表中时,OAI 交付会将文件结束标记传输到数据库,该标记存储在表“EOF”中——它保存了预期在“PERSON”中接收到的记录的计数。
The windows server has a batch process that polls every 30 seconds, this checks to see if an EOF record exists, if it does then it kicks off our ETL processing.
Windows 服务器有一个批处理,每 30 秒轮询一次,这会检查是否存在 EOF 记录,如果存在,则启动我们的 ETL 处理。
I want to avoid changing this part of the system if possible, so what i am proposing as a solution is to parse the new XML files and load them into the Oracle database, the question is what is the best way to do this:
如果可能,我想避免更改系统的这一部分,因此我提出的解决方案是解析新的 XML 文件并将它们加载到 Oracle 数据库中,问题是最好的方法是什么:
- The XML files will be delivered to the Windows Servers
- Preference would be to use PL/SQL to load the data
- XML 文件将传送到 Windows 服务器
- 优先使用 PL/SQL 加载数据
What is the best way to load the data into the Oracle (Unix) database, with the source data being on the Windows side.
将数据加载到 Oracle (Unix) 数据库中的最佳方法是什么,源数据在 Windows 端。
I want to avoid having to use any Unix scripting if possible - as my development team don't have enough Unix experience to let them loose on this.
如果可能的话,我想避免使用任何 Unix 脚本——因为我的开发团队没有足够的 Unix 经验来让他们放松。
Any suggestions will be grateful.
任何建议将不胜感激。
The XML format is as below - and element names map to column names on DB:
XML 格式如下 - 元素名称映射到 DB 上的列名称:
<PERSON>
<HEADER>
<Creator>~</Creator>
<DigitalSigniture>~</DigitalSigniture>
<Owner>~</Owner>
<Title>~</Title>
<Marking>~</Marking>
</HEADER>
<PERSONS>
<EMPLOYEE_NUMBER>~</EMPLOYEE_NUMBER>
<FIRST_NAME>~</FIRST_NAME>
<LAST_NAME>~</LAST_NAME>
......
</PERSONS>
<PERSON>
EDIT: I am also conscious of volumes, on a normal (average) day i will be processing about 80,000 XML records, and on an exceptional day i will processing up to 300,000 (typically once per year).
编辑:我也知道数量,在正常(平均)的一天我将处理大约 80,000 个 XML 记录,在特殊的一天我将处理多达 300,000(通常每年一次)。
回答by Rob van Wijk
To shred your XML and load them into Oracle-tables, you can use the technique I described in this blogpost: http://rwijk.blogspot.com/2010/03/shredding-xml-into-multiple-tables-in.html
要粉碎 XML 并将它们加载到 Oracle 表中,您可以使用我在这篇博文中描述的技术:http://rwijk.blogspot.com/2010/03/shredding-xml-into-multiple-tables-in.html
Regards,
Rob.
问候,
罗伯。
Some links for loading the files:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb25loa.htm#ADXDB2900
http://www.oracle-developer.net/display.php?id=416
加载文件的一些链接:http:
//download.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb25loa.htm#ADXDB2900
http://www.oracle-developer.net/display.php?编号=416
回答by Anton Prokofiev
I would offer to forget about build-in Oracle XML features and use any normal language to process XML files on the "client side" and submit ready to use data to the DB. (Look for Python or .NET)
我愿意忘记内置的 Oracle XML 特性,并使用任何普通语言在“客户端”处理 XML 文件并将准备使用的数据提交给数据库。(寻找 Python 或 .NET)
It could be a security risk, if you allows to access DB server file system from network.
如果您允许从网络访问数据库服务器文件系统,则可能存在安全风险。
80 000 records a day should not be a big problem for Oracle. We do much more close to real time.
每天 80 000 条记录对于 Oracle 来说应该不是什么大问题。我们做得更接近实时。