C# NHibernate bulk insert or update
Note: this page reproduces a popular Stack Overflow question and its accepted answer under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/560584/
NHibernate bulk insert or update
Asked by Pedro Santos
Hi, I'm working on a project where we need to process several XML files once a day and populate a database with the information contained in those files.
Each file is roughly 1 MB and contains about 1000 records; we usually need to process between 12 and 25 of these files. I've seen some information regarding bulk inserts using NHibernate, but our problem is somewhat trickier since the XML files contain new records mixed with updated records.
In the XML there is a flag that tells us whether a specific record is a new one or an update to an existing record, but not what information has changed. The XML records do not contain our DB identifier, but we can use an identifier from the XML record to uniquely locate a record in our DB.
Our strategy so far has been to determine whether the current record is an insert or an update. For an insert, we perform an insert on the DB. For an update, we search for the existing record, update the object with the information coming from the XML record, and finally perform an update on the DB.
The problem with our current approach is that we are having issues with DB locks and our performance degrades really fast. We have thought about some alternatives, such as separate tables for the distinct operations or even separate DBs, but such a move would mean a big effort. So before making any decisions I would like to ask for the community's opinion on this matter. Thanks in advance.
Accepted answer by Mauricio Scheffer
A couple of ideas:
- Always try to use IStatelessSession for bulk operations.
- If you're still not happy with the performance, just skip NHibernate and use a stored procedure or parameterized query specific to this, or use IQuery.ExecuteUpdate().
- If you're using SQL Server, you could convert your XML format to BCPFORMAT XML and then run BULK INSERT on it (only for insertions).
- If you're having too many DB locks, try grouping the operations (i.e. first find out what needs to be inserted and what needs to be updated, then get PKs for the updates, then run BULK INSERT for insertions, then run updates).
- If parsing the source files is a performance issue (i.e. it maxes out a CPU core), try doing it in parallel (you could use Parallel Extensions).
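To make the first, second and fourth ideas a bit more concrete, here is a minimal sketch of what grouping the operations inside an IStatelessSession could look like. The `Customer` entity, the `XmlRecord` type and their properties are hypothetical names made up for illustration (the question does not show its model), and the sketch assumes `Customer` is already mapped in NHibernate. It is also a simplification of the grouping idea: instead of fetching the PKs first, it updates by the identifier from the XML through an HQL update run with IQuery.ExecuteUpdate(), and it uses plain stateless inserts rather than BULK INSERT.

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;

// Hypothetical entity; assumed to be mapped in NHibernate already.
public class Customer
{
    public virtual int Id { get; set; }             // our DB identifier
    public virtual string ExternalId { get; set; }  // identifier carried by the XML record
    public virtual string Name { get; set; }
}

// Hypothetical shape of one parsed XML record.
public class XmlRecord
{
    public string ExternalId { get; set; }
    public bool IsNew { get; set; }   // the insert/update flag from the XML
    public string Name { get; set; }
}

public static class DailyImport
{
    // Step 1: decide up front what is an insert and what is an update.
    public static (List<XmlRecord> Inserts, List<XmlRecord> Updates) Split(IEnumerable<XmlRecord> records)
    {
        var all = records.ToList();
        return (all.Where(r => r.IsNew).ToList(), all.Where(r => !r.IsNew).ToList());
    }

    public static void Run(ISessionFactory factory, IEnumerable<XmlRecord> records)
    {
        var (inserts, updates) = Split(records);

        // A stateless session has no first-level cache and no dirty checking,
        // so nothing accumulates in memory while the batch runs.
        using (IStatelessSession session = factory.OpenStatelessSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            // Step 2: all insertions together.
            foreach (var r in inserts)
            {
                session.Insert(new Customer { ExternalId = r.ExternalId, Name = r.Name });
            }

            // Step 3: all updates together, keyed on the XML identifier,
            // so no entity has to be loaded before it is updated.
            foreach (var r in updates)
            {
                session.CreateQuery("update Customer c set c.Name = :name where c.ExternalId = :externalId")
                    .SetParameter("name", r.Name)
                    .SetParameter("externalId", r.ExternalId)
                    .ExecuteUpdate();
            }

            tx.Commit();
        }
    }
}
```

Whether one transaction per file or one per day's batch works better for the locking problem is something you would have to measure. The sketch only shows the shape of the approach.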
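For the last idea, parsing in parallel, a rough sketch with PLINQ (the part of Parallel Extensions that ships with .NET 4 and later) might look like the following. The XML layout (`record` elements with `id` and `new` attributes and a `name` child) and the `ParsedRecord` type are assumptions for illustration only, since the question does not show the real file format.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical result type for one parsed record.
public class ParsedRecord
{
    public string ExternalId { get; set; }
    public bool IsNew { get; set; }
    public string Name { get; set; }
}

public static class ParallelParser
{
    // Parses a single document. Assumed layout:
    // <records><record id="A1" new="true"><name>Ann</name></record></records>
    public static List<ParsedRecord> ParseOne(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("record")
            .Select(e => new ParsedRecord
            {
                ExternalId = (string)e.Attribute("id"),
                IsNew = (bool)e.Attribute("new"),
                Name = (string)e.Element("name")
            })
            .ToList();
    }

    // Parses all documents, spreading the work across the available cores.
    // The order of the combined result is not guaranteed to match the input order.
    public static List<ParsedRecord> ParseAll(IEnumerable<string> xmlDocuments)
    {
        return xmlDocuments
            .AsParallel()
            .SelectMany(xml => ParseOne(xml))
            .ToList();
    }
}
```

Only the parsing is parallelized here. The database writes would still go through a single session afterwards, which keeps the locking behaviour easy to reason about.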