Vba 将休 csv 文件的子部分导入 excel 2010
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/21790795/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Vba to import a sub-portion of a hugh csv file into excel 2010
提问by Fantrf
I have a csv file that has approx 600 fields and approx 100k of rows, i would like to import only select fields and only certian rows where a select set of fields match a certain set of criteria into an existing excel worksheet tab
我有一个 csv 文件,它有大约 600 个字段和大约 100k 行,我想只导入选择字段和仅某些行,其中一组选择的字段与一组特定的条件匹配到现有的 excel 工作表选项卡
I attempted to use ms query within excel but it stops at 255 columns, i can import the whole file in excel 2010 (250m) but it is a memory hog and by the time i remove the unneeded fields and rows it locks up my computer.
我试图在 excel 中使用 ms 查询,但它在 255 列处停止,我可以在 excel 2010 (250m) 中导入整个文件,但它是一个内存猪,当我删除不需要的字段和行时,它锁定了我的计算机。
I would like to kick the import process off with an excel vba macro. I have all the front end code of file selection, etc.... But need some assistance in the text read query convert to excel area of vba
我想用 excel vba 宏启动导入过程。我有文件选择的所有前端代码,等等......但在文本读取查询转换为vba的excel区域需要一些帮助
Any assitance would be greatly appreciated
任何帮助将不胜感激
Thanks
谢谢
Tom
汤姆
回答by Brandon R. Gates
For that many records you would be better off importing the .csv into Microsoft Access, indexing some fields, writing a query that contains only what you want, and then exporting to Excel from the query.
对于这么多记录,最好将 .csv 导入 Microsoft Access,为某些字段编制索引,编写仅包含所需内容的查询,然后从查询导出到 Excel。
If you really need an Excel-only solution, do the following:
如果您确实需要仅限 Excel 的解决方案,请执行以下操作:
Open up the VBA editor. Navigate to Tools -> References. Select the most recent ActiveX Data Objects Library. (ADO for short). On my XP machine running Excel 2003, it's version 2.8.
Create a module if you don't have one already. Or create one anyway to contain the code at the bottom of this post.
In any blank worksheet paste the following values starting at cell A1:
打开 VBA 编辑器。导航到工具 -> 参考。选择最新的 ActiveX 数据对象库。(简称ADO)。在我运行 Excel 2003 的 XP 机器上,它是 2.8 版。
如果您还没有模块,请创建一个模块。或者无论如何创建一个来包含这篇文章底部的代码。
在任何空白工作表中,从单元格 A1 开始粘贴以下值:
SELECT Field1, Field2 FROM C:\Path\To\file.csv WHERE Field1 = 'foo' ORDER BY Field2
SELECT Field1, Field2 FROM C:\Path\To\file.csv WHERE Field1 = 'foo' ORDER BY Field2
(Formatting issues here. select
from
, etc should each be in their own row in col A for reference. The other stuff are the important bits and should go in column B.)
(这里的格式问题。 select
from
等都应该在列 A 中各自的行中以供参考。其他东西是重要的部分,应该放在 B 列中。)
Amend the input fields as appropriate for your filename and query requirements, then run thegetCsv()
subroutine. It will put the results in a QueryTable object starting at cell C6.
根据文件名和查询要求修改输入字段,然后运行getCsv()
子例程。它将结果放在从单元格 C6 开始的 QueryTable 对象中。
I personally hate QueryTables but the .CopyFromRecordset
method I prefer to use with ADO doesn't give you field names. I left the code for that method in, commented out, so you can investigate that way. If you use it, you can get rid of the call to deleteQueryTables()
because it's a really ugly hack, it deletes whole columns which you may not like, etc.
我个人讨厌 QueryTables,但.CopyFromRecordset
我更喜欢与 ADO 一起使用的方法不会为您提供字段名称。我留下了那个方法的代码,注释掉了,这样你就可以调查了。如果你使用它,你可以摆脱对的调用,deleteQueryTables()
因为它是一个非常丑陋的黑客,它会删除你可能不喜欢的整个列,等等。
Happy coding.
快乐编码。
Option Explicit
Function ExtractFileName(filespec) As String
' Returns a filename from a filespec
Dim x As Variant
x = Split(filespec, Application.PathSeparator)
ExtractFileName = x(UBound(x))
End Function
Function ExtractPathName(filespec) As String
' Returns the path from a filespec
Dim x As Variant
x = Split(filespec, Application.PathSeparator)
ReDim Preserve x(0 To UBound(x) - 1)
ExtractPathName = Join(x, Application.PathSeparator) & Application.PathSeparator
End Function
Sub getCsv()
Dim cnCsv As New ADODB.Connection
Dim rsCsv As New ADODB.Recordset
Dim strFileName As String
Dim strSelect As String
Dim strWhere As String
Dim strOrderBy As String
Dim strSql As String
Dim qtData As QueryTable
strSelect = ActiveSheet.Range("B1").Value
strFileName = ActiveSheet.Range("B2").Value
strWhere = ActiveSheet.Range("B3").Value
strOrderBy = ActiveSheet.Range("B4").Value
strSql = "SELECT " & strSelect
strSql = strSql & vbCrLf & "FROM " & ExtractFileName(strFileName)
If strWhere <> "" Then strSql = strSql & vbCrLf & "WHERE " & strWhere
If strOrderBy <> "" Then strSql = strSql & vbCrLf & "ORDER BY " & strOrderBy
With cnCsv
.Provider = "Microsoft.Jet.OLEDB.4.0"
.ConnectionString = "Data Source=" & ExtractPathName(strFileName) & ";" & _
"Extended Properties=""text;HDR=yes;FMT=Delimited(,)"";Persist Security Info=False"
.Open
End With
rsCsv.Open strSql, cnCsv, adOpenForwardOnly, adLockReadOnly, adCmdText
'ActiveSheet.Range("C6").CopyFromRecordset rsCsv
Call deleteQueryTables
Set qtData = ActiveSheet.QueryTables.Add(rsCsv, ActiveSheet.Range("C6"))
qtData.Refresh
rsCsv.Close
Set rsCsv = Nothing
cnCsv.Close
Set cnCsv = Nothing
End Sub
Sub deleteQueryTables()
On Error Resume Next
With Application
.ScreenUpdating = False
.Calculation = xlCalculationManual
End With
Dim qt As QueryTable
Dim qtName As String
Dim nName As Name
For Each qt In ActiveSheet.QueryTables
qtName = qt.Name
qt.Delete
For Each nName In Names
If InStr(1, nName.Name, qtName) > 0 Then
Range(nName.Name).EntireColumn.Delete
nName.Delete
End If
Next nName
Next qt
With Application
.ScreenUpdating = True
.Calculation = xlCalculationAutomatic
End With
End Sub
回答by Graham Anderson
You can parse your input file extracting the lines that conform to your criteria. The following code uses the split function on each line of the CSV file to separate the fields and then checks to see if it matches the required criteria. If all the criteria match then selected fields are saved in a new CSV file then you can just open the smaller file. You will need to set the microsoft scripting runtime reference in the VBA editor for this to work.
您可以解析您的输入文件,提取符合您标准的行。下面的代码在 CSV 文件的每一行上使用 split 函数来分隔字段,然后检查它是否符合所需的条件。如果所有条件都匹配,则所选字段将保存在新的 CSV 文件中,然后您只需打开较小的文件即可。您需要在 VBA 编辑器中设置 microsoft 脚本运行时引用才能使其工作。
This method should use little memory as it processes 1 line at a time, I tested it on data of 600 fields and 100000 lines and it took about 45 seconds to process the file with no noticable increase in RAM usage in windows task manager. It is CPU intensive and the time taken would increase as the complexity data, conditions and the number of fields copied increases.
这种方法应该使用很少的内存,因为它一次处理 1 行,我在 600 个字段和 100000 行的数据上对其进行了测试,处理文件大约需要 45 秒,而 Windows 任务管理器中的 RAM 使用率没有明显增加。它是 CPU 密集型的,并且随着复杂性数据、条件和复制的字段数量的增加,所花费的时间也会增加。
If you prefer to write directly to an existing sheet this can be easily acheived, but you would have to rememove any old data there first.
如果您更喜欢直接写入现有工作表,这很容易实现,但您必须先删除那里的所有旧数据。
Sub Extract()
Dim fileHandleInput As Scripting.TextStream
Dim fileHandleExtract As Scripting.TextStream
Dim fsoObject As Scripting.FileSystemObject
Dim sPath As String
Dim sFilenameExtract As String
Dim sFilenameInput As String
Dim myVariant As Variant
Dim bParse As Boolean 'To check if the line should be written
sFilenameExtract = "Exctract1.CSV"
sFilenameInput = "Input.CSV"
Set fsoObject = New FileSystemObject
sPath = ThisWorkbook.Path & "\"
'Check if this works ie overwrites existing file
If fsoObject.FileExists(sPath & sFilenameExtract) Then
Set fileHandleExtract = fsoObject.OpenTextFile(sPath & sFilenameExtract, ForWriting)
Else
Set fileHandleExtract = fsoObject.CreateTextFile((sPath & sFilenameExtract), True)
End If
Set fileHandleInput = fsoObject.OpenTextFile(sPath & sFilenameInput, ForReading)
'extracting headers for selected fields in this case the 1st, 2nd and 124th fields
myVariant = Split(fileHandleInput.ReadLine, ",")
fileHandleExtract.WriteLine (myVariant(0) & "," & _
myVariant(1) & "," & _
myVariant(123))
'Parse each line (row) of the inputfile
Do While Not fileHandleInput.AtEndOfStream
myVariant = Split(fileHandleInput.ReadLine, ",")
'Set bParse initially to true
bParse = True
'Check if the first element is greater than 123
If Not myVariant(0) > 123 Then bParse = False
'Check if second element is one of allowed values
'Trim used to remove pesky leading or lagging values when checking
Select Case Trim(myVariant(1))
Case "Red", "Yellow", "Green", "Blue", "Black"
'Do nothing as value found
Case Else
bParse = False 'As wasn't a value in the condition
End Select
'If the conditions were met by the line then write specific fields to extract file
If bParse Then
fileHandleExtract.WriteLine (myVariant(0) & "," & _
myVariant(1) & "," & _
myVariant(123))
End If
Loop
'close files and cleanup
fileHandleExtract.Close
fileHandleInput.Close
Set fileHandleExtract = Nothing
Set fileHandleInput = Nothing
Set fsoObject = Nothing
End Sub