VBA: save a file with UTF-8 without BOM

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/31435662/

Date: 2020-09-08 09:46:40  Source: igfitidea

Tags: vba, excel-vba, utf-8, vbscript, vb6

Asked by Julien

It's probably something simple; here is what I tried:

Set objStream = CreateObject("ADODB.Stream")
Set objStreamNoBOM = CreateObject("ADODB.Stream")

With objStream
    .Open
    .Charset = "UTF-8"
    .WriteText "aaaaaa"
    .Position = 0
End With

With objStreamNoBOM
    '.Charset = "Windows-1252"   ' WORK
    .Charset = "UTF-8"           ' DOESN'T WORK!!
    .Open
    .Type = 2
    .WriteText objStream.ReadText
    .SaveToFile "toto.php", 2
    .Close
End With
objStream.Close

If the charset is UTF-8, the file starts with the byte order mark (BOM), which shows up as garbage characters such as ï»¿ at the beginning of the file.

Any idea how to save a file as UTF-8 without a BOM?

Answered by Ekkehard.Horner

In the best of all possible worlds, the Related list would contain a reference to this question, which I found as the first hit for "vbscript adodb.stream bom vbscript site:stackoverflow.com".

Based on the second strategy from boost's answer:

Option Explicit

Const adSaveCreateNotExist = 1
Const adSaveCreateOverWrite = 2
Const adTypeBinary = 1
Const adTypeText   = 2

Dim objStreamUTF8      : Set objStreamUTF8      = CreateObject("ADODB.Stream")
Dim objStreamUTF8NoBOM : Set objStreamUTF8NoBOM = CreateObject("ADODB.Stream")

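With objStreamUTF8
  .Charset = "UTF-8"
  .Open
  .WriteText "aÄö"                                  ' 1 + 2 + 2 = 5 bytes in UTF-8
  .Position = 0
  .SaveToFile "toto.php", adSaveCreateOverWrite     ' for comparison: BOM + text = 8 bytes
  .Type     = adTypeText
  .Position = 3                                     ' skip the 3-byte UTF-8 BOM (EF BB BF)
End With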

With objStreamUTF8NoBOM
  .Type    = adTypeBinary
  .Open
  objStreamUTF8.CopyTo objStreamUTF8NoBOM
  .SaveToFile "toto-nobom.php", adSaveCreateOverWrite
End With

objStreamUTF8.Close
objStreamUTF8NoBOM.Close
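
For use from VBA, the same two-stream strategy can be wrapped in a reusable procedure. The following is a minimal sketch, not part of the original answer: the name `SaveUtf8NoBom` and its parameters are hypothetical, and it assumes ADODB.Stream always writes a 3-byte BOM in front of UTF-8 text, as the script above does. It has not been checked against empty input.

```vba
Option Explicit

' Hypothetical helper (name and signature are illustrative, not from the answer).
' Writes text to filePath as UTF-8 and drops the 3-byte BOM.
Public Sub SaveUtf8NoBom(ByVal filePath As String, ByVal text As String)
    Const adTypeBinary = 1
    Const adTypeText = 2
    Const adSaveCreateOverWrite = 2

    Dim src As Object
    Dim dst As Object
    Set src = CreateObject("ADODB.Stream")
    Set dst = CreateObject("ADODB.Stream")

    ' Encode the text as UTF-8; the stream now starts with EF BB BF.
    src.Type = adTypeText
    src.Charset = "UTF-8"
    src.Open
    src.WriteText text

    ' Position counts bytes, so 3 points just past the BOM.
    src.Position = 3

    ' Copy everything after the BOM into a binary stream and save that.
    dst.Type = adTypeBinary
    dst.Open
    src.CopyTo dst
    dst.SaveToFile filePath, adSaveCreateOverWrite

    dst.Close
    src.Close
End Sub
```

A call such as `SaveUtf8NoBom "toto-nobom.php", "aÄö"` would then correspond to the second file produced by the script above.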

Evidence:

chcp
Active code page: 65001

dir
 ...
15.07.2015  18:48                 5 toto-nobom.php
15.07.2015  18:48                 8 toto.php

type toto-nobom.php
aÄö
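
The file sizes above (8 bytes versus 5) already show the 3-byte difference. The same check can be done from VBA by reading the first three bytes of a file. This is a small sketch; the function name `HasUtf8Bom` is hypothetical and not part of the original answer.

```vba
Option Explicit

' Hypothetical helper: True if the file starts with the UTF-8 BOM (EF BB BF).
Public Function HasUtf8Bom(ByVal filePath As String) As Boolean
    Dim hnd As Integer
    Dim head(0 To 2) As Byte

    hnd = FreeFile
    Open filePath For Binary Access Read As #hnd

    ' Files shorter than 3 bytes cannot contain the BOM.
    If LOF(hnd) >= 3 Then
        Get #hnd, , head    ' fills the fixed-size array with the first 3 bytes
        HasUtf8Bom = (head(0) = &HEF And head(1) = &HBB And head(2) = &HBF)
    End If

    Close #hnd
End Function
```

Given the listing above, `HasUtf8Bom("toto.php")` would be expected to report a BOM and `HasUtf8Bom("toto-nobom.php")` would not.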

Answered by Nigel Heffernan

I knew that the Scripting File System Object's stream inserted a Byte Order Mark, but I haven't seen that with the ADODB Stream.

Or at least, not yet: I rarely use the ADODB stream object...

But I do remember putting this remark into some code a few years ago:

'   ****   WHY THIS IS COMMENTED OUT   **** **** **** **** **** **** **** ****
'
'   Microsoft ODBC and OLEDB database drivers cannot read the field names from
'   the header when a unicode byte order mark (&HFF & &HFE) is inserted at the
'   start of the text by Scripting.FileSystemObject 'Write' methods. Trying to
'   work around this by writing byte arrays will fail; FSO 'Write' detects the
'   string encoding automatically, and won't let you hack around it by writing
'   the header as UTF-8 (or 'Narrow' string) and appending the rest as unicode
'
'   (Yes, I tried some revolting hacks to get around it: don't *ever* do that)
'
'   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
'
'    With FSO.OpenTextFile(FilePath, ForWriting, True, TristateTrue)
'        .Write Join(arrTemp1, EOROW)
'        .Close
'    End With ' textstream object from objFSO.OpenTextFile
'
'   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****

You can tell I was having a bad day.

Next, using prehistoric PUT commands from the days before file-handling had emerged from the primordial C:

'   **** WHY WE 'PUT' A BYTE ARRAY INSTEAD OF A VBA STRING VARIABLE  **** ****
'
'       Put #hndFile, , StrConv(Join(arrTemp1, EOROW), vbUnicode)
'       Put #hndFile, , Join(arrTemp1, EOROW)
'
'   If you pass unicode, Wide or UTF-16 string variables to PUT, it prepends a
'   Unicode Byte Order Mark to the data which, when written to your file, will
'   render the field names illegible to Microsoft's JET ODBC and ACE-OLEDB SQL
'   drivers (which can actually read unicode field names, if the helpful label
'   isn't in the way). However, the 'PUT' statement writes a Byte array as-is
'
'   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****

So there's the code that actually does it:

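Dim arrByte() As Byte
Dim strText   As String
Dim hndFile   As Integer    ' FreeFile returns an Integer file number

' FilePath is assumed to be defined elsewhere
strText = "Y'all knew that strings are actually byte arrays?"
arrByte = strText           ' copies the string's bytes as-is (2 bytes per character in VBA)

hndFile = FreeFile
Open FilePath For Binary As #hndFile

Put #hndFile, , arrByte     ' a Byte array is written as-is, with no BOM
Close #hndFile

Erase arrByte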

I'm assuming that strText is actually UTF-8. I mean, we're in VBA, in Microsoft Office, and we absolutely know that this is always going to be UTF-8, even if we use it in a foreign country...

...Right?

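
In case the assumption does not hold (VBA strings are stored as UTF-16, so `arrByte = strText` yields two bytes per character), one option is to let ADODB.Stream do the UTF-8 encoding and then `Put` the resulting bytes. The sketch below combines the two answers; the names `Utf8Bytes` and `PutUtf8` are hypothetical, and it assumes non-empty text, since `Read` returns Null when nothing is left in the stream.

```vba
Option Explicit

' Hypothetical helper: UTF-8 bytes of a VBA string, without the BOM.
Public Function Utf8Bytes(ByVal text As String) As Byte()
    Const adTypeBinary = 1
    Const adTypeText = 2

    With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Charset = "UTF-8"
        .Open
        .WriteText text
        .Position = 0           ' Type can only be changed at the start of the stream
        .Type = adTypeBinary
        .Position = 3           ' skip the 3-byte BOM (EF BB BF)
        Utf8Bytes = .Read       ' assumes at least one byte remains after the BOM
        .Close
    End With
End Function

' Hypothetical helper: writes the UTF-8 bytes with Put, as in the code above.
Public Sub PutUtf8(ByVal filePath As String, ByVal text As String)
    Dim arrByte() As Byte
    Dim hndFile As Integer

    arrByte = Utf8Bytes(text)

    ' Open ... For Binary does not truncate, so remove any existing file first.
    If Len(Dir$(filePath)) > 0 Then Kill filePath

    hndFile = FreeFile
    Open filePath For Binary Access Write As #hndFile
    Put #hndFile, , arrByte     ' a Byte array is written as-is
    Close #hndFile
End Sub
```

Doing the conversion explicitly means the output no longer depends on what encoding the string happens to be in.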