Original question: http://stackoverflow.com/questions/13130214/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to convert a batch file stored in utf-8 to something that works via another batch file and run it
Asked by C.O.
I have a program I use to create a batch file. My problem is that the program's output is UTF-8, so as soon as any characters with diacritical marks like é, à, ?, ? are in my batch file, it fails. I can't figure out a way to make the program that creates the batch file output anything but UTF-8.
So I was thinking of creating two batch files: the actual one, and another that converts the actual one from UTF-8 to ANSI (Windows code page 1252, or maybe cp 850) and then executes it. Of course I'd add a chcp xxxx as the first command of the actual batch file.
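To illustrate why the choice between code page 1252 and cp 850 matters, here is a minimal Python sketch. Python is used purely for illustration; it is not built into Windows, and the helper name `hex_bytes` is hypothetical. It shows that the same characters encode to different bytes in the two code pages, so the chcp value at the top of the converted batch file would have to match whichever code page the conversion targeted.

```python
# Illustration only: the same characters map to different bytes in
# Windows code page 1252 and in OEM code page 850.

def hex_bytes(text, codepage):
    """Return the bytes of `text` in `codepage` as space-separated uppercase hex."""
    return text.encode(codepage).hex(" ").upper()

# U+00E9 (e acute), U+00E0 (a grave), U+00FC (u diaeresis)
for ch in "\u00e9\u00e0\u00fc":
    print(
        f"U+{ord(ch):04X}",
        "cp1252=" + hex_bytes(ch, "cp1252"),
        "cp850=" + hex_bytes(ch, "cp850"),
    )
```

A batch file converted to cp1252 but run under chcp 850 (or the reverse) would therefore still show the wrong characters.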
So my question is: is there an alternative to iconv on Windows, or how does one convert a UTF-8 text file to a Windows code page using a second batch file? Is there anything built into Win XP and up that I could use, or is there a free and redistributable tool I might use for this?
Note:
chcp 65001
does not work for batch files.
EDIT 1:
On Windows XP I created two batch files to test the first answer.
1.bat, encoded as UTF-8 without BOM, contains:
chcp 1252
cd ü??
2.bat, also encoded as UTF-8 without BOM but without any special characters, contains:
chcp 1252
type "1.bat" >"ansi_file.bat"
The resulting ansi_file.bat, created when one executes 2.bat, is still UTF-8 encoded and not ANSI encoded.
EDIT 2:
The mentioned reverse process works.
chcp 1252
echo ü > ansi.txt
cmd /u /c type ansi.txt > unicode.txt
but neither of the following subsequent lines
cmd /a /c type unicode.txt > back2ansi.txt
type unicode.txt > back2ansi_v2.txt
gets me back to ANSI. I tried this both on Win XP and Win 7. Can anyone help?
NOTE:
I'm aware of how to use the Windows Script Host and VBS. I'd like to avoid depending on the script host though. The VBS method is detailed here: http://msdn.microsoft.com/en-us/library/windows/desktop/aa368046%28v=vs.85%29.aspx
EDIT 3:
The text file created above, containing a Unicode ü, is not UTF-8.
The Windows Unicode file is, in hex:
FC 00 20 00 0D 00 0A 00
UTF-8 without BOM would be, in hex:
C3 BC 20 0D 0A
The VBS solution linked to only works with the Unicode form but fails on the UTF-8 form. I need to convert UTF-8 to another code page, so not even that one seems to work for me...
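The byte sequences quoted in EDIT 3 can be reproduced with a short Python sketch. It is offered only as an illustration of the three encodings involved, not as something available on a stock Windows install, and the helper name `hex_bytes` is hypothetical. It encodes the text written by `echo ü > file` (the character, a space, and CRLF) as UTF-16 little-endian without BOM, as UTF-8, and as code page 1252.

```python
# Illustration only: the same four characters in three encodings.
text = "\u00fc \r\n"  # U+00FC (u diaeresis), space, carriage return, line feed

def hex_bytes(data):
    """Format a bytes object as space-separated uppercase hex."""
    return data.hex(" ").upper()

utf16le = hex_bytes(text.encode("utf-16-le"))  # the "Windows Unicode" form, no BOM
utf8 = hex_bytes(text.encode("utf-8"))         # what the generating program emits
ansi = hex_bytes(text.encode("cp1252"))        # the form the batch file needs

print("UTF-16LE:", utf16le)  # FC 00 20 00 0D 00 0A 00
print("UTF-8:   ", utf8)     # C3 BC 20 0D 0A
print("cp1252:  ", ansi)     # FC 20 0D 0A
```

The UTF-16 form is byte-for-byte different from the UTF-8 form, which is consistent with a tool that expects one failing on the other.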
Answered by dbenham
You have stated you don't want to rely on the script host, but there is no native batch command that can do what you want. You are going to have to use something beyond pure batch. The script host is native to Windows, so I should think it would not be a problem.
The following UTF8toANSI.vbs script converts UTF-8 (with or without BOM) into ISO-8859-1 (basically the same as code page 1252). It is adapted from "VB6/VbScsript change file / write file with encoding to ansii".
Option Explicit

' ADODB.Stream constants (not predefined in VBScript)
Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText
    With CreateObject("ADODB.Stream")
        .Open
        ' Load the raw bytes, then reinterpret them as UTF-8 text
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        ' Rewind and truncate the stream, then write the text back as ISO-8859-1
        .Position = 0
        .SetEOS
        .Charset = "iso-8859-1"
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub

' Arguments: input file (UTF-8), output file (ANSI)
UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)
The VBS script would need to be in your current directory or your path.
A batch script to convert and run your UTF8 encoded script could look something like this:
@echo off
UTF8toANSI "utf8.bat" "ansi.bat"
ansi.bat
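For comparison, the same decode-then-re-encode step can be sketched in Python. This is not the answer's method and it adds a dependency that Windows does not ship with; the function name `utf8_to_ansi` and its parameters are hypothetical. Unlike the VBS script, this sketch targets cp1252 rather than ISO-8859-1. The two agree except in the 0x80 to 0x9F range, where cp1252 defines printable characters such as the euro sign.

```python
def utf8_to_ansi(src_path, dst_path, codepage="cp1252"):
    """Re-encode a UTF-8 text file (with or without BOM) into `codepage`."""
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    # newline="" leaves the existing CRLF line endings untouched.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Characters that do not exist in the target code page raise
    # UnicodeEncodeError instead of being silently replaced.
    with open(dst_path, "w", encoding=codepage, newline="") as dst:
        dst.write(text)


if __name__ == "__main__":
    import sys
    # Assumed usage: python utf8_to_ansi.py utf8.bat ansi.bat
    if len(sys.argv) == 3:
        utf8_to_ansi(sys.argv[1], sys.argv[2])
```

Failing loudly on unmappable characters is a deliberate choice in this sketch: a batch file with a silently substituted character in a path would fail later in a less obvious way.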
Original Answer: below is my original answer, which works for UTF-16 with BOM but not for UTF-8.
The output of internal commands is automatically converted to ANSI if output is piped or redirected to a file.
chcp 1252
type "utf_file.bat" >"ansi_file.bat"
The process can go in reverse if CMD is started with the /U option, but unfortunately the Unicode header bytes will be missing. But of course that is a non-issue for your situation.
Answered by Kenneth Yrke Joergensen
In Unix I would use the "iconv" tool for converting between encodings:
iconv --from-code UTF-8 --to-code iso-8859-1 -c inputfile > outputfile
It seems a build for Windows is available at http://gnuwin32.sourceforge.net/packages/libiconv.htm