Original source: http://stackoverflow.com/questions/8142826/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
How to convert Unicode file to ASCII file in perl script on windows machine
Asked by ashokbabuy
I have a file in Unicode format on a Windows machine. Is there any way to convert it to ASCII format on that machine using a Perl script?
It's UTF-16 with a BOM.
Answered by Karsten S.
If you want to convert Unicode to ASCII, you must be aware that some characters can't be converted, because they simply don't exist in ASCII. If you can live with that, you can try this:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use open IN => ':encoding(UTF-16)';
use open OUT => ':encoding(ascii)';
my $buffer;
open(my $ifh, '<', 'utf16bom.txt');
read($ifh, $buffer, -s $ifh);
close($ifh);
open(my $ofh, '>', 'ascii.txt');
print($ofh $buffer);
close($ofh);
If you do not have autodie, just remove that line. You should then add an explicit error check to each open/close statement, for example:
open(...) or die "error: $!\n";
If you have characters that can't be converted, you will get warnings on the console and your output file will contain text such as
\x{00e4}\x{00f6}\x{00fc}\x{00df}
in it. BTW: If you don't have a BOM but know the file is big-endian (or little-endian), you can change the encoding line to
use open IN => ':encoding(UTF-16BE)';
or
use open IN => ':encoding(UTF-16LE)';
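If you are not sure whether a given file carries a BOM, one option is to peek at its first two bytes before choosing the layer. The following is only a rough sketch: the helper name utf16_layer_for and its "fall back to the caller's default" behaviour are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'UTF-16' when the file starts with a UTF-16 BOM
# (so the decoder can pick the byte order itself), otherwise returns the
# caller-supplied default such as 'UTF-16BE' or 'UTF-16LE'.
sub utf16_layer_for {
    my ($path, $default) = @_;

    # Read the first two bytes without any decoding.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    my $got = read($fh, my $head, 2);
    close($fh);

    my $has_bom = defined $got && $got == 2
        && ($head eq "\xFE\xFF" || $head eq "\xFF\xFE");
    return $has_bom ? 'UTF-16' : $default;
}
```

The returned name can then be placed in the layer string, e.g. "<:encoding($name)". Note that this check only looks for the two-byte UTF-16 marks and does not try to guess the byte order of a BOM-less file.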
Hope it works under Windows as well. I can't give it a try right now.
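If the \x{....} escapes in the output are not acceptable, a different approach (not the one used in the script above) is to decompose accented letters and drop the accents before encoding. This is a minimal sketch using the core modules Unicode::Normalize and Encode; the function name to_ascii_lossy is made up for illustration, and the result is lossy by design: characters with no ASCII base letter come out as "?".

```perl
use strict;
use warnings;
use Encode qw(encode);
use Unicode::Normalize qw(NFKD);

# Hypothetical helper: best-effort, lossy conversion of a decoded string.
sub to_ascii_lossy {
    my ($text) = @_;

    # Split accented letters into base letter + combining mark.
    my $decomposed = NFKD($text);

    # Drop the combining marks, leaving only the base letters.
    $decomposed =~ s/\p{Mn}+//g;

    # With no CHECK argument, encode() substitutes "?" for anything
    # that still has no ASCII equivalent.
    return encode('ascii', $decomposed);
}

print to_ascii_lossy("na\x{ef}ve caf\x{e9}\n");   # prints "naive cafe"
```

You would call this on each line after reading it through the UTF-16 input layer, and write the result to a filehandle opened without an encoding layer.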
Answered by David W.
Take a look at the encoding option of the Perl open command. You can specify the encoding when opening a file for reading or writing:
Something like this should work:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
# Decode the input as UTF-16BE; use "UTF-16" instead if the file starts with a BOM.
open (my $utf16_fh, "<:encoding(UTF-16BE)", "test.utf16.txt");
# Encode everything written to the output file as ASCII.
open (my $ascii_fh, ">:encoding(ASCII)", "test.ascii.txt");
while (my $line = <$utf16_fh>) {
    print $ascii_fh $line;
}
close $utf16_fh;
close $ascii_fh;
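On Windows, Perl's default :crlf layer works on raw bytes, which can interfere with two-byte UTF-16 line endings. A commonly used layer order for that case is :raw, then :encoding, then :crlf. The sketch below wraps the conversion in a helper under that assumption, with explicit error checks instead of autodie; the name utf16_to_ascii and the command-line handling are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a BOM-marked UTF-16 file to an ASCII file.
sub utf16_to_ascii {
    my ($in_path, $out_path) = @_;

    # :raw removes the platform's default layers, :encoding(UTF-16) decodes
    # the bytes (the BOM selects the byte order), and :crlf then translates
    # CRLF to "\n" on the decoded characters.
    open(my $in, '<:raw:encoding(UTF-16):crlf', $in_path)
        or die "Cannot open $in_path: $!\n";
    open(my $out, '>:encoding(ascii)', $out_path)
        or die "Cannot open $out_path: $!\n";

    while (my $line = <$in>) {
        print {$out} $line;
    }

    close($in)  or die "Cannot close $in_path: $!\n";
    close($out) or die "Cannot close $out_path: $!\n";
    return;
}

# Run as: perl convert.pl input.txt output.txt
utf16_to_ascii(@ARGV) if @ARGV == 2;
```

The script name convert.pl is just a placeholder. Characters outside ASCII are still handled the way the first answer describes, with a warning and an escape sequence in the output.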