Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/20486551/

Date: 2020-10-27 18:36:23  Source: igfitidea

JavaScript function to convert UTF8 string between fullwidth and halfwidth forms

Tags: javascript, encoding, utf-8, character, converter

Asked by xpt

EDIT: Thanks to GOTO 0, I now know exactly what my question is called.

I need a JavaScript function to convert from UTF-8 fullwidth form to halfwidth form.

Answered by GOTO 0

Apparently, you want to convert halfwidth and fullwidth form characters to their equivalent Basic Latin forms. If so, you can do the replacement with a regular expression. Something like this should work:

var x = "！ａｂｃ ＡＢＣ！"; // fullwidth input (U+FF01..U+FF5E)
var y = x.replace(
    /[\uff01-\uff5e]/g,
    function(ch) { return String.fromCharCode(ch.charCodeAt(0) - 0xfee0); }
    );

Where x is your input string and y is the output.
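
The constant 0xfee0 in the snippet above is the distance between the fullwidth block and Basic Latin: U+FF01 (fullwidth exclamation mark) minus U+0021. A minimal sketch that checks this and wraps the replacement in a function; the names OFFSET and fullwidthToBasicLatin are only illustrative and are not part of the answer:

```javascript
// Offset between FULLWIDTH EXCLAMATION MARK (U+FF01) and "!" (U+0021).
const OFFSET = 0xFF01 - 0x21;

// Hypothetical helper name; wraps the regex replacement from the answer.
function fullwidthToBasicLatin(input) {
  return input.replace(/[\uFF01-\uFF5E]/g, ch =>
    String.fromCharCode(ch.charCodeAt(0) - OFFSET));
}

console.log(OFFSET.toString(16));                           // "fee0"
console.log(fullwidthToBasicLatin('\uFF21\uFF22\uFF23'));   // "ABC"
```

Characters outside U+FF01..U+FF5E, including the ideographic space U+3000, pass through unchanged.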

Answered by 7vujy0f0hy

Year 2018 answer

Many years later – and it's still impossible to find on the Internet a function that does this. So I wrote mine. (Nearly learned Japanese and Korean to get to this point.)

Simple version

Latin range only.

var shiftCharCode = Δ => c => String.fromCharCode(c.charCodeAt(0) + Δ);
var toFullWidth = str => str.replace(/[!-~]/g, shiftCharCode(0xFEE0));
var toHalfWidth = str => str.replace(/[\uFF01-\uFF5E]/g, shiftCharCode(-0xFEE0));

Complete version

Let me know if I missed any character.

(function () {
    let charsets = {
        latin: {halfRE: /[!-~]/g, fullRE: /[\uFF01-\uFF5E]/g, delta: 0xFEE0},
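        // Range endpoints reconstructed (assumption): the halfwidth Hangul ranges
        // U+FFA1..U+FFBE and U+FFC2..U+FFDC, shifted by each delta.
        hangul1: {halfRE: /[\uFFA1-\uFFBE]/g, fullRE: /[\u11A8-\u11C5]/g, delta: -0xEDF9},
        hangul2: {halfRE: /[\uFFC2-\uFFDC]/g, fullRE: /[\u1161-\u117B]/g, delta: -0xEE61},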
        kana: {delta: 0,
            half: "｡｢｣､･ｦｧｨｩｪｫｬｭｮｯｰｱｲｳｴｵｶｷｸｹｺｻｼｽｾｿﾀﾁﾂﾃﾄﾅﾆﾇﾈﾉﾊﾋﾌﾍﾎﾏﾐﾑﾒﾓﾔﾕﾖﾗﾘﾙﾚﾛﾜﾝﾞﾟ", 
            full: "。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシ" + 
                "スセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゛゜"},
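        extras: {delta: 0,
            // Narrow forms: U+00A2 etc., won sign, space, then U+FFE8..U+FFEE.
            half: "\u00A2\u00A3\u00AC\u00AF\u00A6\u00A5\u20A9\u0020\uFFE8\uFFE9\uFFEA\uFFEB\uFFEC\uFFED\uFFEE", 
            // Wide counterparts: U+FFE0..U+FFE6, U+3000, then box line, arrows, square, circle.
            full: "\uFFE0\uFFE1\uFFE2\uFFE3\uFFE4\uFFE5\uFFE6\u3000\u2502\u2190\u2191\u2192\u2193\u25A0\u25CB"}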
    };
    let toFull = set => c => set.delta ? 
        String.fromCharCode(c.charCodeAt(0) + set.delta) : 
        [...set.full][[...set.half].indexOf(c)];
    let toHalf = set => c => set.delta ? 
        String.fromCharCode(c.charCodeAt(0) - set.delta) : 
        [...set.half][[...set.full].indexOf(c)];
    let re = (set, way) => set[way + "RE"] || new RegExp("[" + set[way] + "]", "g");
    let sets = Object.keys(charsets).map(i => charsets[i]);
    window.toFullWidth = str0 => 
        sets.reduce((str,set) => str.replace(re(set, "half"), toFull(set)), str0);
    window.toHalfWidth = str0 => 
        sets.reduce((str,set) => str.replace(re(set, "full"), toHalf(set)), str0);
})();

/* Example starts here: */
var set = prompt("Enter a couple of comma-separated strings (half or full-width):", 
    ["aouäöü123", "'\"?:", "¢£¥₩↑→", "ｺﾝﾆﾁﾊ", "ﾡﾤￂￓ"].join()).split(",");
var steps = [set, set.map(toFullWidth), set.map(toFullWidth).map(toHalfWidth)];
var tdHTML = str => `<td>${str}</td>`;
var stepsHTML = steps.map(step => step.map(tdHTML).join(""));
var rows = document.getElementsByTagName("tr");
[...rows].forEach((row,i) => row.insertAdjacentHTML("beforeEnd", stepsHTML[i]));
th, td {border: 1px solid lightgrey; padding: 0.2em;}
th {text-align: left;}
table {border-collapse: collapse;}
<table>
    <tr><th scope="row">Input:</th></tr>
    <tr><th scope="row">Full-width:</th></tr>
    <tr><th scope="row">Half-width:</th></tr>
</table>
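
One case the table above does not cover: because ﾞ and ﾟ are mapped to the standalone marks ゛ and ゜, a halfwidth pair such as ｶﾞ comes out of toFullWidth as two characters (カ゛) rather than the single character ガ. For the halfwidth-to-fullwidth kana direction and the fullwidth-to-ASCII direction, a rough alternative is the built-in String.prototype.normalize with the NFKC form. This is a different technique from the answer's lookup tables, and the helper name normalizeWidth is only illustrative:

```javascript
// Sketch only: relies on the built-in String.prototype.normalize.
// NFKC also rewrites other compatibility characters (for example "①" becomes "1"),
// so it is broader than a pure width conversion.
function normalizeWidth(str) {
  return str.normalize('NFKC');
}

console.log(normalizeWidth('\uFF76\uFF9E'));              // "ガ" (halfwidth KA + voiced mark)
console.log(normalizeWidth('\uFF21\uFF22\uFF23\uFF11'));  // "ABC1"
console.log(normalizeWidth('\u3000') === ' ');            // true
```

Note that NFKC only goes one way for each script: it narrows fullwidth Latin and widens halfwidth kana, so it cannot replace toFullWidth for Latin text or toHalfWidth for kana.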

Answered by Rezigned

Try this

function toASCII(chars) {
    var ascii = '';
    for(var i=0, l=chars.length; i<l; i++) {
        var c = chars[i].charCodeAt(0);

        // only shift the fullwidth ASCII variants (U+FF01..U+FF5E);
        // other characters in the block do not map to ASCII this way
        if (c >= 0xFF01 && c <= 0xFF5E) {
           c = 0xFF & (c + 0x20);
        }

        ascii += String.fromCharCode(c);
    }

    return ascii;
}

// example
toASCII("ＡＢＣ"); // returns 'ABC' (first char code 0x41)
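
The masking expression 0xFF & (c + 0x20) gives the same result as subtracting 0xFEE0, but only for code units in U+FF01..U+FF5E. A short check of that arithmetic, written as a standalone sketch with illustrative variable names:

```javascript
// Compare the two formulas for every code unit in U+FF01..U+FF5E.
let mismatches = 0;
for (let c = 0xFF01; c <= 0xFF5E; c++) {
  const bySubtraction = c - 0xFEE0;
  const byMask = 0xFF & (c + 0x20);
  if (bySubtraction !== byMask) mismatches++;
}
console.log(mismatches); // 0

// Outside that range the mask no longer yields a meaningful character:
const halfwidthKa = 0xFF76; // HALFWIDTH KATAKANA LETTER KA
const masked = 0xFF & (halfwidthKa + 0x20);
console.log(masked.toString(16)); // "96"
```

This is why the range test in front of the mask matters: applied to halfwidth katakana or the fullwidth currency signs, the mask produces unrelated Latin-1 code points.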

Answered by Peter Chen

GOTO 0's answer is very useful, but I also need to convert the space from fullwidth to halfwidth.

So below is my code:

const halfwidthValue = value
      .replace(/[\uff01-\uff5e]/g, fullwidthChar => String.fromCharCode(fullwidthChar.charCodeAt(0) - 0xfee0))
      .replace(/\u3000/g, '\u0020');
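
The snippet above assumes a variable named value already exists. A self-contained sketch of the same two replacements wrapped in a function (the name fullToHalf is hypothetical):

```javascript
// Hypothetical wrapper around the two replacements from this answer.
function fullToHalf(value) {
  return value
    // Fullwidth ASCII variants U+FF01..U+FF5E sit 0xFEE0 above their ASCII forms.
    .replace(/[\uFF01-\uFF5E]/g, ch => String.fromCharCode(ch.charCodeAt(0) - 0xFEE0))
    // The ideographic space U+3000 does not follow that offset, so map it separately.
    .replace(/\u3000/g, '\u0020');
}

console.log(fullToHalf('\uFF28\uFF49\u3000\uFF01')); // "Hi !"
```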

Answered by Lav Shinde

The given solutions do not work for all cases of fullwidth-to-halfwidth conversion of kana (e.g. デジタル is not converted properly). I have written a function for converting Zenkaku (fullwidth) katakana to Hankaku (halfwidth) katakana. Hope it helps.

function convertToHalfWidth(string) {
  let characters = getCharacters(string);
  let halfWidthString = ''
  characters.forEach(character => {
    halfWidthString += mapToHankaku(character);
  });
  return halfWidthString;
}

function getCharacters(string) {
   return string.split("");
}

function mapToHankaku(character) {
  let zenHanMap = getZenkakuToHankakuMap();
  if (typeof zenHanMap[character] === 'undefined') {
    return character;
  } else {
    return zenHanMap[character];
  }
}

function getZenkakuToHankakuMap() {
  // Voiced and semi-voiced kana map to a base letter plus ﾞ (U+FF9E) or ﾟ (U+FF9F).
  let zenHanMap = {
    'ァ': 'ｧ',
    'ア': 'ｱ',
    'ィ': 'ｨ',
    'イ': 'ｲ',
    'ゥ': 'ｩ',
    'ウ': 'ｳ',
    'ェ': 'ｪ',
    'エ': 'ｴ',
    'ォ': 'ｫ',
    'オ': 'ｵ',
    'カ': 'ｶ',
    'ガ': 'ｶﾞ',
    'キ': 'ｷ',
    'ギ': 'ｷﾞ',
    'ク': 'ｸ',
    'グ': 'ｸﾞ',
    'ケ': 'ｹ',
    'ゲ': 'ｹﾞ',
    'コ': 'ｺ',
    'ゴ': 'ｺﾞ',
    'サ': 'ｻ',
    'ザ': 'ｻﾞ',
    'シ': 'ｼ',
    'ジ': 'ｼﾞ',
    'ス': 'ｽ',
    'ズ': 'ｽﾞ',
    'セ': 'ｾ',
    'ゼ': 'ｾﾞ',
    'ソ': 'ｿ',
    'ゾ': 'ｿﾞ',
    'タ': 'ﾀ',
    'ダ': 'ﾀﾞ',
    'チ': 'ﾁ',
    'ヂ': 'ﾁﾞ',
    'ッ': 'ｯ',
    'ツ': 'ﾂ',
    'ヅ': 'ﾂﾞ',
    'テ': 'ﾃ',
    'デ': 'ﾃﾞ',
    'ト': 'ﾄ',
    'ド': 'ﾄﾞ',
    'ナ': 'ﾅ',
    'ニ': 'ﾆ',
    'ヌ': 'ﾇ',
    'ネ': 'ﾈ',
    'ノ': 'ﾉ',
    'ハ': 'ﾊ',
    'バ': 'ﾊﾞ',
    'パ': 'ﾊﾟ',
    'ヒ': 'ﾋ',
    'ビ': 'ﾋﾞ',
    'ピ': 'ﾋﾟ',
    'フ': 'ﾌ',
    'ブ': 'ﾌﾞ',
    'プ': 'ﾌﾟ',
    'ヘ': 'ﾍ',
    'ベ': 'ﾍﾞ',
    'ペ': 'ﾍﾟ',
    'ホ': 'ﾎ',
    'ボ': 'ﾎﾞ',
    'ポ': 'ﾎﾟ',
    'マ': 'ﾏ',
    'ミ': 'ﾐ',
    'ム': 'ﾑ',
    'メ': 'ﾒ',
    'モ': 'ﾓ',
    'ャ': 'ｬ',
    'ヤ': 'ﾔ',
    'ュ': 'ｭ',
    'ユ': 'ﾕ',
    'ョ': 'ｮ',
    'ヨ': 'ﾖ',
    'ラ': 'ﾗ',
    'リ': 'ﾘ',
    'ル': 'ﾙ',
    'レ': 'ﾚ',
    'ロ': 'ﾛ',
    // 'ヮ': '',  // no halfwidth form; an empty mapping would delete the character
    'ワ': 'ﾜ',
    // 'ヰ': '',
    // 'ヱ': '',
    'ヲ': 'ｦ',
    'ン': 'ﾝ',
    'ヴ': 'ｳﾞ',
    // 'ヵ': '',
    // 'ヶ': '',
    // 'ヷ': '',
    // 'ヸ': '',
    // 'ヹ': '',
    // 'ヺ': '',
    '・': '･',
    'ー': 'ｰ',
    // 'ヽ': '',
    // 'ヾ': '',
    // 'ヿ': '',
  };
  return zenHanMap;
}

Use it as follows: convertToHalfWidth('デジタル');

You can pass the result of this function to the function given by GOTO 0 to get the complete halfwidth result for Japanese text.
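
The chaining described here (kana first, then the fullwidth Latin range) can be sketched as follows. To stay short, this example does not reuse the hand-written table; it derives an equivalent lookup from String.prototype.normalize('NFKC'), which is an assumption of this sketch, and all function names are hypothetical:

```javascript
// Build a fullwidth -> halfwidth katakana lookup by asking NFKC what each
// halfwidth character (U+FF61..U+FF9F) widens to. Sketch only.
function buildKanaMap() {
  const map = new Map();
  const marks = ['\uFF9E', '\uFF9F']; // halfwidth voiced / semi-voiced sound marks
  for (let cp = 0xFF61; cp <= 0xFF9F; cp++) {
    const half = String.fromCharCode(cp);
    map.set(half.normalize('NFKC'), half);
    for (const mark of marks) {
      const composed = (half + mark).normalize('NFKC');
      // Keep only pairs that compose to a single character (KA + voiced mark -> GA).
      if (composed.length === 1) map.set(composed, half + mark);
    }
  }
  return map;
}

const KANA_MAP = buildKanaMap();

// Step 1: fullwidth katakana to halfwidth katakana.
function kanaToHalfWidth(text) {
  return Array.from(text, ch => (KANA_MAP.has(ch) ? KANA_MAP.get(ch) : ch)).join('');
}

// Step 2: fullwidth ASCII variants to ASCII, as in GOTO 0's answer.
function latinToHalfWidth(text) {
  return text.replace(/[\uFF01-\uFF5E]/g,
    ch => String.fromCharCode(ch.charCodeAt(0) - 0xFEE0));
}

function toHalfWidthJapanese(text) {
  return latinToHalfWidth(kanaToHalfWidth(text));
}

console.log(toHalfWidthJapanese('デジタル\uFF11\uFF12\uFF13')); // "ﾃﾞｼﾞﾀﾙ123"
```

Building the lookup once, outside the per-character function, avoids recreating the table for every character.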

Reference: https://en.wikipedia.org/wiki/Katakana#Unicode
