bash 如何在 linux 中重新添加 unicode 字节顺序标记？

Question

提问by Neil Trodden

I have a rather large SQL file which starts with the byte order marker of FFFE. I have split this file using the unicode aware linux split tool into 100,000 line chunks. But when passing these back to windows, it does notlike any of the parts other than the first one as only it has the FFFE byte order marker on.

我有一个相当大的 SQL 文件，它以 FFFE 的字节顺序标记开头。我已使用 unicode 感知 linux 拆分工具将此文件拆分为 100,000 行块。但是经过这些回窗口时，它不喜欢任何不是只它对FFFE字节顺序标记的第一个其他的部件。

How can I add this two byte code using echo (or any other bash command)?

如何使用 echo（或任何其他 bash 命令）添加这两个字节的代码？

Answer 1

采纳答案by Matthew Flaschen

Something like (backup first)):

类似（先备份））：

for i in $(ls *.sql)
do
  cp "$i" "$i.temp"
  printf '\xFF\xFE' > "$i"
  cat "$i.temp" >> "$i"
  rm "$i.temp"
done

Answer 2

回答by brillout

Based on sed's solution of Anonymous, sed -i '1s/^/\xef\xbb\xbf/' fooadds the BOM to the UTF-8 encoded file foo. Usefull is that it also converts ASCII files to UTF8 with BOM

基于 sed 的Anonymous 解决方案，sed -i '1s/^/\xef\xbb\xbf/' foo将 BOM 添加到 UTF-8 编码文件中foo。有用的是它还可以使用 BOM 将 ASCII 文件转换为 UTF8

Answer 3

回答by yingted

To add BOMs to the all the files that start with "foo-", you can use sed. sedhas an option to make a backup.

要将 BOM 添加到所有以“foo-”开头的文件中，您可以使用sed. sed可以选择进行备份。

sed -i '1s/^\(\xff\xfe\)\?/\xff\xfe/' foo-*

straceing this shows sed creates a temp file with a name starting with "sed". If you know for sure there is no BOM already, you can simplify the command:

strace这表明 sed 创建了一个名称以“sed”开头的临时文件。如果您确定已经没有 BOM，您可以简化命令：

sed -i '1s/^/\xff\xfe/' foo-*

Make sure you need to set UTF-16, because i.e. UTF-8 is different.

确保你需要设置 UTF-16，因为 ie UTF-8 是不同的。

Answer 4

回答by andrewdotn

For a general-purpose solution—something that sets the correct byte-order mark regardless of whether the file is UTF-8, UTF-16, or UTF-32—I would use vim's 'bomb'option:

对于通用解决方案——无论文件是 UTF-8、UTF-16 还是 UTF-32，都设置正确的字节顺序标记——我会使用 vim 的'bomb'选项：

$ echo 'hello' > foo
$ xxd < foo
0000000: 6865 6c6c 6f0a                           hello.
$ vim -e -s -c ':set bomb' -c ':wq' foo
$ xxd < foo
0000000: efbb bf68 656c 6c6f 0a                   ...hello.

(-emeans runs in ex mode instead of visual mode; -smeans don't print status messages; -cmeans “do this”)

（-e意味着在 ex 模式而不是可视模式下运行；-s意味着不打印状态消息；-c意味着“这样做”）

Answer 5

回答by Martin Wang

Try uconv

试试 uconv

uconv --add-signature

Answer 6

回答by Paused until further notice.

Matthew Flaschen's answer is a good one, however it has a couple of flaws.

Matthew Flaschen 的回答很好，但它有几个缺陷。

There's no check that the copy succeeded before the original file is truncated.It would be better to make everything contingent on a successful copy, or test for the existence of the temporary file, or to operate on the copy. If you're a belt-and-suspenders kind of person, you'd do a combo as I've illustrated below
The lsis unnecessary.
I'd use a better variable name than "i" - perhaps "file".

在原始文件被截断之前，不会检查复制是否成功。最好让一切都取决于成功的副本，或者测试临时文件的存在，或者对副本进行操作。如果你是一个腰带和吊带的人，你会做一个组合，如下图所示
该ls是不必要的。
我会使用比“i”更好的变量名——也许是“file”。

Of course, you could be veryparanoid and check for the existence of the temporary file at the beginning so you don't accidentally overwrite it and/or use a UUID or a generated file name. One of mktemp, tempfile or uuidgen would do the trick.

当然，您可能非常偏执并在开始时检查临时文件是否存在，这样您就不会意外覆盖它和/或使用 UUID 或生成的文件名。mktemp、tempfile 或 uuidgen 之一可以解决问题。

td=TMPDIR
export TMPDIR=

usertemp=~/temp            # set this to use a temp directory on the same filesystem
                           # you could use ./temp to ensure that it's one the same one
                           # you can use mktemp -d to create the dir instead of mkdir

if [[ ! -d $usertemp ]]    # if this user temp directory doesn't exist
then                       # then create it, unless you can't 
    mkdir $usertemp || export TMPDIR=$td    # if you can't create it and TMPDIR is/was
fi                                          # empty then mktemp automatically falls
                                            # back to /tmp

for file in *.sql
do
    # TMPDIR if set overrides the argument to -p
    temp=$(mktemp -p $usertemp) || { echo "$ printf '\xEF\xBB\xBF' > bom.txt
: Unable to create temp file."; exit 1; }

    { printf '\xFF\xFE' > "$temp" &&
    cat "$file" >> "$temp"; } || { echo "$ grep -rl $'\xEF\xBB\xBF' .
./bom.txt
: Write failed on $file"; exit 1; }

    { rm "$file" && 
    mv "$temp" "$file"; } || { echo "##代码##: Replacement failed for $file; exit 1; }
done
export TMPDIR=$td

Traps might be better than all the separate error handlers I've added.

陷阱可能比我添加的所有单独的错误处理程序都要好。

No doubt all this extra caution is overkill for a one-shot script, but these techniques can save you when push comes to shove, especially in a multi-file operation.

毫无疑问，所有这些额外的谨慎对于一次性脚本来说都是多余的，但是这些技术可以在紧急情况下为您提供帮助，尤其是在多文件操作中。

Answer 7

回答by sanmai

##代码##

Then check:

然后检查：

##代码##

bash 如何在 linux 中重新添加 unicode 字节顺序标记？

提问by Neil Trodden

采纳答案by Matthew Flaschen

回答by brillout

回答by yingted

回答by andrewdotn

回答by Martin Wang

回答by Paused until further notice.

回答by sanmai

相关推荐

最近更新

标签

bash 如何在 linux 中重新添加 unicode 字节顺序标记？

提问by Neil Trodden

采纳答案by Matthew Flaschen

回答by brillout

回答by yingted

回答by andrewdotn

回答by Martin Wang

回答by Paused until further notice.

回答by sanmai

相关推荐

bash ">/dev/null 2>&1" 是否有命令行快捷方式

将当前目录保存到 bash 历史记录

bash 如何让emacs shell自动执行init文件？

有用的 BASH 代码片段

相关推荐

最近更新

标签