Python: delete the last character in a file

Question

Asked by user2681562

After looking all over the Internet, I've come to this.

Let's say I have already made a text file that reads: Hello World

Well, I want to remove the very last character (in this case d) from this text file.

So now the text file should look like this: Hello Worl

But I have no idea how to do this.

All I want, more or less, is a single backspace function for text files on my HDD.

This needs to work on Linux as that's what I'm using.

Answer 1

Accepted answer by Martijn Pieters

Use fileobject.seek() to seek 1 position from the end, then use file.truncate() to remove the remainder of the file:

import os

with open(filename, 'rb+') as filehandle:
    filehandle.seek(-1, os.SEEK_END)
    filehandle.truncate()

This works fine for single-byte encodings. If you have a multi-byte encoding (such as UTF-16 or UTF-32) you need to seek back enough bytes from the end to account for a single codepoint.
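
For a fixed-width encoding, the seek distance is simply the width of one code unit. Here is a minimal sketch, assuming the file is UTF-32 (4 bytes per code point) and holds at least one character. The name remove_last_fixed_width and its width parameter are illustrative and not part of the answer above:

```python
import os


def remove_last_fixed_width(filename, width=4):
    """Drop the final `width` bytes, e.g. one UTF-32 code point when width is 4."""
    with open(filename, 'rb+') as filehandle:
        # Assumes the file holds at least `width` bytes;
        # on a shorter file this seek() raises OSError.
        filehandle.seek(-width, os.SEEK_END)
        filehandle.truncate()
```

Note that UTF-16 uses 2 bytes for most characters but 4 bytes for characters outside the Basic Multilingual Plane, so a fixed width of 2 is only safe when the file ends in a character that is not part of a surrogate pair.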

For variable-byte encodings, whether you can use this technique at all depends on the codec. For UTF-8, you need to find the first byte (from the end) where bytevalue & 0xC0 != 0x80 is true, and truncate from that point on. That ensures you don't truncate in the middle of a multi-byte UTF-8 codepoint:

with open(filename, 'rb+') as filehandle:
    # move to the last byte, then scan backward until a non-continuation byte is found
    filehandle.seek(-1, os.SEEK_END)
    # read(1) returns a bytes object, so convert it to an integer before masking
    while ord(filehandle.read(1)) & 0xC0 == 0x80:
        # we just read 1 byte, which moved the file position forward,
        # skip back 2 bytes to move to the byte before the current.
        filehandle.seek(-2, os.SEEK_CUR)

    # last read byte is our truncation point, move back to it.
    filehandle.seek(-1, os.SEEK_CUR)
    filehandle.truncate()

Note that UTF-8 is a superset of ASCII, so the above works for ASCII-encoded files too.
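
The same backward scan can be wrapped in a function that tracks the position explicitly and tolerates an empty file. This is only a sketch under the assumption that the file is valid UTF-8; the name remove_last_utf8_char is made up for illustration:

```python
import os


def remove_last_utf8_char(filename):
    """Truncate the file just before its last UTF-8 encoded code point."""
    with open(filename, 'rb+') as filehandle:
        size = filehandle.seek(0, os.SEEK_END)
        if size == 0:
            return  # nothing to remove from an empty file
        position = size - 1
        filehandle.seek(position)
        # Continuation bytes look like 0b10xxxxxx; step back until a lead byte is found.
        while position > 0 and ord(filehandle.read(1)) & 0xC0 == 0x80:
            position -= 1
            filehandle.seek(position)
        # `position` is now the offset of the lead byte, which becomes the new file size.
        filehandle.truncate(position)
```

Passing the offset to truncate() avoids the final seek back by one byte, since the file position no longer matters at that point.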

Answer 2

Answer by dawg

with open(urfile, 'rb+') as f:
    f.seek(0, 2)                # seek to the end of the file
    size = f.tell()             # the position there is the file size in bytes
    f.truncate(size - 1)        # cut off the last byte; subtract more to remove more

Be sure to use binary mode on Windows, since Unix file line endings may return an illegal or incorrect character count.
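
If only the size arithmetic is needed, the standard library can also truncate by path. A small sketch, assuming the last character occupies exactly one byte; drop_last_byte is a hypothetical helper name:

```python
import os


def drop_last_byte(path):
    """Shrink the file at `path` by one byte without opening it in Python."""
    size = os.path.getsize(path)
    if size > 0:
        # os.truncate() sets the file length directly, much like f.truncate(size - 1).
        os.truncate(path, size - 1)
```

This does the same thing as the snippet above, minus the explicit seek and tell calls.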

Answer 3

Answer by quasoft

Martijn's accepted answer is simple and mostly works, but it does not account for text files with:

UTF-8 encoding containing non-English characters (UTF-8 is typically the default encoding for text files in Python 3 on Linux)
one newline character at the end of the file (which is the default in Linux editors like vim or gedit)

If the text file contains non-English characters, neither of the answers provided so far would work.

What follows is an example that solves both problems and also allows removing more than one character from the end of the file:

import os


def truncate_utf8_chars(filename, count, ignore_newlines=True):
    """
    Truncates last `count` characters of a text file encoded in UTF-8.
    :param filename: The path to the text file to read
    :param count: Number of UTF-8 characters to remove from the end of the file
    :param ignore_newlines: Set to true, if the newline character at the end of the file should be ignored
    """
    with open(filename, 'rb+') as f:
        last_char = None

        size = os.fstat(f.fileno()).st_size

        offset = 1
        chars = 0
        while offset <= size:
            f.seek(-offset, os.SEEK_END)
            b = ord(f.read(1))

            if ignore_newlines:
                if b == 0x0D or b == 0x0A:
                    offset += 1
                    continue

            if b & 0b10000000 == 0 or b & 0b11000000 == 0b11000000:
                # This is the first byte of a UTF8 character
                chars += 1
                if chars == count:
                    # When `count` number of characters have been found, move current position back
                    # with one byte (to include the byte just checked) and truncate the file
                    f.seek(-1, os.SEEK_CUR)
                    f.truncate()
                    return
            offset += 1

How it works:

Reads only the last few bytes of a UTF-8 encoded text file, in binary mode
Iterates over the bytes backwards, looking for the start of a UTF-8 character
Once `count` characters (not counting newlines) have been found, truncates the file at the start of the last one found

Sample text file - bg.txt:

Здравей свят

How to use:

filename = 'bg.txt'
print('Before truncate:', open(filename).read())
truncate_utf8_chars(filename, 1)
print('After truncate:', open(filename).read())

Outputs:

Before truncate: Здравей свят
After truncate: Здравей свя

This works with both UTF-8 and ASCII encoded files.
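
The byte test used inside truncate_utf8_chars can be checked in isolation. The helper name is_utf8_lead_byte below is invented for this illustration; the bit masks are the same ones used in the function:

```python
def is_utf8_lead_byte(b):
    """Return True if the integer byte value `b` starts a UTF-8 character."""
    # 0xxxxxxx is a single-byte (ASCII) character; 11xxxxxx starts a multi-byte one.
    # Anything of the form 10xxxxxx is a continuation byte.
    return b & 0b10000000 == 0 or b & 0b11000000 == 0b11000000


encoded = 'я!'.encode('utf-8')  # b'\xd1\x8f!'
flags = [is_utf8_lead_byte(b) for b in encoded]
print(flags)  # [True, False, True]
```

The Cyrillic letter takes two bytes, and only the first of them counts as the start of a character, which is why the loop has to keep stepping backwards past continuation bytes.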

Answer 4

Answer by vins mv

Here is a dirty way (erase and recreate). I don't advise using it, but it is possible to do it like this:

import os
with open("file") as f:
    x = f.read()
os.remove("file")
with open("file", "w") as f:  # "w" is required; the default mode is read-only
    f.write(x[:-1])
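
If rewriting the whole file is acceptable, a somewhat safer variant of the same idea writes the new contents to a temporary file and then swaps it in, so the original is never left half-written. This is only a sketch: it assumes the file fits in memory and that its encoding is known (UTF-8 here), and rewrite_without_last_char is a made-up name:

```python
import os
import tempfile


def rewrite_without_last_char(path, encoding='utf-8'):
    """Rewrite `path` minus its last character via a temporary file."""
    # newline='' disables newline translation so line endings are preserved as-is.
    with open(path, encoding=encoding, newline='') as source:
        text = source.read()
    directory = os.path.dirname(os.path.abspath(path))
    # Write next to the original so os.replace() stays on one filesystem.
    with tempfile.NamedTemporaryFile('w', encoding=encoding, newline='',
                                     dir=directory, delete=False) as tmp:
        tmp.write(text[:-1])
    os.replace(tmp.name, path)
```

Because the slicing happens on decoded text, this removes one character regardless of how many bytes it occupies. The replaced file takes on the temporary file's permissions, so restore them afterwards if that matters.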

Answer 5

Answer by metinsenturk

If you are not working with the file in binary mode and only have it open with 'w' access, I can suggest the following.

f.seek(f.tell() - 1, os.SEEK_SET)
f.truncate()

In the code above, f.seek() is given an absolute position computed from f.tell(), because without 'b' access a text file does not accept non-zero seeks relative to the end. This sets the cursor to the start of the last element, assuming f is positioned at the end of the file and the last character is a single byte. f.truncate() then deletes that last element; writing an empty string at that position would leave the file unchanged.
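
One situation where this is handy is trimming a trailing separator from a file that is still being written in text mode. The sketch below assumes ASCII content, so that one character is one byte; write_comma_separated and its parameters are hypothetical names:

```python
import os


def write_comma_separated(path, items):
    """Write items separated by commas, then remove the trailing comma."""
    with open(path, 'w') as f:
        for item in items:
            f.write(item + ',')
        if items:
            # In text mode the target must be an absolute offset. Subtracting 1 from
            # tell() is only meaningful here because the text is assumed to be ASCII.
            f.seek(f.tell() - 1, os.SEEK_SET)
            f.truncate()
```

For example, write_comma_separated('out.txt', ['a', 'b', 'c']) leaves a,b,c in the file rather than a,b,c, with a dangling comma.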

Answer 6

Answer by Coddy

with open('file.txt', 'r+') as f:  # 'r+' keeps the contents; 'w' would empty the file on open
    f.seek(0, 2)              # seek to end of file; same as f.seek(0, os.SEEK_END)
    f.seek(f.tell() - 2, 0)   # seek to the second-to-last char; same as f.seek(f.tell() - 2, os.SEEK_SET)
    f.truncate()

The offset of 2 depends on what the last character of the file is, which could be a newline (\n) or anything else.
