Python: splitting a string based on tabs in a file

Question

Asked by hjelpmig

I have a file that contains values separated by tabs ("\t"). I am trying to create a list and store all of the file's values in it, but I have run into a problem. Here is my code:

line = "abc\tdef\tghi"
values = line.split("\t")

It works fine as long as there is only one tab between each value. But if there is more than one tab, it copies the tab to values as well. In my case, the extra tab will mostly come after the last value in the file.

Answer 1

Accepted answer by Ashwini Chaudhary

You can use regex here:

>>> import re
>>> strs = "foo\tbar\t\tspam"
>>> re.split(r'\t+', strs)
['foo', 'bar', 'spam']

Update:

You can use str.rstrip to get rid of the trailing '\t' and then apply the regex.

>>> yas = "yas\t\tbs\tcda\t\t"
>>> re.split(r'\t+', yas.rstrip('\t'))
['yas', 'bs', 'cda']
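
Applied to a whole file, as in the question, the same idea might look like the sketch below. The helper name split_tabbed_lines and the sample data are made up for illustration, and io.StringIO stands in for a real open file. Note that a line starting with a tab would still produce a leading empty string with this approach.

```python
import io
import re

TAB_RUN = re.compile(r'\t+')

def split_tabbed_lines(lines):
    """Collect every tab-separated value from an iterable of lines.

    Hypothetical helper, not part of the original answer.
    """
    values = []
    for line in lines:
        # Drop the line ending and any trailing tabs before splitting.
        stripped = line.rstrip('\t\r\n')
        if stripped:
            values.extend(TAB_RUN.split(stripped))
    return values

# io.StringIO stands in for an open file here.
sample = io.StringIO("abc\tdef\t\tghi\t\n" "yas\t\tbs\tcda\t\t\n")
print(split_tabbed_lines(sample))
# ['abc', 'def', 'ghi', 'yas', 'bs', 'cda']
```

With a real file you would pass the file object itself, since iterating over it yields one line at a time.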

Answer 2

Answered by DimmuR

You can use a regexp to do this:

>>> import re
>>> patt = re.compile(r"[^\t]+")
>>> s = "a\t\tbcde\t\tef"
>>> patt.findall(s)
['a', 'bcde', 'ef']
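
One thing worth pointing out: because findall collects runs of non-tab characters rather than splitting on tabs, leading and trailing tabs should not need any special handling. A small sketch with a made-up input string, compared against re.split:

```python
import re

patt = re.compile(r"[^\t]+")
s = "\ta\t\tbcde\t\tef\t\t"  # sample input with leading and trailing tabs

fields = patt.findall(s)
print(fields)               # ['a', 'bcde', 'ef']

# Splitting on runs of tabs leaves empty strings at both ends.
print(re.split(r"\t+", s))  # ['', 'a', 'bcde', 'ef', '']
```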

Answer 3

Answered by CornSmith

Split on tab, then remove all empty matches.

text = "hi\tthere\t\t\tmy main man"
# Compare with != rather than "is not": identity checks against a string literal are unreliable.
print([part for part in text.split("\t") if part != ""])

Output:

['hi', 'there', 'my main man']
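
If you prefer, the same filtering can be written with the built-in filter. This is only an equivalent spelling of the idea above, not something the original answer used:

```python
text = "hi\tthere\t\t\tmy main man"

# filter(None, ...) keeps only truthy items, so empty strings are dropped.
parts = list(filter(None, text.split("\t")))
print(parts)  # ['hi', 'there', 'my main man']
```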

Answer 4

Answered by Sylvain Leroux

Python supports CSV files through the eponymous csv module. The name is something of a misnomer, since the module supports much more than just comma-separated values.

If you need to go beyond basic word splitting, you should take a look at it, for example if you need to deal with quoted values.
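
A minimal sketch of what that could look like for tab-separated data. The sample line is made up, and io.StringIO stands in for a real file (which the csv docs recommend opening with newline=''). Note that csv does not collapse repeated tabs: it reports them as empty fields, so this sketch filters those out afterwards.

```python
import csv
import io

# io.StringIO stands in for a file opened with open(path, newline='').
sample = io.StringIO('abc\t"def\tghi"\t\tjkl\t\n')

rows = []
for row in csv.reader(sample, delimiter='\t'):
    # Consecutive or trailing tabs yield empty fields; drop them.
    rows.append([field for field in row if field])

print(rows)  # [['abc', 'def\tghi', 'jkl']]
```

The quoted value keeps its embedded tab as part of a single field, which plain str.split or the regexes above would not do.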

Answer 5

Answered by Sylvain Leroux

Another regex-based solution:

>>> import re
>>> strs = "foo\tbar\t\tspam"
>>> r = re.compile(r'([^\t]*)\t*')
>>> r.findall(strs)[:-1]
['foo', 'bar', 'spam']
