Python: reading a tab-separated file using a delimiter

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/14229643/

Date: 2020-08-18 10:50:04  Source: igfitidea

python reading a tab separated file using delimiter

python

Asked by Rajeev

I am using the following to read a tab-separated file. There are three columns in the file, but the first column is being ignored when I print only the column header. How can I include the first column too?

f = open("/tmp/data.txt")
for l in f.readlines():
    # Print only the first line (the column header), split on tabs.
    print(l.strip().split("\t"))
    break
f.close()

Output: ['session_id\t', '\tevent_id_concat']

The first column name is id, and it is not printed in the above array.

EDIT

print l yields the following:

EDIT 1:

   'id\tsession_id\tevent_id_concat\r\n'

   Output: ['id\t', '\tevent_id_concat'] 

Answered by elyase

It should work, but it is better to use 'with', which closes the file automatically:

with open('/tmp/data.txt') as f:
    for l in f:
        print(l.strip().split("\t"))

If it doesn't, then your file probably doesn't have the expected format.
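
As a quick sanity check, the same strip-and-split can be run on the raw header line quoted in the question's edit. This is only a minimal sketch: the helper name split_header is made up for illustration and is not part of the question or the answer. For that line the result has three entries, including id, so if the real file behaves differently, printing repr() of the line shows exactly which characters it contains.

```python
def split_header(line):
    """Split one tab-separated line into its column names.

    split_header is a hypothetical helper name used only for this sketch.
    """
    # strip() removes the trailing "\r\n"; split("\t") cuts on each tab.
    return line.strip().split("\t")


# The raw line as shown by "print l" in the question's edit.
raw = 'id\tsession_id\tevent_id_concat\r\n'
columns = split_header(raw)

print(repr(raw))   # makes tabs and line endings visible
print(columns)
```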

Answered by wagnerpeer

I would also suggest using the csv module. It is easy to use and is the best fit if you want to read table-like structures stored in a CSV-like format (delimited by tabs, spaces, or something else).

The module documentation gives good examples, where the simplest usage is stated to be:

import csv
with open('/tmp/data.txt', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

Every row is a list, which is very useful if you want to do index-based manipulations.
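
Because each row is a plain list, individual columns can be reached by position. The following is a minimal sketch rather than the answerer's code: it reads from an in-memory io.StringIO instead of /tmp/data.txt, and the comma-separated sample (including its single data row) is invented for illustration, chosen to match the default delimiter of csv.reader.

```python
import csv
import io

# Hypothetical stand-in for the file; comma-separated because csv.reader
# splits on commas by default. The data row is invented for illustration.
sample = io.StringIO("id,session_id,event_id_concat\r\n1,abc,e1\r\n", newline="")

reader = csv.reader(sample)
header = next(reader)        # first row: the column names
rows = list(reader)          # remaining rows, each a list of strings

first_column_name = header[0]             # index 0 is the first column
session_ids = [row[1] for row in rows]    # index 1 is the second column

print(first_column_name)
print(session_ids)
```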

If you want to change the delimiter, there is a keyword argument for it (delimiter), but I am often fine with the predefined dialects, which can also be selected via a keyword argument (dialect).

import csv
with open('/tmp/data.txt', 'r') as f:
    reader = csv.reader(f, dialect='excel', delimiter='\t')
    for row in reader:
        print(row)
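
For tab-separated data specifically, the csv module also ships a predefined dialect named 'excel-tab', so the delimiter does not have to be spelled out. The sketch below compares the two spellings on the header line from the question; it uses an in-memory io.StringIO as an assumed stand-in for the real file, and the variable names are illustrative only.

```python
import csv
import io

# Header line taken from the question's edit, used here as sample input.
sample_text = "id\tsession_id\tevent_id_concat\r\n"

# Variant 1: the 'excel' dialect with the delimiter overridden by keyword.
by_keyword = list(
    csv.reader(io.StringIO(sample_text, newline=""), dialect="excel", delimiter="\t")
)

# Variant 2: the predefined tab-delimited dialect.
by_dialect = list(
    csv.reader(io.StringIO(sample_text, newline=""), dialect="excel-tab")
)

print(by_keyword[0])
print(by_dialect[0])
```

Both variants should parse the line into the same three column names, so the choice between them is mostly a matter of taste.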

I am not sure if this will fix your problem, but using a well-established module assures you that, if the error remains, something is wrong with your file and not with your code.
