Python: read a file from line 2 or skip the header row

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/4796764/

Date: 2020-08-18 17:33:58  Source: igfitidea

Read file from line 2 or skip header row

python file-io

Asked by super9

How can I skip the header row and start reading a file from line 2?

Accepted answer by SilentGhost

with open(fname) as f:
    next(f)  # skip the header line
    for line in f:
        pass  # do something with line
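
One edge case is worth noting: next(f) raises StopIteration when the file has no lines at all. The following is a minimal sketch, not part of the original answer, that passes a default to next() so an empty file yields an empty result. The helper name read_body_lines and the sample contents are made up for illustration.

```python
import os
import tempfile


def read_body_lines(path):
    """Return every line after the first, without trailing newlines."""
    with open(path) as f:
        # The None default makes next() return quietly instead of raising
        # StopIteration when the file has no lines at all.
        next(f, None)
        return [line.rstrip("\n") for line in f]


# Demo on a throwaway file (contents are invented for illustration).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("header\nfirst\nsecond\n")
    demo_path = tmp.name

body = read_body_lines(demo_path)
os.remove(demo_path)
print(body)
```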

Answer by chriscauley

with open(fname, 'r') as f:
    lines = f.readlines()[1:]  # every line except the first

Answer by Dror Hilman

with open(fname) as fh:
    lines = fh.readlines()
first_line = lines.pop(0)  # removes and returns the first line
for line in lines:
    ...

Answer by Vajk Hermecz

If slicing could work on iterators...

from itertools import islice
with open(fname) as f:
    for line in islice(f, 1, None):
        pass
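
The islice call generalizes to any number of leading lines by changing its start argument. Here is a small sketch under that assumption. The helper skip_lines is hypothetical, and io.StringIO stands in for an open file so the example runs without touching disk.

```python
from io import StringIO
from itertools import islice


def skip_lines(lines, count):
    """Lazily yield the items of `lines` after skipping the first `count`."""
    return islice(lines, count, None)


# StringIO stands in for an open file; the sample text is invented.
sample = StringIO("name,unit\nstring,metres\nalpha,1\nbeta,2\n")
rows = [line.strip() for line in skip_lines(sample, 2)]
print(rows)
```

Because islice is lazy, the file is still read one line at a time instead of being loaded into memory as readlines() would do.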

Answer by saimadhu.polamuri

If you want to keep the first line and then perform some operation on the rest of the file, this code will be helpful.

with open(filename, 'r') as f:
    first_line = f.readline()
    for line in f:
        pass  # Perform some operations

Answer by Mauro Rementeria

# Open a connection to the file
with open('world_dev_ind.csv') as file:

    # Skip the column names
    file.readline()

    # Initialize an empty dictionary: counts_dict
    counts_dict = {}

    # Process only the first 1000 rows
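    for j in range(0, 1000):

        # Read the next line; stop early if the file has fewer than 1000 rows
        raw_line = file.readline()
        if not raw_line:
            break

        # Split the current line into a list: line
        line = raw_line.split(',')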

        # Get the value for the first column: first_col
        first_col = line[0]

        # If the column value is in the dict, increment its value
        if first_col in counts_dict:
            counts_dict[first_col] += 1

        # Else, add to the dict and set value to 1
        else:
            counts_dict[first_col] = 1

# Print the resulting dictionary
print(counts_dict)
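
For comparison, the same skip-then-count idea can be sketched more compactly with collections.Counter and itertools.islice. This is an alternative formulation, not the answer's own code. StringIO stands in for world_dev_ind.csv, and the rows are invented.

```python
from collections import Counter
from io import StringIO
from itertools import islice

# StringIO stands in for world_dev_ind.csv; these rows are invented.
sample = StringIO(
    "Country,Year\nArab World,1960\nArab World,1961\nEuro area,1960\n"
)

next(sample, None)  # skip the column names
# Count first-column values over at most 1000 data rows; islice simply
# stops early if the file is shorter than that.
counts = Counter(line.split(",")[0] for line in islice(sample, 1000))
print(dict(counts))
```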

Answer by Minh Tran

To generalize the task of reading multiple header lines and to improve readability, I'd use method extraction. Suppose you wanted to tokenize the first two lines of coordinates.txt to use as header information.

Example

coordinates.txt
---------------
Name,Longitude,Latitude,Elevation, Comments
String, Decimal Deg., Decimal Deg., Meters, String
Euler's Town,7.58857,47.559537,0, "Blah"
Faneuil Hall,-71.054773,42.360217,0
Yellowstone National Park,-110.588455,44.427963,0

Method extraction then lets you specify what you want to do with the header information. In this example we simply split each header line on commas and return the tokens as a list, but there's room to do much more.

def __readheader(filehandle, numberheaderlines=1):
    """Reads the specified number of lines and yields the comma-delimited
    strings on each line as a list"""
    for _ in range(numberheaderlines):
        # A list comprehension (rather than map) gives a real list on Python 3
        yield [token.strip() for token in filehandle.readline().strip().split(',')]

with open('coordinates.txt', 'r') as rh:
    # Single header line
    #print(next(__readheader(rh)))

    # Multiple header lines
    for headerline in __readheader(rh, numberheaderlines=2):
        print(headerline)  # Or do other stuff with headerline tokens

Output

['Name', 'Longitude', 'Latitude', 'Elevation', 'Comments']
['String', 'Decimal Deg.', 'Decimal Deg.', 'Meters', 'String']

If coordinates.txt contains another header line, simply change numberheaderlines. Best of all, it's clear what __readheader(rh, numberheaderlines=2) is doing, and we avoid having to figure out or comment on why the author of the accepted answer uses next() in his code.
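
As one illustration of the "much more" that the extracted header can support, the sketch below pairs each column name with the unit given on the second header line. The names read_header and column_units are hypothetical, and StringIO stands in for coordinates.txt using rows copied from the example above.

```python
from io import StringIO


def read_header(filehandle, number_header_lines=1):
    """Yield each header line as a list of stripped, comma-separated tokens."""
    for _ in range(number_header_lines):
        yield [token.strip() for token in filehandle.readline().split(",")]


# StringIO stands in for coordinates.txt; rows are copied from the example.
sample = StringIO(
    "Name,Longitude,Latitude,Elevation, Comments\n"
    "String, Decimal Deg., Decimal Deg., Meters, String\n"
    "Faneuil Hall,-71.054773,42.360217,0\n"
)

# Unpack the two header lines, then map each column name to its unit.
names, units = read_header(sample, number_header_lines=2)
column_units = dict(zip(names, units))

# The file position now sits at the first data row.
first_data_line = sample.readline().strip()
print(column_units)
```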

Answer by u5675325

If you want to read multiple CSV files starting from line 2, this works like a charm:

import csv

for filename in csv_file_list:  # csv_file_list: a list of CSV file paths
    with open(filename, 'r', newline='') as r:
        next(r)                  # skip the header line
        rr = csv.reader(r)
        for row in rr:
            pass  # do something with row

(This is part of Parfait's answer to a different question.)
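
A possible variation, offered as a sketch rather than as part of the answer: calling next() on the csv.reader instead of on the file object skips one whole CSV record, which matters if a quoted header field contains a line break. StringIO stands in for an open CSV file here, and the data is invented.

```python
import csv
from io import StringIO

# StringIO stands in for an open CSV file; the data is invented.
# The quoted header field contains a line break on purpose.
sample = StringIO('name,"elevation\n(metres)"\nFaneuil Hall,0\n')

reader = csv.reader(sample)
header = next(reader, None)  # consumes the whole header record
rows = list(reader)
print(header)
print(rows)
```

Skipping with next() on the file object would have consumed only the first physical line and left the second half of the header to be parsed as data.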