Python loop through a text file reading data

Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/17436709/

Time: 2020-08-19 08:13:07  Source: igfitidea

Tags: python, loops

Asked by jealopez

I am new to Python, and although I am sure this is a trivial question, I have spent my day trying to solve it in different ways. I have a file containing data that looks like this:

<string>
<integer>
<N1>
<N2>
data
data
...
<string>
<integer>
<N3>
<N4>
data
data
...

This pattern repeats a number of times. I need to read the "data", which for the first set (between the first and second <string>) contains N1 X points, N2 Y points, and N1*N2 Z points. If I had only one set of data, I already know how to proceed: read all the data, read the values N1 and N2, slice the data into X, Y and Z, reshape it and use it. But if my file contains more than one set of data, how do I read only from one string until the next one, then repeat the same operation for the next set, and so on until I reach the end of the file? I tried defining a function like:

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line):
                break
            for line in ifile:
                yield line

but it is not working: I get arrays with no data in them. Any comments will be appreciated. Thanks!

Answered by Martijn Pieters

All lines are instances of str, so you break out on the first line. (As written, isinstance('line', str) even tests the string literal 'line' rather than the variable, so it is always true.) Remove that test, and test for an empty line by stripping away whitespace first:

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if not line.strip():
                break
            yield line

I don't think you really need to break at an empty line; the for loop ends on its own at the end of the file.
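To make that concrete, here is a small sketch that uses io.StringIO as a stand-in for the file (the sample text is made up for illustration). It shows that a blank line arrives as "\n", which is truthy until it is stripped, and that the for loop stops at the end of the input without an explicit break.

```python
import io

# Made-up sample input standing in for inpfile.txt; the blank line is deliberate.
sample = io.StringIO("alpha\n\nbeta\n")

lines = []
for line in sample:          # the loop ends by itself at the end of the input
    lines.append(line)

# A blank line is read as "\n": truthy as-is, falsy only after strip().
blank_flags = [(bool(line), bool(line.strip())) for line in lines]

print(lines)
print(blank_flags)
```

This is why the test above strips each line before checking it.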

If your lines contain other sorts of data, you'd need to convert them from strings yourself.
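As a rough sketch of that conversion, assuming each data line holds a single number (the question does not say so), the yielded strings could be passed through float(). The helper name to_floats is hypothetical and not part of the original answer.

```python
def to_floats(lines):
    """Convert an iterable of text lines to floats, skipping blank lines.

    Assumes one number per line; this layout is an assumption, not something
    stated in the question.
    """
    values = []
    for line in lines:
        text = line.strip()
        if text:                      # ignore empty or whitespace-only lines
            values.append(float(text))
    return values


converted = to_floats(["1.5\n", " 2\n", "\n", "3e2\n"])
print(converted)
```

Something like to_floats(dat_fun()) would then apply the same conversion to the lines yielded by the generator.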

Answered by Rushy Panchal

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line): # 'line' is always a str, and so is the line itself
                break 
            for line in ifile:
                yield line

Change this to:

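def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            # Lines read by iteration keep their trailing newline, so a bare
            # `not line` is never true; strip before testing for a blank line.
            if not line.strip():
                break
            yield line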

Answered by Rob Watts

With structured data like this, I'd suggest just reading what you need. For example:

with open("inpfile.txt", "r") as ifile:
    first_string = ifile.readline().strip() # Is this the name of the data set?
    first_integer = int(ifile.readline()) # You haven't told us what this is, either
    n_one = int(ifile.readline())
    n_two = int(ifile.readline())

    x_vals = []
    y_vals = []
    z_vals = []

    for index in range(n_one):
        x_vals.append(ifile.readline().strip())
    for index in range(n_two):
        y_vals.append(ifile.readline().strip())
    for index in range(n_one*n_two):
        z_vals.append(ifile.readline().strip())

You can turn this into a dataset generating function by adding a loop and yielding the values:

def read_datasets():  # hypothetical name; yield is only valid inside a function
    with open("inpfile.txt", "r") as ifile:
        while True:
            first_string = ifile.readline().strip() # Is this the name of the data set?
            if first_string == '':
                break  # readline() returns '' at the end of the file
            first_integer = int(ifile.readline()) # You haven't told us what this is, either
            n_one = int(ifile.readline())
            n_two = int(ifile.readline())

            x_vals = []
            y_vals = []
            z_vals = []

            for index in range(n_one):
                x_vals.append(ifile.readline().strip())
            for index in range(n_two):
                y_vals.append(ifile.readline().strip())
            for index in range(n_one*n_two):
                z_vals.append(ifile.readline().strip())
            yield (x_vals, y_vals, z_vals) # and the first string and integer if you need those
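
For illustration, here is a self-contained variation of the generator above that takes an already open file object, so it can be tried on an in-memory sample. The name iter_datasets, the sample contents, the conversion to float, and the row-by-row layout of the Z values are all assumptions made for this sketch; the question does not specify them.

```python
import io


def iter_datasets(ifile):
    """Yield (name, number, x, y, z) for each data set in an open text file."""
    while True:
        name = ifile.readline().strip()
        if name == '':                      # '' means end of file (or a blank line)
            break
        number = int(ifile.readline())
        n_one = int(ifile.readline())
        n_two = int(ifile.readline())
        x = [float(ifile.readline()) for _ in range(n_one)]
        y = [float(ifile.readline()) for _ in range(n_two)]
        flat_z = [float(ifile.readline()) for _ in range(n_one * n_two)]
        # Assumed layout: Z is stored row by row, n_one rows of n_two values each.
        z = [flat_z[i * n_two:(i + 1) * n_two] for i in range(n_one)]
        yield name, number, x, y, z


# Made-up input holding two data sets (a 2x1 grid and a 1x2 grid).
sample = io.StringIO(
    "first\n7\n2\n1\n0.0\n1.0\n5.0\n10.0\n20.0\n"
    "second\n8\n1\n2\n0.5\n3.0\n4.0\n30.0\n40.0\n"
)

datasets = list(iter_datasets(sample))
for name, number, x, y, z in datasets:
    print(name, number, x, y, z)
```

Passing the file object in, rather than opening "inpfile.txt" inside the function, keeps the parsing logic testable; with a real file it would be called as iter_datasets(ifile) inside a with open(...) block.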