Python: looping through a text file to read data

Question

Asked by jealopez

I am new to Python, and although I am sure this is a trivial question, I have spent my day trying to solve it in different ways. I have a file containing data that looks like this:

<string>
<integer>
<N1>
<N2>
data
data
...
<string>
<integer>
<N3>
<N4>
data
data
...

And that pattern repeats a number of times. I need to read the "data", which for the first set (between the first and second <string>) contains N1 X points, N2 Y points and N1*N2 Z points. If I had only one set of data, I already know how to read all the data, read the values N1 and N2, slice the data into X, Y and Z, reshape it and use it. But if my file contains more than one set of data, how do I read only from one string until the next one, repeat the same operation for the next set, and continue until I reach the end of the file? I tried defining a function like:

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line):
                break
            for line in ifile:
                yield line

but it is not working; I get arrays with no data in them. Any comments will be appreciated. Thanks!

Answer 1

Answered by Martijn Pieters

All lines are instances of str, so you break out on the first line. Remove that test, and test for an empty line by stripping away whitespace first:

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if not line.strip():
                break
            yield line

I don't think you need to break at an empty line, really; the for loop ends on its own at the end of the file.
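
As a small illustration of why the original test always fired and why strip() is needed, here is a sketch that uses an in-memory file. The io.StringIO object and its contents are made up for this example and merely stand in for inpfile.txt.

```python
import io

# Hypothetical sample contents standing in for inpfile.txt.
sample = io.StringIO("header\n3\n\n1.5\n")

# isinstance('line', str) tests the string literal 'line', not the loop
# variable, so it is True no matter what the file contains.
literal_check = isinstance('line', str)

# Iterating over a file yields each line with its trailing newline, and the
# loop stops by itself when the file is exhausted.
lines = list(sample)

# A blank line arrives as "\n", which is a non-empty (truthy) string;
# only after strip() does it become empty.
truthy_flags = [bool(line) for line in lines]
blank_flags = [not line.strip() for line in lines]

print(literal_check)
print(lines)
print(truthy_flags)
print(blank_flags)
```

Every raw line is truthy here, so a bare "if not line" test would never detect the blank line; the stripped test does.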

If your lines contain other sorts of data, you'd need to do the conversion yourself, starting from a string.
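
A minimal sketch of such a conversion, assuming each line holds exactly one number. The helper name to_number and the sample lines are hypothetical and not part of the original question.

```python
def to_number(text):
    """Convert one line of text to an int if possible, otherwise to a float."""
    text = text.strip()  # drop the trailing newline and surrounding spaces
    try:
        return int(text)
    except ValueError:
        # Not a plain integer literal; fall back to float, which raises
        # ValueError itself if the text is not numeric at all.
        return float(text)

# Hypothetical lines as they would come out of the file iterator.
raw_lines = ["3\n", "2\n", "0.5\n", "1e-3\n"]
values = [to_number(line) for line in raw_lines]
print(values)
```

Non-numeric lines, such as the <string> header of each set, would raise ValueError here, so they need to be handled before calling the helper.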

Answer 2

Answered by Rushy Panchal

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line): # 'line' is always a str, and so is the line itself
                break 
            for line in ifile:
                yield line

Change this to:

def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if not line:  # note: never true here, every line from the file is a non-empty str
                break
            yield line

Answer 3

Answered by Rob Watts

With structured data like this, I'd suggest just reading what you need. For example:

with open("inpfile.txt", "r") as ifile:
    first_string = ifile.readline().strip() # Is this the name of the data set?
    first_integer = int(ifile.readline()) # You haven't told us what this is, either
    n_one = int(ifile.readline())
    n_two = int(ifile.readline())

    x_vals = []
    y_vals = []
    z_vals = []

    for index in range(n_one):
        x_vals.append(ifile.readline().strip())
    for index in range(n_two):
        y_vals.append(ifile.readline().strip())
    for index in range(n_one*n_two):
        z_vals.append(ifile.readline().strip())

You can turn this into a dataset generating function by adding a loop and yielding the values:

def dat_fun():  # yield is only valid inside a function, so the loop is wrapped in one
    with open("inpfile.txt", "r") as ifile:
        while True:
            first_string = ifile.readline().strip() # Is this the name of the data set?
            if first_string == '':
                break # readline() returns '' at end of file (a blank line also stops here)
            first_integer = int(ifile.readline()) # You haven't told us what this is, either
            n_one = int(ifile.readline())
            n_two = int(ifile.readline())

            x_vals = []
            y_vals = []
            z_vals = []

            for index in range(n_one):
                x_vals.append(ifile.readline().strip())
            for index in range(n_two):
                y_vals.append(ifile.readline().strip())
            for index in range(n_one*n_two):
                z_vals.append(ifile.readline().strip())
            yield (x_vals, y_vals, z_vals) # and the first string and integer if you need those
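
For reference, here is a self-contained sketch of the same idea that can be run without a file on disk. It rests on several assumptions that the question does not confirm: each data line holds exactly one number, blank lines between sets can be skipped, and the values should become floats. The function name read_datasets and the sample contents are hypothetical.

```python
import io
from itertools import islice

def read_datasets(lines):
    """Yield (name, number, x, y, z) for each data set in an iterable of lines."""
    lines = iter(lines)
    for header in lines:
        name = header.strip()
        if not name:
            continue  # assumption: blank lines between sets are just separators
        number = int(next(lines))  # the unexplained <integer> field
        n_one = int(next(lines))
        n_two = int(next(lines))
        # islice consumes exactly the requested number of lines from the iterator.
        x = [float(v) for v in islice(lines, n_one)]
        y = [float(v) for v in islice(lines, n_two)]
        z = [float(v) for v in islice(lines, n_one * n_two)]
        yield name, number, x, y, z

# Made-up sample with two sets; an open file object could be passed instead.
sample = io.StringIO(
    "set A\n7\n2\n1\n"
    "0.0\n1.0\n"    # N1 = 2 X points
    "10.0\n"        # N2 = 1 Y point
    "5.0\n6.0\n"    # N1*N2 = 2 Z points
    "set B\n8\n1\n2\n"
    "0.5\n"         # 1 X point
    "1.0\n2.0\n"    # 2 Y points
    "3.0\n4.0\n"    # 2 Z points
)
datasets = list(read_datasets(sample))
for name, number, x, y, z in datasets:
    print(name, number, len(x), len(y), len(z))
```

Because the function only needs an iterable of lines, the same code should work with "with open("inpfile.txt") as ifile: ... read_datasets(ifile)". A file that ends in the middle of a header would raise an error rather than yield a partial set.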
