Extracting values between two strings in a text file using Python

Question

Asked by user2790219

Let's say I have a text file with the following content:

fdsjhgjhg
fdshkjhk
Start
Good Morning
Hello World
End
dashjkhjk
dsfjkhk

Now I need to write Python code that reads the text file and copies the contents between Start and End to another file.

I wrote the following code:

inFile = open("data.txt")
outFile = open("result.txt", "w")
buffer = []
keepCurrentSet = True
for line in inFile:
    buffer.append(line)
    if line.startswith("Start"):
        #---- starts a new data set
        if keepCurrentSet:
            outFile.write("".join(buffer))
        #now reset our state
        keepCurrentSet = False
        buffer = []
    elif line.startswith("End"):
        keepCurrentSet = True
inFile.close()
outFile.close()

I'm not getting the desired output; I only get Start. What I want is all the lines between Start and End, excluding Start and End themselves.

Answer 1

Accepted answer by inspectorG4dget

In case your text file contains multiple "Start" and "End" markers, this copies the data from every block into the output, excluding the "Start" and "End" lines themselves.

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
            continue
        elif line.strip() == "End":
            copy = False
            continue
        elif copy:
            outfile.write(line)
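
The same flag-based loop can be wrapped in a generator that accepts any iterable of lines, which makes it easy to try without touching the disk. This is only a sketch: the name extract_between and its start/end parameters are hypothetical and not part of the answer above.

```python
def extract_between(lines, start="Start", end="End"):
    """Yield the lines found between start and end markers, excluding the markers."""
    copy = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            copy = True       # begin copying after this marker line
        elif stripped == end:
            copy = False      # stop copying; the marker itself is skipped
        elif copy:
            yield line


# Sample lines taken from the question's file content
sample = ["fdsjhgjhg\n", "Start\n", "Good Morning\n", "Hello World\n", "End\n", "dsfjkhk\n"]
result = list(extract_between(sample))
print(result)
```

Because an open file object is itself an iterable of lines, it should be possible to pass it directly, for example outfile.writelines(extract_between(infile)).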

Answer 2

Answered by Rafi Kamal

I'm not a Python expert, but this code should do the job.

inFile = open("data.txt")
outFile = open("result.txt", "w")
keepCurrentSet = False
for line in inFile:
    if line.startswith("End"):
        keepCurrentSet = False

    if keepCurrentSet:
        outFile.write(line)

    if line.startswith("Start"):
        keepCurrentSet = True
inFile.close()
outFile.close()

Answer 3

Answered by TerryA

If the text file isn't very large, you can read its whole content and then use a regular expression:

import re

with open('data.txt') as myfile:
    content = myfile.read()

# Capture only the text between the markers so Start and End are not written out
match = re.search(r'Start\n(.*?)End', content, re.DOTALL)
if match:
    with open("result.txt", "w") as myfile2:
        myfile2.write(match.group(1))
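
The regular-expression approach can also collect every Start/End block rather than only the first one. The sketch below assumes each marker sits alone on its own line; the sample content is hypothetical, modelled on the question's file with a second block added.

```python
import re

# Hypothetical sample mirroring the question's file, with a second block added
content = (
    "fdsjhgjhg\n"
    "Start\n"
    "Good Morning\n"
    "Hello World\n"
    "End\n"
    "dashjkhjk\n"
    "Start\n"
    "Good Night\n"
    "End\n"
)

# MULTILINE anchors ^ and $ at line boundaries; DOTALL lets "." span newlines.
# The capture group keeps only the text between the marker lines.
pattern = re.compile(r"^Start\n(.*?)^End$", re.MULTILINE | re.DOTALL)
blocks = pattern.findall(content)
print("".join(blocks))
```

Anchoring the markers to whole lines avoids matching a word such as "Ending" inside the data, at the cost of requiring the markers to have no surrounding whitespace.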

Answer 4

Answered by pts

Move the outFile.write call into the second if:

inFile = open("data.txt")
outFile = open("result.txt", "w")
buffer = []
for line in inFile:
    if line.startswith("Start"):
        buffer = ['']
    elif line.startswith("End"):
        outFile.write("".join(buffer))
        buffer = []
    elif buffer:
        buffer.append(line)
inFile.close()
outFile.close()

Answer 5

Answered by falsetru

Using itertools.dropwhile, itertools.takewhile, itertools.islice:

import itertools

with open('data.txt') as f, open('result.txt', 'w') as fout:
    it = itertools.dropwhile(lambda line: line.strip() != 'Start', f)
    it = itertools.islice(it, 1, None)
    it = itertools.takewhile(lambda line: line.strip() != 'End', it)
    fout.writelines(it)

UPDATE: As inspectorG4dget commented, the code above copies only the first block. To copy multiple blocks, use the following:

import itertools

with open('data.txt', 'r') as f, open('result.txt', 'w') as fout:
    while True:
        it = itertools.dropwhile(lambda line: line.strip() != 'Start', f)
        if next(it, None) is None: break
        fout.writelines(itertools.takewhile(lambda line: line.strip() != 'End', it))

Answer 6

Answered by Gaurav

import re

inFile = open("data.txt")
outFile = open("result.txt", "w")
buffer1 = inFile.read()

# DOTALL lets "." match newlines; the lookarounds keep the markers out of the result
blocks = re.findall(r"(?<=Start\n)(.*?)(?=End)", buffer1, re.DOTALL)
outFile.write("".join(blocks))
inFile.close()
outFile.close()

Answer 7

Answered by user2787688

I would handle it like this:

inFile = open("data.txt")
outFile = open("result.txt", "w")

data = inFile.readlines()

outFile.write("".join(data[data.index('Start\n')+1:data.index('End\n')]))
inFile.close()
outFile.close()
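
Note that list.index raises ValueError when a marker is missing, and the slice comes out empty if an End line appears before the first Start. A hedged sketch of one way to guard against both; the helper name slice_between is hypothetical and not from the answer:

```python
def slice_between(lines, start="Start\n", end="End\n"):
    """Return the lines between the first start marker and the next end marker."""
    try:
        begin = lines.index(start) + 1
        stop = lines.index(end, begin)  # look for End only after Start
    except ValueError:
        return []  # one of the markers is missing
    return lines[begin:stop]


# Sample lines taken from the question's file content
data = ["fdshkjhk\n", "Start\n", "Good Morning\n", "Hello World\n", "End\n", "dsfjkhk\n"]
print(slice_between(data))
print(slice_between(["no markers here\n"]))
```

Like the original, this handles only the first block and expects each marker line to end with a newline.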
