Parsing an imported text file with Python's re module

Question

Asked by user1478335

def regexread():
    import re

    result = ''
    savefileagain = open('sliceeverfile3.txt','w')

    #text=open('emeverslicefile4.txt','r')
    text='09,11,14,34,44,10,11,  27886637,    0\n561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07,  19070109,    0\n560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06,  13063500,    0\n559, Tue,29,Jan,2013,'

    pattern='\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
    #with open('emeverslicefile4.txt') as text:     
    f = re.findall(pattern,text)

    for item in f:
        print(item)

    savefileagain.write(item)
    #savefileagain.close()

The above function as written parses the text and returns sets of seven numbers. I have three problems.

Firstly, when I read from the 'read' file, which contains exactly the same text as text='09,...etc', I get "TypeError: expected string or buffer", which I cannot solve even after reading some of the posts.
Secondly, when I try to write the results to the 'write' file, nothing is written.
Thirdly, I am not sure how to get the same output in the file that I get with the print statement: three lines of seven numbers each, which is the output that I want.

This is the first time that I have used regex, so be gentle please!

Answer 1

Accepted answer by OmegaOuter

This should do the trick; check the comments for an explanation of what I'm doing here =) Good luck

import re
filename = 'sliceeverfile3.txt'
# Raw string so the \d escapes reach the regex engine unchanged
pattern  = r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
new_file = []

# Make sure file gets closed after being iterated
with open(filename, 'r') as f:
    # Read the file contents and generate a list with each line
    lines = f.readlines()

# Iterate each line
for line in lines:

    # Regex applied to each line
    match = re.search(pattern, line)
    if match:
        # Make sure to add \n to display correctly when we write it back
        new_line = match.group() + '\n'
        print(new_line)
        new_file.append(new_line)

# Note: this overwrites the same file that was read above
with open(filename, 'w') as f:
    # Opening with 'w' truncates the file, so writing starts at the beginning
    f.writelines(new_file)
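
On the TypeError itself: the commented-out line text=open('emeverslicefile4.txt','r') in the question suggests (this is an assumption, since the failing call is not shown) that the file object was passed straight to re.findall, which only accepts a string. A minimal sketch of reading the contents first, assuming the file fits in memory; the helper name find_number_rows and the temporary sample file are hypothetical:

```python
import os
import re
import tempfile

PATTERN = re.compile(r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d')

def find_number_rows(path):
    """Return every run of seven two-digit numbers found in the file."""
    with open(path) as f:
        text = f.read()  # a str, which is what re.findall expects
    return PATTERN.findall(text)

# Demo: a throwaway file holding the sample text from the question
sample = ('09,11,14,34,44,10,11,  27886637,    0\n'
          '561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07,  19070109,    0\n'
          '560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06,  13063500,    0\n'
          '559, Tue,29,Jan,2013,')
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)

rows = find_number_rows(tmp.name)
os.remove(tmp.name)
print('\n'.join(rows))  # one match per line
```

Joining the matches with '\n' gives the same one-match-per-line layout as the print loop in the question.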

Answer 2

Answer by brwnj

You're sort of on the right track...

You'll iterate over the file: How to iterate over the file in python

and apply the regex to each line. The link above should really answer all three of your questions once you notice that you're writing 'item' after the loop has finished, where it only holds the last match.
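
The line-by-line approach could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's own code: the function name extract_rows and the input.txt/output.txt names are hypothetical, and the input and output are kept as separate files. The key point is that the write call sits inside the loop, so every match reaches the output file.

```python
import os
import re
import tempfile

PATTERN = re.compile(r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d')

def extract_rows(src_path, dst_path):
    """Write each seven-number run from src_path to dst_path, one per line."""
    count = 0
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:  # iterating over a file yields one str per line
            for match in PATTERN.findall(line):
                dst.write(match + '\n')  # written inside the loop
                count += 1
    return count  # both files are closed when the with block exits

# Demo in a temporary directory (file names are placeholders)
with tempfile.TemporaryDirectory() as workdir:
    src_path = os.path.join(workdir, 'input.txt')
    dst_path = os.path.join(workdir, 'output.txt')
    with open(src_path, 'w') as f:
        f.write('561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07,  19070109,    0\n'
                '560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06,  13063500,    0\n')
    written = extract_rows(src_path, dst_path)
    with open(dst_path) as f:
        output = f.read()

print(output, end='')
```

Using with for both files also takes care of closing them, which the question's function leaves commented out.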
