Python: How would I read only the first word of each line of a text file?

Notice: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/23372086/

Date: 2020-08-19 02:51:03  Source: igfitidea

How would I read only the first word of each line of a text file?

python python-3.x

Asked by Hamzah Akhtar

I want to know how to read ONLY the FIRST WORD of each line in a text file. I have tried various snippets and variations on them, but I can only manage to read whole lines from the file. The code I used is shown below:

QuizList = []
with open('Quizzes.txt','r') as f:
    for line in f:
        QuizList.append(line)
line = QuizList[0]
for word in line.split():
    print(word)

This is an attempt to extract only the first word from the first line. To repeat the process for every line, I would do the following:

QuizList = []
with open('Quizzes.txt','r') as f:
    for line in f:
        QuizList.append(line)
capacity = len(QuizList)
capacity = capacity-1
index = 0
while index!=capacity:
    line = QuizList[index]
    for word in line.split():
        print(word)
        index = index+1

Accepted answer by jonrsharpe

You are using split at the wrong point; try:

for line in f:
    QuizList.append(line.split(None, 1)[0]) # add only first word
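
For context, here is a minimal sketch of how the snippet above might sit inside a complete loop. The helper name first_words and the demo file contents are assumptions made for illustration, not part of the original answer. It also guards against blank lines, because split returns an empty list for them and indexing [0] would then raise IndexError.

```python
import os
import tempfile

def first_words(path):
    """Return the first whitespace-delimited word of each non-blank line."""
    words = []
    with open(path, 'r') as f:
        for line in f:
            parts = line.split(None, 1)  # split at most once
            if parts:                    # blank lines give [] and are skipped
                words.append(parts[0])
    return words

# Demo with a throwaway file (hypothetical contents)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Maths quiz one\n\nScience quiz two\n")
demo_path = tmp.name

result = first_words(demo_path)
os.remove(demo_path)
print(result)
```

Whether blank lines should be skipped or recorded as empty strings depends on what the quiz file is expected to contain; skipping is just one reasonable choice.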

Answered by Matthew

Changed to a one-liner that is also more efficient, as Jon Clements suggested in a comment.

with open('Quizzes.txt', 'r') as f:
    wordlist = [line.split(None, 1)[0] for line in f]


This is a side note to your question, but so that line.split(None, 1) doesn't confuse you: it is a bit more efficient because it splits the line at most once.

From the str.split([sep[, maxsplit]]) docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].

' 1 2 3 '.split() returns ['1', '2', '3']

and

' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
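
The behaviour described above can be checked with a short script. The sample strings are made up for illustration. Note the last two cases: a whitespace-only or empty line splits to an empty list, so taking [0] from it would raise IndexError.

```python
# Sample strings are illustrative only
samples = [' 1 2 3 ', 'quiz one\n', '   \n', '']

for s in samples:
    # Show the full split next to the split limited to one cut
    print(repr(s), s.split(), s.split(None, 1))

full = ' 1 2 3 '.split()          # every word
once = ' 1 2 3 '.split(None, 1)   # first word plus the untouched remainder
blank = '   \n'.split(None, 1)    # whitespace-only line
```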

Answered by Samy Arous

You should read one character at a time:

import string

QuizList = []
with open('Quizzes.txt','r') as f:
    for line in f:
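        # scan until the first character that is not an ASCII letter
        for i, c in enumerate(line):
            if c not in string.ascii_letters:  # string.letters exists only in Python 2
                print(line[:i])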
                break
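
As a variation on the same character-by-character idea, the scan can be written with itertools.takewhile. This is a sketch under assumptions, not the answerer's code: the helper name leading_letters and the sample lines are hypothetical, and str.isalpha accepts any Unicode letter rather than only the ASCII letters in string.ascii_letters.

```python
from itertools import takewhile

def leading_letters(line):
    """Collect characters from the start of the line while they are letters."""
    return ''.join(takewhile(str.isalpha, line))

# Hypothetical sample lines
lines = ["Maths quiz\n", "Science2 test\n", "  indented\n", "Art"]
prefixes = [leading_letters(line) for line in lines]
print(prefixes)
```

Unlike the loop above, this version also returns a result for a final line that consists only of letters and has no trailing newline. A line that starts with whitespace or a digit yields an empty string, so this approach treats a "word" more narrowly than str.split does.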

Answered by user3570335

with open(filename, "r") as f:
    wordlist = [r.split()[0] for r in f]

Answered by Jon Clements

I'd go for str.split and similar approaches, but for completeness here's one that uses a combination of mmap and re, in case you need to extract more complicated data:

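import mmap, re

# open in binary mode: mmap exposes the file as bytes
with open('quizzes.txt', 'rb') as fin:
    mf = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    # in Python 3 the pattern must be a bytes pattern; results are bytes objects
    wordlist = re.findall(rb'^(\w+)', mf, flags=re.M)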
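
If the file is small enough to read into memory, the same regular-expression idea can be tried on an ordinary str, which avoids dealing with bytes. The sketch below is an assumption-laden illustration: the sample text is made up, and the pattern additionally allows leading spaces or tabs, which the pattern in the answer does not.

```python
import re

# Hypothetical file contents; in practice this could come from f.read()
text = "Maths quiz one\n  Science quiz two\n\n42 answers\n"

# With re.M, ^ matches at the start of every line.
# [ \t]* skips indentation without crossing a newline.
first_words = re.findall(r'^[ \t]*(\w+)', text, flags=re.M)
print(first_words)
```

Blank lines produce no match, so they are skipped without any special handling.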

Answered by A.Harish kumar

l = []
with open('task-1.txt', 'rt') as myfile:
    # collect every line of the file
    for x in myfile:
        l.append(x)

# print the first word of each line
for i in l:
    print(i.split()[0])