Python: How would I read only the first word of each line of a text file?
Original question: http://stackoverflow.com/questions/23372086/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How would I read only the first word of each line of a text file?
Asked by Hamzah Akhtar
I wanted to know how I could read ONLY the FIRST WORD of each line in a text file. I tried various snippets and tried altering them, but I can only manage to read whole lines from the text file. The code I used is shown below:
QuizList = []
with open('Quizzes.txt','r') as f:
    for line in f:
        QuizList.append(line)
line = QuizList[0]
for word in line.split():
    print(word)
This is an attempt to extract only the first word from the first line. To repeat the process for every line, I would do the following:
QuizList = []
with open('Quizzes.txt','r') as f:
    for line in f:
        QuizList.append(line)
capacity = len(QuizList)
capacity = capacity-1
index = 0
while index!=capacity:
    line = QuizList[index]
    for word in line.split():
        print(word)
    index = index+1
Accepted answer by jonrsharpe
Answer by Matthew
Changed to a one-liner that's also more efficient with the strip as Jon Clements suggested in a comment.
with open('Quizzes.txt', 'r') as f:
    wordlist = [line.split(None, 1)[0] for line in f]
This is a side note to your question, but so that line.split(None, 1) doesn't confuse you: it's a bit more efficient because it splits the line at most once, leaving the rest of the line as a single string.
From the str.split([sep[, maxsplit]]) docs:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
' 1 2 3 '.split() returns ['1', '2', '3'], and ' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
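One caveat with the one-liner above: as the quoted docs say, a blank or whitespace-only line splits to [], so indexing [0] on it raises IndexError. Below is a minimal sketch that skips such lines. The function name first_words and the sample data are made up for illustration, and io.StringIO only stands in for an open text file.

```python
import io

def first_words(lines):
    """Return the first whitespace-delimited word of each non-blank line."""
    words = []
    for line in lines:
        parts = line.split(None, 1)  # split at most once, on any whitespace
        if parts:                    # blank lines yield [], so skip them
            words.append(parts[0])
    return words

# io.StringIO stands in for an open text file (hypothetical sample data)
sample = io.StringIO("alpha one\n\n  beta two three\ngamma\n")
print(first_words(sample))  # ['alpha', 'beta', 'gamma']
```

With a real file you would pass the file object itself, e.g. first_words(f) inside the with block.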
Answer by Samy Arous
You should read one character at a time:
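import string
with open('Quizzes.txt','r') as f:
    for line in f:
        for i, c in enumerate(line):
            # string.letters was removed in Python 3; use ascii_letters
            if c not in string.ascii_letters:
                # everything before the first non-letter is the first word
                print(line[:i])
                break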
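Note that the loop above only prints once it hits a non-letter, so a final line made up entirely of letters with no trailing newline prints nothing. A small sketch of the same character-by-character idea that avoids this, using itertools.takewhile and string.ascii_letters (the Python 3 name). The helper name leading_letters and the sample lines are hypothetical.

```python
import itertools
import string

def leading_letters(line):
    """Return the run of ASCII letters at the start of line (may be empty)."""
    return ''.join(itertools.takewhile(lambda c: c in string.ascii_letters, line))

# Hypothetical sample lines, including one without a trailing newline
for line in ["alpha one\n", "beta", "42 gamma\n"]:
    print(repr(leading_letters(line)))
```

A line that starts with a digit or a space yields an empty string here, which is one reason the str.split approaches are usually the simpler choice.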
Answer by user3570335
with open(filename,"r") as f:
    wordlist = [r.split()[0] for r in f]
Answer by Jon Clements
I'd go for the str.split and similar approaches, but for completeness here's one that uses a combination of mmap and re if you needed to extract more complicated data:
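import mmap, re
with open('quizzes.txt', 'rb') as fin:
    mf = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    # On Python 3 an mmap is bytes-like, so the pattern must be a bytes pattern
    wordlist = re.findall(rb'^(\w+)', mf, flags=re.M)
    # wordlist holds bytes objects; decode them if you need str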
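To see what the ^(\w+) pattern actually picks up, here is a sketch that applies it to an in-memory str instead of a memory-mapped file. The sample text is made up. Unlike str.split, the pattern only matches when a word character sits at the very start of the line, so lines beginning with whitespace or punctuation are skipped.

```python
import re

# Hypothetical sample text: two lines start with a word character, two do not
text = "alpha one\n  beta two\n-gamma three\ndelta\n"

# re.M makes ^ match at the start of every line, not just the start of the string
wordlist = re.findall(r'^(\w+)', text, flags=re.M)
print(wordlist)  # ['alpha', 'delta']
```

If leading whitespace should be tolerated, a pattern such as r'^\s*(\w+)' is one option, though \s can also match across blank lines, so check it against your data.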
Answer by A.Harish kumar
l = []
with open('task-1.txt', 'rt') as myfile:
    for x in myfile:
        l.append(x)
for i in l:
    print(i.split()[0])