Python: How to choose a random line from a text file
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/14924721/
How to choose a random line from a text file
Asked by Suit Boy Apps
I am trying to make a lottery program for my school (we have an economic system).
My program generates numbers and saves them to a text file. When I "pull" numbers out of my generator, I want it to ensure that there is a winner.
Q: How do I have Python select a random line out of my text file and give my output as that number?
Accepted answer by NPE
How do I have python select a random line out of my text file and give my output as that number?
Assuming the file is relatively small, the following is perhaps the easiest way to do it:
import random
with open('data.txt') as f:
    line = random.choice(f.readlines())
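Since the question asks for the output as a number, the same idea can be wrapped in a small function. This is only a sketch: the name `pick_number` and the assumption that the file is small and holds one integer per line are illustrative, not part of the answer above.

```python
import random


def pick_number(path):
    """Return one randomly chosen line of `path` as an int.

    Hypothetical helper: assumes a small file with one integer per line.
    """
    with open(path) as f:
        # Skip blank lines so int() never sees an empty string.
        lines = [line for line in f if line.strip()]
    return int(random.choice(lines))
```

Note that `int()` ignores the trailing newline, so no explicit `strip()` is needed before the conversion.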
Answer by Srdjan Grubor
Off the top of my head:
import random

def pick_winner():
    # Read every line into memory, then index a random one.
    with open("file.txt", "r") as f:
        lines = f.readlines()
    random_line_num = random.randrange(0, len(lines))
    return lines[random_line_num]
Answer by Fredrik Pihl
Another approach:
import random, fileinput
text = None
for line in fileinput.input('data.txt'):
    # Replace the current pick with probability 1/lineno.
    if random.randrange(fileinput.lineno()) == 0:
        text = line
print(text)
Distribution:
$ seq 1 10 > data.txt
# run for 100000 times
$ ./select.py > out.txt
$ wc -l out.txt
100000 out.txt
$ sort out.txt | uniq -c
10066 1
10004 10
10023 2
9979 3
9926 4
9936 5
9878 6
10023 7
10154 8
10011 9
I don't see any skewness, but perhaps the dataset is too small...
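The same distribution check can be sketched in pure Python, without the shell tools. The function names `reservoir_pick` and `tally` are made up for this illustration, and the lines are held in a list rather than read from data.txt.

```python
import random
from collections import Counter


def reservoir_pick(lines, rng=random):
    """Pick one item from an iterable, keeping item i with probability 1/i."""
    chosen = None
    for lineno, line in enumerate(lines, start=1):
        if rng.randrange(lineno) == 0:
            chosen = line
    return chosen


def tally(lines, runs, seed=0):
    """Count how often each line wins over `runs` independent picks."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return Counter(reservoir_pick(lines, rng) for _ in range(runs))


if __name__ == "__main__":
    counts = tally([str(n) for n in range(1, 11)], runs=100000)
    for value, count in sorted(counts.items(), key=lambda kv: int(kv[0])):
        print(value, count)
```

With ten lines, each count should land near one tenth of the number of runs if the selection is uniform.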
Answer by chepner
With a slight modification to your input file (store the number of items in the first line), you can choose a number uniformly without having to read the entire file into memory first.
import random

def choose_number(fname):
    with open(fname, "r") as f:
        # The first line holds the number of items that follow.
        count = int(f.readline().strip())
        for line in f:
            # Select the current line with probability 1/count.
            if not random.randrange(0, count):
                return int(line.strip())
            count -= 1
Say you have 100 numbers. The probability of choosing the first number is 1/100. The probability of choosing the second number is (99/100)(1/99) = 1/100. The probability of choosing the third number is (99/100)(98/99)(1/98) = 1/100. I'll skip the formal proof, but the probability of choosing any one of the 100 numbers is 1/100.
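The argument generalizes to any position. As a rough check rather than a formal proof, the sketch below multiplies out the exact probabilities with `fractions.Fraction`. The function name `selection_probability` is made up for this illustration.

```python
from fractions import Fraction


def selection_probability(k, n):
    """Exact probability that the k-th of n lines (1-based) is returned.

    Mirrors the countdown scheme: lines 1..k-1 must each be skipped while
    the remaining count shrinks from n, then line k is taken with
    probability 1/(n - k + 1).
    """
    p = Fraction(1)
    for remaining in range(n, n - k + 1, -1):
        p *= Fraction(remaining - 1, remaining)  # skip a line
    return p * Fraction(1, n - k + 1)            # take line k


if __name__ == "__main__":
    print(selection_probability(3, 100))
```

Each skipped line cancels the denominator left by the previous factor, which is why the product collapses to 1/n for every position.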
It's not strictly necessary to store the count in the first line, but it saves you the trouble of having to read the entire file just to count the lines. Either way, you don't need to store the entire file in memory to choose any single line with equal probability.
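If the count is not stored in the file, one option is to read the file twice: once to count the lines and once to fetch the chosen one. This is a minimal sketch under that assumption. `choose_line_two_pass` is a hypothetical name, and neither pass holds more than one line in memory.

```python
import random


def choose_line_two_pass(fname):
    """Pick a line uniformly at random using two sequential passes."""
    # Pass 1: count the lines without storing them.
    with open(fname) as f:
        count = sum(1 for _ in f)
    if count == 0:
        raise ValueError("file has no lines")
    target = random.randrange(count)  # 0-based index of the winning line
    # Pass 2: walk forward again and stop at the chosen index.
    with open(fname) as f:
        for index, line in enumerate(f):
            if index == target:
                return line.rstrip("\n")
```

The trade-off is extra I/O in exchange for not having to change the file format.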
Answer by Ali-Akber Saifee
If the file is very large, you could seek to a random location in the file (given the file size) and then get the next full line:
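import os, random
def get_random_line(file_name):
    total_bytes = os.stat(file_name).st_size
    # randrange excludes total_bytes, so we never seek to end-of-file.
    random_point = random.randrange(total_bytes)
    # Binary mode makes seeking to an arbitrary byte offset well defined.
    with open(file_name, "rb") as file:
        file.seek(random_point)
        file.readline()  # skip this line to clear the partial line
        line = file.readline()
        if not line:
            # We landed in the last line; wrap around to the first one.
            file.seek(0)
            line = file.readline()
    return line.decode()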
Answer by Suit Boy Apps
I saw a Python tutorial and found this snippet:
import random

def randomLine(filename):
    # Retrieve a random line from a file, reading through the file once
    fh = open(filename, "r")
    lineNum = 0
    it = ''
    while 1:
        aLine = fh.readline()
        lineNum = lineNum + 1
        if aLine != "":
            # How likely is it that this is the last line of the file?
            if random.uniform(0, lineNum) < 1:
                it = aLine
        else:
            break
    fh.close()
    nmsg = it
    return nmsg

# Intended usage, e.g.: pull = randomLine("KEEP-IMPORANT.txt")
Answer by iankit
import random

def random_line(filename):
    line_num = 0
    selected_line = ''
    with open(filename) as f:
        while 1:
            line = f.readline()
            if not line: break
            line_num += 1
            # Replace the current pick with probability 1/line_num.
            if random.uniform(0, line_num) < 1:
                selected_line = line
    return selected_line.strip()
Most of the approaches given here would work, but they tend to load the whole file into memory at once. This approach does not, so it works even when the file is big.
The approach is not very intuitive at first glance. The result behind it is that after N lines have been read, each of them has a probability of exactly 1/N of being the currently selected line.
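The 1/N claim follows by induction on the number of lines read. A short derivation, writing $P_N(i)$ (notation introduced here, not taken from the answer) for the probability that line $i$ is the current selection after $N$ lines:

```latex
\begin{align*}
P_1(1) &= 1, \\
P_N(N) &= \frac{1}{N}
  && \text{(line $N$ replaces the selection with probability $1/N$)}, \\
P_N(i) &= P_{N-1}(i)\left(1 - \frac{1}{N}\right)
        = \frac{1}{N-1}\cdot\frac{N-1}{N}
        = \frac{1}{N}
  && \text{for } i < N .
\end{align*}
```

The last line uses the induction hypothesis $P_{N-1}(i) = 1/(N-1)$: an earlier line stays selected only if line $N$ does not replace it.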

