Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, link to the original, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/14505898/
How to make a dictionary from a text file with Python
Asked by user2007220
My file looks like this:
aaien 12 13 39
aan 10
aanbad 12 13 14 57 58 38
aanbaden 12 13 14 57 58 38
aanbeden 12 13 14 57 58 38
aanbid 12 13 14 57 58 39
aanbidden 12 13 14 57 58 39
aanbidt 12 13 14 57 58 39
aanblik 27 28
aanbreken 39
...
I want to make a dictionary with key = the word (like 'aaien') and the value should be a list of the numbers that are next to it. So it has to look this way: {'aaien': ['12, 13, 39'], 'aan': ['10']}
This code doesn't seem to work.
document = open('LIWC_words.txt', 'r')
liwcwords = document.read()
dictliwc = {}
for line in liwcwords:
    k, v = line.strip().split(' ')
    answer[k.strip()] = v.strip()
liwcwords.close()
python gives this error:
ValueError: need more than 1 value to unpack
Accepted answer by Martijn Pieters
You are splitting your line into a list of words, but only giving it one key and value.
This will work:
with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        line = line.split()
        if not line:  # empty line?
            continue
        answer[line[0]] = line[1:]
Note that you don't need to give .split() an argument; without arguments it will split on any run of whitespace and drop leading and trailing whitespace for you. That saves you from having to call .strip() explicitly.
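As a small illustration of that point, the sketch below compares .split() with .split(' ') on a made-up line. The sample string is an assumption, padded with a doubled space on purpose.

```python
# Hypothetical sample line with a doubled space and a trailing newline.
line = "aaien  12 13 39\n"

# No argument: runs of whitespace count as one separator, and
# leading/trailing whitespace is dropped.
no_arg = line.split()

# Explicit single space: every space is a separator, so the doubled
# space yields an empty string and the newline stays attached.
single_space = line.split(' ')

print(no_arg)
print(single_space)
```

The empty string and the trailing newline in the second result are what the explicit .strip() calls were trying to work around.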
The alternative is to split only on the first whitespace:
with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        if line.strip():  # non-empty line?
            key, value = line.split(None, 1)  # None means 'all whitespace', the default
            answer[key] = value.split()
The second argument to .split() limits the number of splits made, guaranteeing that at most 2 elements are returned, which makes it possible to unpack the result in the assignment to key and value.
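A quick sketch of how the split limit behaves, using sample lines in the format shown in the question (the variable names numbers and lonely are made up for this illustration):

```python
# Sample line in the format from the question.
line = "aanbad 12 13 14 57 58 38\n"

# maxsplit=1: split once, on the first run of whitespace.
key, value = line.split(None, 1)
numbers = value.split()  # a second split breaks up the remainder

# A line holding only a word yields a single element, so two-name
# unpacking would raise ValueError for such a line.
lonely = "aan\n".split(None, 1)

print(key, numbers)
print(lonely)
```

"At most 2" also means a line with a word and no numbers returns just one element, so this variant presumably needs every non-empty line to carry at least one number.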
Either method results in:
{'aaien': ['12', '13', '39'],
'aan': ['10'],
'aanbad': ['12', '13', '14', '57', '58', '38'],
'aanbaden': ['12', '13', '14', '57', '58', '38'],
'aanbeden': ['12', '13', '14', '57', '58', '38'],
'aanbid': ['12', '13', '14', '57', '58', '39'],
'aanbidden': ['12', '13', '14', '57', '58', '39'],
'aanbidt': ['12', '13', '14', '57', '58', '39'],
'aanblik': ['27', '28'],
'aanbreken': ['39']}
If you still see only one key, with the rest of the file as its (split) value, your input file is perhaps using a non-standard line separator. Open the file with universal line ending support by adding the U character to the mode. (This applies to Python 2; on Python 3, universal newline handling is already the default in text mode, and the 'U' flag was removed in Python 3.11.)
with open('LIWC_words.txt', 'rU') as document:
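For Python 3, a minimal sketch of the same idea, assuming a throwaway file written with bare '\r' line endings. The temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Write a small sample file that uses bare '\r' as its line separator.
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b"aaien 12 13 39\raan 10\r")
    path = tmp.name

answer = {}
# newline=None (the default) enables universal newlines in text mode,
# so '\r', '\r\n' and '\n' all end a line.
with open(path, 'r', newline=None) as document:
    for line in document:
        parts = line.split()
        if parts:
            answer[parts[0]] = parts[1:]

os.remove(path)  # clean up the throwaway file
print(answer)
```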
Answer by binish
>liwcwords = document.read()
>dictliwc = {}
>for line in liwcwords:
You are iterating over a string here, which is not what you want. Try document.readlines(). Here is another solution.
from pprint import pprint

with open('LIWC_words.txt') as fd:
    d = {}
    for i in fd:
        entry = i.split()
        if entry: d.update({entry[0]: entry[1:]})

pprint(d)
Here is what the output looks like:
{'aaien': ['12', '13', '39'],
'aan': ['10'],
'aanbad': ['12', '13', '14', '57', '58', '38'],
'aanbaden': ['12', '13', '14', '57', '58', '38'],
'aanbeden': ['12', '13', '14', '57', '58', '38'],
'aanbid': ['12', '13', '14', '57', '58', '39'],
'aanbidden': ['12', '13', '14', '57', '58', '39'],
'aanbidt': ['12', '13', '14', '57', '58', '39'],
'aanblik': ['27', '28'],
'aanbreken': ['39']}
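To illustrate the point about iterating over a string, the sketch below uses an in-memory io.StringIO object as a stand-in for the opened file. That is an assumption made so the example runs without LIWC_words.txt.

```python
import io

text = "aaien 12 13 39\naan 10\n"

# read() returns one big string; looping over it yields single characters.
chars = list(io.StringIO(text).read())[:3]

# readlines() (or looping over the file object itself) yields whole lines.
lines = io.StringIO(text).readlines()

# Splitting a single character gives a one-element list, so unpacking
# it into two names fails, as in the question.
try:
    k, v = chars[0].strip().split(' ')
    error = ''
except ValueError as exc:
    error = str(exc)

print(chars)
print(lines)
print(error)
```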

