How to make a dictionary from a text file using Python

Question

Asked by user2007220

My file looks like this:

aaien 12 13 39
aan 10
aanbad 12 13 14 57 58 38
aanbaden 12 13 14 57 58 38
aanbeden 12 13 14 57 58 38
aanbid  12 13 14 57 58 39
aanbidden 12 13 14 57 58 39
aanbidt 12 13 14 57 58 39
aanblik 27 28
aanbreken 39
...

I want to make a dictionary with key = the word (like 'aaien') and the value should be a list of the numbers that are next to it. So it has to look this way: {'aaien': ['12, 13, 39'], 'aan': ['10']}

This code doesn't seem to work.

document = open('LIWC_words.txt', 'r')
liwcwords = document.read()
dictliwc = {}
for line in liwcwords:
    k, v = line.strip().split(' ')
    answer[k.strip()] = v.strip()

liwcwords.close()

Python gives this error:

ValueError: need more than 1 value to unpack

Answer 1

Accepted answer by Martijn Pieters

You are splitting each line into a list of words, but unpacking that list into only two names, a key and a value.
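
To see why the two-name unpacking fails, here is a minimal sketch using one line from the sample data. The variable names are only for illustration and are not part of the original code.

```python
line = "aaien 12 13 39\n"

parts = line.strip().split(' ')
print(parts)  # ['aaien', '12', '13', '39']

try:
    k, v = parts  # four items cannot be unpacked into two names
    unpack_ok = True
except ValueError as exc:
    unpack_ok = False
    print("unpacking failed:", exc)

# Taking the first item as the key and the rest as the value avoids the problem.
key, values = parts[0], parts[1:]
print(key, values)
```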

This will work:

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        line = line.split()
        if not line:  # empty line?
            continue
        answer[line[0]] = line[1:]

Note that you don't need to give .split() an argument; without arguments it splits on any run of whitespace and strips the results for you. That saves you from having to call .strip() explicitly.
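
A small sketch of the difference, using the sample line that has two spaces after the word. The variable names are made up for the demonstration.

```python
line = "aanbid  12 13 14 57 58 39\n"  # two spaces after the word, as in the sample file

with_space = line.split(' ')   # splits on every single space
without_arg = line.split()     # splits on runs of whitespace, drops the newline

print(with_space)   # ['aanbid', '', '12', '13', '14', '57', '58', '39\n']
print(without_arg)  # ['aanbid', '12', '13', '14', '57', '58', '39']
```

With an explicit ' ' separator, the doubled space produces an empty string and the trailing newline stays attached to the last item.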

The alternative is to split only on the first whitespace:

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        if line.strip():  # non-empty line?
            key, value = line.split(None, 1)  # None means 'all whitespace', the default
            answer[key] = value.split()

The second argument to .split() limits the number of splits made, guaranteeing that at most 2 elements are returned, which makes it possible to unpack the result in the assignment to key and value.
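
A short sketch of what the limited split returns. One caveat, which is an observation about str.split rather than something seen in the sample data: a line holding only a word and no numbers yields a single element, so unpacking into two names would raise ValueError. The helper name parse_line is hypothetical.

```python
def parse_line(line):
    """Return (key, values) for one line, or None for a blank line."""
    parts = line.split(None, 1)  # at most one split, so at most two elements
    if not parts:
        return None
    key = parts[0]
    # A line with a word but no numbers produces only one element.
    values = parts[1].split() if len(parts) == 2 else []
    return key, values

print("aanblik 27 28\n".split(None, 1))  # ['aanblik', '27 28\n']
print(parse_line("aanblik 27 28\n"))     # ('aanblik', ['27', '28'])
print(parse_line("aan\n"))               # ('aan', [])
print(parse_line("\n"))                  # None
```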

Either of the two loops above results in:

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}
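
Both versions keep the numbers as strings. If integers are wanted instead (an assumption; the question does not say so), a sketch along these lines converts them while parsing. io.StringIO stands in for the real file so the example is self-contained.

```python
import io

# Stand-in for open('LIWC_words.txt'); the contents are taken from the sample above.
document = io.StringIO("aaien 12 13 39\naan 10\n\naanblik 27 28\n")

answer = {}
for line in document:
    parts = line.split()
    if not parts:  # skip empty lines
        continue
    # int() raises ValueError if a field is not a whole number.
    answer[parts[0]] = [int(n) for n in parts[1:]]

print(answer)  # {'aaien': [12, 13, 39], 'aan': [10], 'aanblik': [27, 28]}
```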

If you still see only one key, with the rest of the file as its (split) value, your input file is perhaps using a non-standard line separator. In Python 2, open the file with universal line ending support by adding the U character to the mode (Python 3 text mode handles this by default, and the U flag was removed in Python 3.11):

with open('LIWC_words.txt', 'rU') as document:
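
For Python 3, a hedged sketch: text mode translates '\r', '\n', and '\r\n' line endings by default (newline=None), so no U flag is needed. The temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Write a small sample with old-style '\r' line endings (hypothetical test data).
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b"aaien 12 13 39\raan 10\r")
    path = tmp.name

try:
    # Default text mode (newline=None) enables universal newline handling.
    with open(path, 'r') as document:
        answer = {}
        for line in document:
            parts = line.split()
            if parts:
                answer[parts[0]] = parts[1:]
finally:
    os.remove(path)

print(answer)  # {'aaien': ['12', '13', '39'], 'aan': ['10']}
```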

Answer 2

Answer by binish

> liwcwords = document.read()
> dictliwc = {}
> for line in liwcwords:

You are iterating over a string here, which yields one character at a time and is not what you want. Try document.readlines(). Here is another solution.

from pprint import pprint
with open('LIWC_words.txt') as fd:
    d = {}
    for i in fd:
        entry = i.split()
        if entry:  # skip empty lines
            d[entry[0]] = entry[1:]

pprint(d)

Here is what the output looks like:

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}
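
To illustrate the point made at the start of this answer, here is a small sketch contrasting iteration over the string returned by read() with the list returned by readlines(). io.StringIO is used as a stand-in for the opened file, which is an assumption made to keep the example self-contained.

```python
import io

document = io.StringIO("aaien 12 13 39\naan 10\n")  # stand-in for the opened file

text = document.read()
first_items = list(text)[:3]  # iterating a string yields single characters
print(first_items)  # ['a', 'a', 'i']

document.seek(0)  # rewind so the content can be read again
lines = document.readlines()  # a list with one string per line
print(lines)  # ['aaien 12 13 39\n', 'aan 10\n']
```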
