ValueError: too many values to unpack when using itertuples() on a pandas DataFrame

Question

Asked by spaine

I am trying to convert a simple pandas dataframe into a nested JSON file based on the answer I found here: pandas groupby to nested json

My grouped dataframe looks like this:

                  firstname lastname  orgname         phone        mobile  email
teamname members                                                           
1        0            John      Doe     Anon  916-555-1234          none   [email protected] 
         1            Jane      Doe     Anon  916-555-4321  916-555-7890   [email protected]
2        0          Mickey    Moose  Moosers  916-555-0000  916-555-1111   [email protected]
         1           Minny    Moose  Moosers  916-555-2222          none   [email protected]

My code is:

data = pandas.read_excel(inputExcel, sheetname = 'Sheet1', encoding = 'utf8')
grouped = data.groupby(['teamname', 'members']).first()

results = defaultdict(lambda: defaultdict(dict))

for index, value in grouped.itertuples():
    for i, key in enumerate(index):
        if i ==0:
            nested = results[key]
        elif i == len(index) -1:
            nested[key] = value
        else:
            nested = nested[key]

print json.dumps(results, indent = 4)

I get the following error on the first "for" loop. What causes this error here, and what would it take to fix the code so that it outputs the nested JSON?

    for index, value in grouped.itertuples():
ValueError: too many values to unpack

Answer 1

Accepted answer by root

When using itertuples(), the index is included as part of the tuple, so for index, value in grouped.itertuples(): doesn't really make sense. In fact, itertuples() returns a namedtuple for each row, with Index being one of the field names.

Consider the following setup:

import pandas as pd

data = {'A': list('aabbc'), 'B': [0, 1, 0, 1, 0], 'C': list('vwxyz'), 'D': range(5, 10)}
df = pd.DataFrame(data).set_index(['A', 'B'])

Yielding the following DataFrame:

     C  D
A B      
a 0  v  5
  1  w  6
b 0  x  7
  1  y  8
c 0  z  9

Then printing each tuple in df.itertuples() yields:

Pandas(Index=('a', 0), C='v', D=5)
Pandas(Index=('a', 1), C='w', D=6)
Pandas(Index=('b', 0), C='x', D=7)
Pandas(Index=('b', 1), C='y', D=8)
Pandas(Index=('c', 0), C='z', D=9)

So, what you'll probably want to do is something like the code below, with value being replaced by t[1:]:

for t in grouped.itertuples():
    for i, key in enumerate(t.Index):
        ...

If you want to access components of the namedtuple, you can do so positionally or by name. So, in the case of your DataFrame, t[1] and t.firstname should be equivalent. Just remember that t[0] is the index, so your first column starts at 1.
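
To make the suggestion above concrete, here is a minimal sketch of the full loop, written for Python 3. The small inline DataFrame is a hypothetical stand-in for the question's spreadsheet (only two of its columns are reproduced), and storing each row as a dict of column name to value is an assumption about the desired JSON shape, not something the question specifies.

```python
import json

import pandas as pd

# Hypothetical stand-in for the question's Excel sheet (assumption: two columns only).
data = pd.DataFrame({
    "teamname": [1, 1, 2, 2],
    "members": [0, 1, 0, 1],
    "firstname": ["John", "Jane", "Mickey", "Minny"],
    "lastname": ["Doe", "Doe", "Moose", "Moose"],
})
grouped = data.groupby(["teamname", "members"]).first()

results = {}
for t in grouped.itertuples():
    # t.Index is the MultiIndex entry as a plain tuple, e.g. (1, 0).
    *outer_keys, last_key = t.Index
    nested = results
    for key in outer_keys:
        # Keys are converted to str because JSON object keys are strings.
        nested = nested.setdefault(str(key), {})
    # t[1:] holds the column values; pair them with the column names.
    nested[str(last_key)] = dict(zip(grouped.columns, t[1:]))

print(json.dumps(results, indent=4))
```

Because the loop walks every index level except the last, the same sketch should work for a MultiIndex with more than two levels.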

Answer 2

Answer by bravosierra99

As I understand itertuples, it returns a tuple whose first value is the index and whose remaining values are all of the columns. You only have for index, value in grouped.itertuples(), which means it is trying to unpack all of the columns into a single variable, and that won't work. The groupby probably comes into play as well, but the result should still contain all of the values, which means you still have too many columns being unpacked.
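
The unpacking failure described above can be reproduced with a short sketch, assuming Python 3 and the small DataFrame from the accepted answer. The star-unpacking form at the end is one possible way to keep a two-part loop target; it is an illustration, not something either answer proposes.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": list("aabbc"), "B": [0, 1, 0, 1, 0], "C": list("vwxyz"), "D": range(5, 10)}
).set_index(["A", "B"])

# Each row tuple has three fields (Index, C, D), so two targets are not enough.
error_message = None
try:
    for index, value in df.itertuples():
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Star-unpacking collects every column value into a list instead of failing.
rows = [(index, values) for index, *values in df.itertuples()]
print(rows[0])
```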
