ValueError: too many values to unpack when using itertuples() on pandas dataframe
Source: http://stackoverflow.com/questions/37819622/
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow and keep the same license.
Asked by spaine
I am trying to convert a simple pandas dataframe into a nested JSON file based on the answer I found here: pandas groupby to nested json
My grouped dataframe looks like this:
                 firstname lastname  orgname         phone        mobile              email
teamname members
1        0            John      Doe     Anon  916-555-1234          none  [email protected]
         1            Jane      Doe     Anon  916-555-4321  916-555-7890  [email protected]
2        0          Mickey    Moose  Moosers  916-555-0000  916-555-1111  [email protected]
         1           Minny    Moose  Moosers  916-555-2222          none  [email protected]
My code is:
data = pandas.read_excel(inputExcel, sheetname = 'Sheet1', encoding = 'utf8')
grouped = data.groupby(['teamname', 'members']).first()
results = defaultdict(lambda: defaultdict(dict))
for index, value in grouped.itertuples():
    for i, key in enumerate(index):
        if i == 0:
            nested = results[key]
        elif i == len(index) - 1:
            nested[key] = value
        else:
            nested = nested[key]
print json.dumps(results, indent = 4)
I get the following error on the first "for" loop. What causes this error in this circumstance and what would it take to fix it to output the nested json?
for index, value in grouped.itertuples():
ValueError: too many values to unpack
Accepted answer by root
When using itertuples(), the index is included as part of the tuple, so the for index, value in grouped.itertuples(): doesn't really make sense. In fact, itertuples() uses namedtuple with Index being one of the names.
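A minimal sketch of why the unpacking fails, written for Python 3 and using a small made-up frame (the column names and values are illustrative assumptions, not the asker's data). Each row tuple has one slot for the index plus one slot per column, so a two-name target cannot hold it:

```python
import pandas as pd

# Hypothetical frame with a two-level index and two ordinary columns.
df = pd.DataFrame(
    {"team": [1, 1], "member": [0, 1], "first": ["John", "Jane"], "last": ["Doe", "Doe"]}
).set_index(["team", "member"])

row = next(df.itertuples())
print(row._fields)  # field names of the namedtuple; 'Index' comes first
print(len(row))     # one slot for the index plus one per column

error = None
try:
    index, value = row  # three values, but only two names to receive them
except ValueError as exc:
    error = str(exc)
    print(error)
```

Note that the whole two-level index occupies a single slot (as a tuple), so the count is always one more than the number of columns.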
Consider the following setup:
import pandas as pd

data = {'A': list('aabbc'), 'B': [0, 1, 0, 1, 0], 'C': list('vwxyz'), 'D': range(5,10)}
df = pd.DataFrame(data).set_index(['A', 'B'])
Yielding the following DataFrame:
     C  D
A B
a 0  v  5
  1  w  6
b 0  x  7
  1  y  8
c 0  z  9
Then printing each tuple in df.itertuples() yields:
Pandas(Index=('a', 0), C='v', D=5)
Pandas(Index=('a', 1), C='w', D=6)
Pandas(Index=('b', 0), C='x', D=7)
Pandas(Index=('b', 1), C='y', D=8)
Pandas(Index=('c', 0), C='z', D=9)
So, what you'll probably want to do is something like the code below, with value being replaced by t[1:]:
for t in grouped.itertuples():
    for i, key in enumerate(t.Index):
        ...
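One way the elided loop body might be filled in is sketched below for Python 3. The input frame is a made-up stand-in for the asker's spreadsheet, and two choices are my own assumptions rather than part of the answer: keys are passed through str() (JSON object keys end up as strings anyway, and this sidesteps any NumPy integer types in the index), and t[1:] is converted to a list before being stored.

```python
import json
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for the asker's spreadsheet (values are made up).
data = pd.DataFrame({
    "teamname": [1, 1, 2, 2],
    "members": [0, 1, 0, 1],
    "firstname": ["John", "Jane", "Mickey", "Minny"],
    "lastname": ["Doe", "Doe", "Moose", "Moose"],
})
grouped = data.groupby(["teamname", "members"]).first()

results = defaultdict(lambda: defaultdict(dict))
for t in grouped.itertuples():
    value = list(t[1:])          # everything after the index slot
    for i, key in enumerate(t.Index):
        key = str(key)           # assumption: string keys, as JSON requires
        if i == 0:
            nested = results[key]
        elif i == len(t.Index) - 1:
            nested[key] = value
        else:
            nested = nested[key]

output = json.dumps(results, indent=4)
print(output)
```

The nesting logic is unchanged from the question; only the source of index and value differs, since both now come from the single tuple t.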
If you want to access components of the namedtuple, you can access things positionally, or by name. So, in the case of your DataFrame, t[1] and t.firstname should be equivalent. Just remember that t[0] is the index, so your first column starts at 1.
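A short sketch of positional versus named access, assuming Python 3 and a one-row frame whose values are invented for illustration:

```python
import pandas as pd

# Hypothetical frame; 'firstname' is the first column after the index.
df = pd.DataFrame(
    {"teamname": [1], "members": [0], "firstname": ["John"], "lastname": ["Doe"]}
).set_index(["teamname", "members"])

t = next(df.itertuples())
print(t[0])         # the index entry (a tuple here, since the index has two levels)
print(t[1])         # first column, by position
print(t.firstname)  # the same column, by name
```

Named access tends to be easier to read, while positional access still works when a column name is not a valid Python identifier.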
Answer by bravosierra99
As I understand itertuples, it will return a tuple with the first value being the index and the remaining values being all of the columns. You only have for index, value in grouped.itertuples(), which means it's trying to unpack the index plus all of the columns into just two variables, which won't work. The groupby probably comes into play as well, but the result should still contain all of the columns, which means you still have too many values being unpacked.
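To make the counting concrete, here is a sketch that reuses the A/B/C/D setup from the accepted answer. It assumes Python 3, where extended unpacking (index, *values) can absorb the remaining columns; the index=False argument to itertuples() is shown as an alternative when the index is not needed.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": list("aabbc"), "B": [0, 1, 0, 1, 0], "C": list("vwxyz"), "D": range(5, 10)}
).set_index(["A", "B"])

first = next(df.itertuples())
print(len(first), 1 + len(df.columns))  # tuple length is index slot plus column count

# Star-unpacking: the first slot is the index, the rest are collected into a list.
index, *values = first
print(index, values)

# Dropping the index slot leaves only the column values.
no_index = next(df.itertuples(index=False))
print(no_index)
```

For the nested-JSON use case the index is needed, so the star-unpacking form is probably the closer fit.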