pandas: How do I set a tuple value in a pandas DataFrame?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/46985607/
Time: 2020-09-14 04:42:35. Source: igfitidea

How do I set a tuple value in a pandas DataFrame?

Tags: python, pandas

Asked by user40780

What I want to do should be very simple. Essentially, I have a DataFrame, and I need to assign a tuple value to one of its columns.

For example:

import numpy as np
import pandas as pd
pd_tmp = pd.DataFrame(np.random.rand(3, 3))
pd_tmp["new_column"] = ("a", 2)  # raises the ValueError shown below

I just need a new column that holds the tuple value in every row. What should I do?

ValueError: Length of values does not match length of index

The code above raises this error.

Answer by Psidom

You can wrap the tuple in a list, repeated once per row:

import numpy as np
import pandas as pd
pd_tmp = pd.DataFrame(np.random.rand(3, 3))
# one copy of the tuple per row, so the lengths match
pd_tmp["new_column"] = [("a", 2)] * len(pd_tmp)

pd_tmp
#          0           1           2    new_column
#0  0.835350    0.338516    0.914184    (a, 2)
#1  0.007327    0.418952    0.741958    (a, 2)
#2  0.758607    0.464525    0.400847    (a, 2)
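
The contrast between the failing assignment and the list-wrapped one can be sketched as below. This is only an illustration: the frame `df` and its column `x` are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration
df = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
value = ("a", 2)

# A bare tuple is read as a sequence of two values for three rows,
# so the lengths do not match and pandas raises ValueError
try:
    df["new_column"] = value
    failed = False
except ValueError:
    failed = True

# Repeating the tuple once per row gives one whole tuple in each cell
df["new_column"] = [value] * len(df)
```

Note that if the tuple happened to have exactly as many elements as the frame has rows, the bare assignment would presumably succeed but spread the elements across the rows, which is not what the question asks for.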

Answer by Shihe Zhang

From the pandas documentation for Series:

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:

>>> s = pd.Series(data, index=index)

Here, data can be many different things:

  • a Python dict
  • an ndarray
  • a scalar value (like 5)

So a Series won't take a tuple directly as a single value; the tuple is treated as a sequence of values instead.
@Psidom's answer works by making the tuple an element of a list-like container (a list in that answer).

If you are asking how to set a single cell of a Series/DataFrame, that question has already been asked.
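
A small sketch of that behavior, under the assumption that the Series constructor treats a tuple like any other list-like input. The variable names are chosen for the example only.

```python
import pandas as pd

t = ("a", 2)

# Passed directly, the tuple is unpacked: one element per row
s_split = pd.Series(t)

# Wrapped in a list, each row holds the whole tuple as a Python object
s_whole = pd.Series([t] * 3)
```

Here `s_split` has two rows holding "a" and 2, while `s_whole` has three rows that each hold the tuple itself.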

Answer by piRSquared

You can use apply with a lambda that returns the tuple:

pd_tmp.assign(new_column=pd_tmp.apply(lambda x: ('a', 2), axis=1))

          0         1         2 new_column
0  0.373564  0.806956  0.106911     (a, 2)
1  0.332508  0.711735  0.230347     (a, 2)
2  0.516232  0.343266  0.813759     (a, 2)
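
One point worth spelling out is that assign returns a new DataFrame and leaves the original untouched, so the result has to be kept in a variable. A minimal self-contained sketch follows; the name `result` is an assumption made for the example.

```python
import numpy as np
import pandas as pd

pd_tmp = pd.DataFrame(np.random.rand(3, 3))

# apply with axis=1 calls the lambda once per row; the lambda ignores
# the row and always returns the same tuple
tuples = pd_tmp.apply(lambda row: ("a", 2), axis=1)

# assign builds a new frame with the extra column; pd_tmp is unchanged
result = pd_tmp.assign(new_column=tuples)
```

If the original frame should be modified in place, a plain column assignment with the same Series would do that instead.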

Answer by Stefano Paoli

I was looking for something similar, but in my case I wanted the tuple to be a combination of existing columns, not just a fixed value. I found the solution below, and I am sharing it in the hope that it will be useful to others like me.

In [24]: df
Out[24]:
      A     B
0     1     2
1    11    22
2   111   222
3  1111  2222

In [25]: df['D'] = df[['A','B']].apply(tuple, axis=1)

In [26]: df
Out[26]:
      A     B             D
0     1     2        (1, 2)
1    11    22      (11, 22)
2   111   222    (111, 222)
3  1111  2222  (1111, 2222)
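
As a possible alternative that is not part of the answer above, the same column can be built with Python's built-in zip, which pairs the two columns element by element. The sketch below rebuilds the example frame and compares both approaches.

```python
import pandas as pd

# Same data as in the answer above
df = pd.DataFrame({"A": [1, 11, 111, 1111], "B": [2, 22, 222, 2222]})

# Approach from the answer: turn each row of the two columns into a tuple
via_apply = df[["A", "B"]].apply(tuple, axis=1)

# Alternative sketch: zip the columns and assign the list of tuples
df["D"] = list(zip(df["A"], df["B"]))
```

Both produce one tuple per row. The zip version avoids calling a Python function through apply for every row, which may matter on large frames, though that has not been measured here.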