pandas: Adding a list with a different length as a new column to a dataframe
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/51424453/
Asked by Jaffer Wilson
I want to add the values of a list to a dataframe as a new column. The length of the dataframe is 49, whereas the length of the list is 47. I get the following error when running the code.
print("Length of dataframe: ", datasetTest.open.count())
print("Length of array: ", len(test_pred_list))
datasetTest['predict_close'] = test_pred_list
The error is:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-105-68114a4e9a82> in <module>()
      5 # datasetTest = datasetTest.dropna()
      6 # print(datasetTest.count())
----> 7 datasetTest['predict_close'] = test_pred_list
      8 # test_shifted['color_predicted'] = test_shifted.apply(determinePredictedcolor, axis=1)
      9 # test_shifted['color_original'] =

c:\python35\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   2517         else:
   2518             # set column
-> 2519             self._set_item(key, value)
   2520
   2521     def _setitem_slice(self, key, value):

c:\python35\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   2583
   2584         self._ensure_valid_index(value)
-> 2585         value = self._sanitize_column(key, value)
   2586         NDFrame._set_item(self, key, value)
   2587

c:\python35\lib\site-packages\pandas\core\frame.py in _sanitize_column(self, key, value, broadcast)
   2758
   2759             # turn me into an ndarray
-> 2760             value = _sanitize_index(value, self.index, copy=False)
   2761             if not isinstance(value, (np.ndarray, Index)):
   2762                 if isinstance(value, list) and len(value) > 0:

c:\python35\lib\site-packages\pandas\core\series.py in _sanitize_index(data, index, copy)
   3119
   3120     if len(data) != len(index):
-> 3121         raise ValueError('Length of values does not match length of ' 'index')
   3122
   3123     if isinstance(data, PeriodIndex):

ValueError: Length of values does not match length of index
How can I get rid of this error? Please help me.
Accepted answer by EdChum
If you convert the list to a Series, the assignment will just work, because pandas aligns a Series on the index and fills the unmatched rows with NaN:
datasetTest.loc[:,'predict_close'] = pd.Series(test_pred_list)
Example:
In[121]:
df = pd.DataFrame({'a':np.arange(3)})
df
Out[121]:
   a
0  0
1  1
2  2
In[122]:
df.loc[:,'b'] = pd.Series(['a','b'])
df
Out[122]:
   a    b
0  0    a
1  1    b
2  2  NaN
The docs refer to this as setting with enlargement, which talks about adding or expanding, but it also works where the length is less than that of the pre-existing index.
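To make the difference concrete, here is a minimal sketch (the frame and values are illustrative, not the asker's data, and it uses plain column assignment rather than loc). It shows that a plain list of the wrong length raises, while a Series is aligned on index labels and leaves NaN where no label matches.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3)})   # index labels 0, 1, 2
values = ['a', 'b']                      # one element short

# A plain list must match the index length exactly, otherwise pandas raises.
try:
    df['b'] = values
    raised = False
except ValueError:
    raised = True

# A Series is aligned on index labels; label 2 has no match and becomes NaN.
df['b'] = pd.Series(values)
print(raised)
print(df)
```

Because the alignment is by label and not by position, this only fills the first rows when the frame's index starts at 0, like the default index of the new Series.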
To handle the case where the index doesn't start at 0, or is in fact not an int:
In[126]:
df = pd.DataFrame({'a':np.arange(3)}, index=np.arange(3,6))
df
Out[126]:
   a
3  0
4  1
5  2
In[127]:
s = pd.Series(['a','b'])
s.index = df.index[:len(s)]
s
Out[127]:
3    a
4    b
dtype: object
In[128]:
df.loc[:,'b'] = s
df
Out[128]:
   a    b
3  0    a
4  1    b
5  2  NaN
You can optionally replace the NaN values afterwards by calling fillna.
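As a short sketch of that last step (the placeholder string 'missing' is an arbitrary choice for illustration), fillna can be applied to the new column once it has been assigned:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3)})
df['b'] = pd.Series(['a', 'b'])          # row 2 is NaN after alignment

# Replace the missing entries with a placeholder of your choice.
df['b'] = df['b'].fillna('missing')
print(df['b'].tolist())
```

For a numeric column such as predictions, a numeric filler or leaving the NaN in place is usually the more sensible option.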
Answer by jpp
You can pad your list with an arbitrary filler scalar so that its length matches the dataframe.
Data from @EdChum.
filler = 0
lst = ['a', 'b']
df.loc[:, 'b'] = lst + [filler]*(len(df.index) - len(lst))
print(df)
   a  b
0  0  a
1  1  b
2  2  0
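The same padding idea can be wrapped in a small helper. This is only a sketch: pad_to_length is a hypothetical name, not a pandas function, and the NaN default and the error on over-long input are assumptions about what one would want. Since the result is a plain list, the assignment is positional and does not depend on the index labels.

```python
import numpy as np
import pandas as pd

def pad_to_length(values, length, filler=np.nan):
    """Return a new list of exactly `length` items, padded with `filler`.

    Hypothetical helper for illustration; raises if `values` is too long.
    """
    values = list(values)
    if len(values) > length:
        raise ValueError("values is longer than the target length")
    return values + [filler] * (length - len(values))

# Works with a non-default index too, because a list is assigned by position.
df = pd.DataFrame({'a': np.arange(3)}, index=np.arange(3, 6))
df['b'] = pad_to_length([1.5, 2.5], len(df))
print(df)
```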
Answer by YOBEN_S
You can still assign it by using loc (data from Ed):
l = ['a','b']
df.loc[range(len(l)),'b'] = l
df
Out[546]:
   a    b
0  0    a
1  1    b
2  2  NaN
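Note that range(len(l)) selects by label, so it relies on the index being 0, 1, 2, and so on. A hedged variant for other indexes, assuming the index labels are unique, is to take the first len(l) labels from the frame itself (the string index below is made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3)}, index=['x', 'y', 'z'])
l = ['a', 'b']

# Use the frame's own first len(l) labels instead of assuming 0..n-1.
df.loc[df.index[:len(l)], 'b'] = l
print(df)
```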