Source: http://stackoverflow.com/questions/54691385/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Pandas concat dataframes with different columns: AttributeError: 'NoneType' object has no attribute 'is_extension'
Asked by AaronDT
I am trying to concatenate two dataframes that have different column names along axis 0. I found a similar question, "How to use join_axes in the column-wise axis concatenation using pandas DataFrame?", but that solution does not work for me because the column names of my two dataframes are not the same. Since my original data is too large to post here, the following example should illustrate what I am trying to do:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randint(0, 100, size=(1, 4)), columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randint(0, 100, size=(1, 4)), columns=list('EFGH'))

# df1
#     A   B  C   D
# 0  26  39  7  44

# df2
#     E   F   G   H
# 0  12  44  26  64

pd.concat([df1, df2], axis=0).reset_index(drop=True)

# desired output looks like this
#       A     B    C     D     E     F     G     H
# 0  26.0  39.0  7.0  44.0   NaN   NaN   NaN   NaN
# 1   NaN   NaN  NaN   NaN  12.0  44.0  26.0  64.0
The above code works perfectly. However, once I substitute my own dataframes for df1 and df2, using exactly the same syntax, I get an error.
# my real dfs are called data1 & data2, I tried setting ignore_index=True and ignore_index=False
pd.concat([data1, data2],axis=0, ignore_index=True)
This results in the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-194-dbee1fd0bdea> in <module>
----> 1 pd.concat([data1, data2],axis=0, ignore_index=True)
~\AppData\Local\Continuum\anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
224 verify_integrity=verify_integrity,
225 copy=copy, sort=sort)
--> 226 return op.get_result()
227
228
~\AppData\Local\Continuum\anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\reshape\concat.py in get_result(self)
421 new_data = concatenate_block_managers(
422 mgrs_indexers, self.new_axes, concat_axis=self.axis,
--> 423 copy=self.copy)
424 if not self.copy:
425 new_data._consolidate_inplace()
~\AppData\Local\Continuum\anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\internals.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
5414 values = values.view()
5415 b = b.make_block_same_class(values, placement=placement)
-> 5416 elif is_uniform_join_units(join_units):
5417 b = join_units[0].block.concat_same_type(
5418 [ju.block for ju in join_units], placement=placement)
~\AppData\Local\Continuum\anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\internals.py in is_uniform_join_units(join_units)
5438 # no blocks that would get missing values (can lead to type upcasts)
5439 # unless we're an extension dtype.
-> 5440 all(not ju.is_na or ju.block.is_extension for ju in join_units) and
5441 # no blocks with indexers (as then the dimensions do not fit)
5442 all(not ju.indexers for ju in join_units) and
~\AppData\Local\Continuum\anaconda3\envs\tensorflow-gpu\lib\site-packages\pandas\core\internals.py in <genexpr>(.0)
5438 # no blocks that would get missing values (can lead to type upcasts)
5439 # unless we're an extension dtype.
-> 5440 all(not ju.is_na or ju.block.is_extension for ju in join_units) and
5441 # no blocks with indexers (as then the dimensions do not fit)
5442 all(not ju.indexers for ju in join_units) and
AttributeError: 'NoneType' object has no attribute 'is_extension'
I do not quite understand what this error message is trying to tell me. I tried calling fillna on both dataframes so that no 'NoneType' values should remain:
data2 = data2.fillna(999)
data1 = data1.fillna(999)
However, I still get the same error message.
The two dataframes I am using are quite large, so unfortunately I can't post the entire example here. They contain only integers, floats and strings, so nothing unusual stands out as a likely cause of the error. Any idea what might cause this error, or what I could check to narrow down the problem?
Thank you very much!
Answered by AaronDT
It turns out the problem was simply duplicate column names in one of my dataframes. Getting rid of those duplicates solved the problem. The above code now works flawlessly.
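As a rough illustration of the fix described above, the sketch below shows one way to find and drop duplicate column names before concatenating. The small data1 and data2 frames are hypothetical stand-ins for the real data, which was never posted, and keeping only the first occurrence of each label is an assumption about what "getting rid of those duplicates" meant; renaming the repeated columns would be an alternative when their data must be preserved.

```python
import pandas as pd

# Hypothetical stand-ins for the real dataframes; data1 repeats the label 'A'.
data1 = pd.DataFrame([[1, 2, 3]], columns=['A', 'B', 'A'])
data2 = pd.DataFrame([[4, 5]], columns=['E', 'F'])

# Boolean mask that is True for the second and later occurrences of a label.
dup_mask = data1.columns.duplicated()
duplicate_names = data1.columns[dup_mask].tolist()
print("Duplicate column names in data1:", duplicate_names)

# Keep only the first occurrence of each label (this discards the repeated columns).
data1_clean = data1.loc[:, ~dup_mask]

# With unique column names, the original concat call can be used unchanged.
result = pd.concat([data1_clean, data2], axis=0, ignore_index=True)
print(result)
```

Running the same duplicated() check on both dataframes is a quick way to narrow down which one carries the repeated labels.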