Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not to this page). Original question: http://stackoverflow.com/questions/46357700/

Pandas dataframe.append giving Error: Plan shapes are not aligned

Tags: python, python-2.7, pandas, sklearn-pandas

Asked by sukhwant prafullit

I have two data frames with the columns listed below. When I try to append the second one to the first, I get the error "ValueError: Plan shapes are not aligned".

df1 columns:

Index([                    u'asin',        u'view_publish_data',
                u'data_viewer',      u'relationship_viewer',
             u'parent_task_id',            u'submission_id',
                     u'source',            u'creation_date',
                 u'created_by',              u'vendor_code',
                       u'week',                u'processor',
                 u'brand_name',           u'brand_name_new',
               u'bullet_point',               u'cost_price',
          u'country_of_origin',                 u'cpu_type',
               u'cpu_type_new',                u'item_name',
          u'item_type_keyword',               u'list_price',
     u'minimum_order_quantity',                    u'model',
           u'product_category', u'product_site_launch_date',
        u'product_subcategory',          u'product_tier_id',
     u'replenishment_category',      u'product_description',
                 u'style_name',                       u'vc',
                u'vendor_code',     u'warranty_description'],
  dtype='object')

df2 columns:

Index([                         u'asin',             u'view_publish_data',
                     u'data_viewer',           u'relationship_viewer',
                  u'parent_task_id',                 u'submission_id',
                          u'source',                 u'creation_date',
                      u'created_by',                   u'vendor_code',
                            u'week',                    u'brand_name',
                 u'bullet_features',                    u'color_name',
                             u'itk',                     u'item_name',
                      u'list_price',                     u'new_brand',
                u'product_catagory',          u'product_sub_catagory',
                 u'product_tier_id',        u'replenishment_category',
                       u'size_name',                    u'cost_price',
               u'item_type_keyword',                     u'our_price',
          u'is_shipped_from_vendor',      u'manufacturer_vendor_code',
             u'product_description',                  u'vendor_code'],
  dtype='object')

Answered by jezrael

You can use concat with align, which returns a tuple of aligned DataFrames:

import pandas as pd

cols1 = pd.Index([ u'asin', u'view_publish_data',
                u'data_viewer',      u'relationship_viewer',
             u'parent_task_id',            u'submission_id',
                     u'source',            u'creation_date',
                 u'created_by',              u'vendor_code',
                       u'week',                u'processor',
                 u'brand_name',           u'brand_name_new',
               u'bullet_point',               u'cost_price',
          u'country_of_origin',                 u'cpu_type',
               u'cpu_type_new',                u'item_name',
          u'item_type_keyword',               u'list_price',
     u'minimum_order_quantity',                    u'model',
           u'product_category', u'product_site_launch_date',
        u'product_subcategory',          u'product_tier_id',
     u'replenishment_category',      u'product_description',
                 u'style_name',                       u'vc',
                u'vendor_code',     u'warranty_description'])

cols2 = pd.Index([ u'asin', u'view_publish_data',
                     u'data_viewer',           u'relationship_viewer',
                  u'parent_task_id',                 u'submission_id',
                          u'source',                 u'creation_date',
                      u'created_by',                   u'vendor_code',
                            u'week',                    u'brand_name',
                 u'bullet_features',                    u'color_name',
                             u'itk',                     u'item_name',
                      u'list_price',                     u'new_brand',
                u'product_catagory',          u'product_sub_catagory',
                 u'product_tier_id',        u'replenishment_category',
                       u'size_name',                    u'cost_price',
               u'item_type_keyword',                     u'our_price',
          u'is_shipped_from_vendor',      u'manufacturer_vendor_code',
             u'product_description',                  u'vendor_code'])


df1 = pd.DataFrame([range(len(cols1))], columns=cols1)
df2 = pd.DataFrame([range(len(cols2))], columns=cols2)

df = pd.concat(list(df1.align(df2)), ignore_index=True)
print(df)

   asin  brand_name  brand_name_new  bullet_features  bullet_point  \
0     0          12            13.0              NaN          14.0   
1     0          11             NaN             12.0           NaN   

   color_name  cost_price  country_of_origin  cpu_type  cpu_type_new  ...   \
0         NaN          15               16.0      17.0          18.0  ...    
1        13.0          23                NaN       NaN           NaN  ...    

   style_name  submission_id    vc  vendor_code  vendor_code  vendor_code  \
0        30.0              5  31.0            9            9           32   
1         NaN              5   NaN            9           29            9   

   vendor_code  view_publish_data  warranty_description  week  
0           32                  1                  33.0    10  
1           29                  1                   NaN    10  

[2 rows x 46 columns]
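
A note on the likely trigger, offered as an assumption rather than something the answer states: in the column lists above, vendor_code appears twice in both df1 and df2, and duplicate column labels are a commonly reported cause of this ValueError when the frames being combined have different columns. Below is a minimal sketch that finds the repeated labels and keeps only the first occurrence of each before concatenating. The two small frames, their values, and the helper name drop_duplicate_columns are hypothetical and made up for illustration.

```python
import pandas as pd

# Hypothetical miniature frames; 'vendor_code' is repeated, as in the question.
df1 = pd.DataFrame([[1, 2, 3, 4]],
                   columns=["asin", "vendor_code", "model", "vendor_code"])
df2 = pd.DataFrame([[5, 6, 7, 8]],
                   columns=["asin", "vendor_code", "color_name", "vendor_code"])

# Diagnose: list the labels that occur more than once.
dupes = df1.columns[df1.columns.duplicated()].tolist()
print(dupes)


def drop_duplicate_columns(frame):
    """Keep only the first occurrence of each column label (illustrative helper)."""
    return frame.loc[:, ~frame.columns.duplicated()]


clean1 = drop_duplicate_columns(df1)
clean2 = drop_duplicate_columns(df2)

# With unique labels, concat takes the union of the columns
# and fills the cells missing from either frame with NaN.
result = pd.concat([clean1, clean2], ignore_index=True, sort=False)
print(result)
```

Dropping the later copies discards data, so this only makes sense if the repeated columns really are redundant. If they carry different values, renaming them to unique labels first would be the safer route.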