How to fix AttributeError: 'DataFrame' object has no attribute 'assign' without updating Pandas?

Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/44305253/

pandas, merge, concat

Asked by user1703276

I am trying to merge multiple files on a key ('r_id') and rename the columns in the output after the file names. Everything works except renaming the output columns with the file names. I get the following error, which is probably caused by an old version of Pandas. Does anyone know how to fix this without updating pandas to a newer version?

Error

    Traceback (most recent call last):
      File "multijoin_2.py", line 19, in <module>
        result = merge_files(files).reset_index()
      File "multijoin_2.py", line 11, in merge_files
        pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'])
      File "/users/xxx/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 2007, in __getattr__
        (type(self).__name__, name))
    AttributeError: 'DataFrame' object has no attribute 'assign'
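
For context, `DataFrame.assign` was added in pandas 0.16.0, so an older installation raises this error as soon as the method is called. If a step like `.assign(col=0)` is really needed, it can be imitated with a copy and plain column assignment. The sketch below is only an illustration: `assign_compat` is a hypothetical helper name, not part of pandas, and the small frame is made-up sample data.

```python
import pandas as pd


def assign_compat(df, **columns):
    """Hypothetical helper: imitate df.assign(**columns) without calling it."""
    out = df.copy()  # assign returns a new frame, so work on a copy
    for name, value in columns.items():
        # Like assign, accept either a plain value or a callable taking the frame
        out[name] = value(out) if callable(value) else value
    return out


# Made-up sample data shaped like the question's files
df = pd.DataFrame({'r_id': ['r1', 'r2'], 'exp': [20, 30]})
result = assign_compat(df, col=0)
print(result)
```

The original frame is left untouched, which matches how `assign` behaves in newer pandas versions.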

Input

$ cat test1

r_id       g_id exp
r1      g1      20
r2      g1      30
r3      g1      1
r4      g1      3

$ cat test2

r_id       gid exp
r1      g2      20
r2      g2      30
r3      g2      1
r4      g2      3

$ cat test3

r_id       g_id exp
r1      g3      30
r2      g3      40
r3      g3      11
r4      g3      32

Desired Output

  r_id  test3  test2  test1
0   r1     30     20     20
1   r2     40     30     30
2   r3     11      1      1
3   r4     32      3      3

Working code (except column naming)

import os
import glob
import pandas as pd

files = glob.glob(r'/path/test*')

def merge_files(files, **kwargs):
    dfs = []
    for f in files:
        dfs.append(
            pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'])
              #.assign(col=0)
              .rename(columns={'col_name':os.path.splitext(os.path.basename(f))[0]})
              .set_index(['r_id'])
        )
    return pd.concat(dfs, axis=1)


result = merge_files(files).reset_index()
print(result)

Accepted answer by jezrael

You need to use exp as the column name in rename:

def merge_files(files, **kwargs):
    dfs = []
    for f in files:
        dfs.append(
            pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'], index_col=['r_id'])
              .rename(columns={'exp':os.path.splitext(os.path.basename(f))[0]})
        )
    return pd.concat(dfs, axis=1)

result = merge_files(files).reset_index()
print(result)
  r_id  test1  test2  test3
0   r1     20     20     30
1   r2     30     30     40
2   r3      1      1     11
3   r4      3      3     32
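
To try the approach above without touching the file system, the same read, rename and concat steps can be run against in-memory copies of the sample data. This is a minimal sketch under some assumptions: it targets Python 3, the `samples` dictionary and the `merge_sources` name are made up for illustration, and the inputs are assumed to be tab-separated as `sep='\t'` suggests. The names are sorted because `glob.glob` returns files in arbitrary order, which would otherwise make the column order unpredictable.

```python
import io
import pandas as pd

# In-memory stand-ins for the files test1, test2 and test3 from the question
# (assumption: the real files are tab-separated).
samples = {
    'test1': "r_id\tg_id\texp\nr1\tg1\t20\nr2\tg1\t30\nr3\tg1\t1\nr4\tg1\t3\n",
    'test2': "r_id\tgid\texp\nr1\tg2\t20\nr2\tg2\t30\nr3\tg2\t1\nr4\tg2\t3\n",
    'test3': "r_id\tg_id\texp\nr1\tg3\t30\nr2\tg3\t40\nr3\tg3\t11\nr4\tg3\t32\n",
}


def merge_sources(sources):
    """Hypothetical variant of merge_files that reads from strings."""
    dfs = []
    for name in sorted(sources):  # sorted for a deterministic column order
        df = pd.read_csv(
            io.StringIO(sources[name]),
            sep='\t',
            usecols=['r_id', 'exp'],  # the differing g_id/gid column is ignored
            index_col='r_id',
        )
        # Rename the value column after its source; no assign needed
        dfs.append(df.rename(columns={'exp': name}))
    return pd.concat(dfs, axis=1)


result = merge_sources(samples).reset_index()
print(result)
```

With real files, the same sorting can be applied by iterating over `sorted(files)` inside `merge_files`.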