How to fix AttributeError: 'DataFrame' object has no attribute 'assign' without updating Pandas?

Question

Asked by user1703276

I am trying to merge multiple files on a key ('r_id') and rename the columns in the output after the file names. I was able to do everything except rename the output columns with the file names. I get the following error, probably caused by an old version of Pandas. Does anyone know how to fix this without updating Pandas to a newer version?

Error

    Traceback (most recent call last):
      File "multijoin_2.py", line 19, in <module>
        result = merge_files(files).reset_index()
      File "multijoin_2.py", line 11, in merge_files
        pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'])
      File "/users/xxx/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 2007, in __getattr__
        (type(self).__name__, name))
    AttributeError: 'DataFrame' object has no attribute 'assign'

Input

$ cat test1

r_id       g_id exp
r1      g1      20
r2      g1      30
r3      g1      1
r4      g1      3

$ cat test2

r_id       gid exp
r1      g2      20
r2      g2      30
r3      g2      1
r4      g2      3

$ cat test3

r_id       g_id exp
r1      g3      30
r2      g3      40
r3      g3      11
r4      g3      32

Desired Output

  r_id  test3  test2  test1
0   r1     30     20     20
1   r2     40     30     30
2   r3     11      1      1
3   r4     32      3      3

Working code (except column naming)

import os
import glob
import pandas as pd

files = glob.glob(r'/path/test*')

def merge_files(files, **kwargs):
    dfs = []
    for f in files:
        dfs.append(
            pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'])
              #.assign(col=0)
              .rename(columns={'col_name':os.path.splitext(os.path.basename(f))[0]})
              .set_index(['r_id'])
        )
    return pd.concat(dfs, axis=1)


result = merge_files(files).reset_index()
print(result)

Answer 1

Accepted answer by jezrael

You need to use 'exp' as the column name in rename, because that is the column that actually exists after reading the file:

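import os
import glob
import pandas as pd

files = glob.glob(r'/path/test*')

def merge_files(files, **kwargs):
    dfs = []
    for f in files:
        dfs.append(
            # index_col makes 'r_id' the index while reading, so set_index is not needed
            pd.read_csv(f, sep='\t', usecols=['r_id', 'exp'], index_col=['r_id'])
              # rename the existing 'exp' column to the file name without its extension
              .rename(columns={'exp':os.path.splitext(os.path.basename(f))[0]})
        )
    # align the frames on the shared 'r_id' index, one column per file
    return pd.concat(dfs, axis=1)

result = merge_files(files).reset_index()
print(result)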
  r_id  test1  test2  test3
0   r1     20     20     30
1   r2     30     30     40
2   r3      1      1     11
3   r4      3      3     32
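
For readers who do not have the original files, the same idea can be tried on in-memory data. This is only a minimal sketch and not part of the original answer: `SAMPLES`, `merge_frames` and `with_column` are hypothetical names, and `io.StringIO` stands in for the tab-separated files from the question. The `with_column` helper is one possible fallback for the commented-out `.assign(col=0)` on a pandas version that lacks `DataFrame.assign`. It copies the frame and sets the column by plain item assignment.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the tab-separated files test1..test3.
SAMPLES = {
    "test1": "r_id\tg_id\texp\nr1\tg1\t20\nr2\tg1\t30\nr3\tg1\t1\nr4\tg1\t3\n",
    "test2": "r_id\tg_id\texp\nr1\tg2\t20\nr2\tg2\t30\nr3\tg2\t1\nr4\tg2\t3\n",
    "test3": "r_id\tg_id\texp\nr1\tg3\t30\nr2\tg3\t40\nr3\tg3\t11\nr4\tg3\t32\n",
}


def merge_frames(samples):
    """Read each sample, rename 'exp' to the sample name, and join on 'r_id'."""
    dfs = []
    for name in sorted(samples):  # sorted so the column order is predictable
        df = pd.read_csv(
            io.StringIO(samples[name]),
            sep="\t",
            usecols=["r_id", "exp"],
            index_col="r_id",
        )
        # rename() replaces the assign() call, so no newer pandas API is needed
        dfs.append(df.rename(columns={"exp": name}))
    # concatenate column-wise, aligning rows on the shared 'r_id' index
    return pd.concat(dfs, axis=1)


def with_column(df, name, value):
    """Fallback for DataFrame.assign: copy the frame, then set one column."""
    out = df.copy()
    out[name] = value
    return out


result = merge_frames(SAMPLES).reset_index()
print(result)

# The original frame is left untouched; only the copy gains the new column.
print(with_column(result, "col", 0))
```

Sorting the sample names is a design choice for this sketch. With `glob.glob` the order of the files, and therefore of the output columns, is not guaranteed.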
