Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/41783176/

Date: 2020-09-14 02:50:35  Source: igfitidea

How do I use dictionary keys and values to rename columns in a pandas DataFrame?

Tags: python, pandas, dictionary, dataframe

Asked by ZacAttack

I am building functions to help me load data from the web. The problem I am trying to solve is that column names differ depending on the source. For example, Yahoo Finance column headings look like this: Open, High, Low, Close, Volume, Adj Close. Quandl.com has data sets with headings such as DATE, VALUE, date, value, etc. The mix of upper and lower case throws everything off, and Value and Adj Close mean the same thing for the most part. I want to map columns that have different names but the same meaning to a single name. For example, Adj Close and value would both become AC; Open, OPEN, and open would all become O.
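
One way to picture the goal described above is a case-insensitive alias table. The following is only a minimal sketch: the `ALIASES` table and the `canonical_name` helper are hypothetical names introduced for illustration, and the alias entries are taken from the examples in the question.

```python
import pandas as pd

# Hypothetical alias table: keys are lower-cased source headings,
# values are the short names the question wants to end up with.
ALIASES = {
    "open": "O",
    "adj close": "AC",
    "adj. close": "AC",
    "value": "AC",
}

def canonical_name(column):
    """Return the short name for a heading, ignoring case and outer spaces."""
    key = column.strip().lower()
    # Headings without an alias are returned unchanged.
    return ALIASES.get(key, column)

df = pd.DataFrame({"OPEN": [1.0], "Adj Close": [2.0], "Volume": [3]})
# DataFrame.rename also accepts a function for the columns argument.
df = df.rename(columns=canonical_name)
print(list(df.columns))  # ['O', 'AC', 'Volume']
```

Lower-casing the lookup key means Open, OPEN, and open need only one entry in the table.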

So I have a CSV file ("Functions//ColumnNameChanges.txt") that stores the dict() keys and values for the column names:

Date,D
Open,O
High,H

Then I wrote this function to populate my dictionary:

def DictKeyValuesFromText ():

    Dictionary = {}
    TextFileName = "Functions//ColumnNameChanges.txt"
    with open(TextFileName,'r') as f:
        for line in f:
            x = line.find(",")
            y = line.find("/")
            k = line[0:x]
            v = line[x+1:y]

            Dictionary[k] = v
    return Dictionary

This is the output of print(DictKeyValuesFromText())

{'': '', 'Date': 'D', 'High': 'H', 'Open': 'O'}
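
The `'': ''` entry in that output is what the function above produces for a blank line in the file, since both `find` calls return -1 there. A possible alternative, sketched here with Python's standard `csv` module, skips such lines. The name `load_column_map` is hypothetical, and an in-memory `io.StringIO` stands in for the real file so the example is self-contained.

```python
import csv
import io

def load_column_map(lines):
    """Build {old_name: new_name} from an iterable of 'old,new' lines."""
    mapping = {}
    for row in csv.reader(lines):
        # Skip blank or malformed lines instead of storing an empty key.
        if len(row) < 2 or not row[0].strip():
            continue
        mapping[row[0].strip()] = row[1].strip()
    return mapping

# Stand-in for the mapping file, including a trailing blank line.
sample = io.StringIO("Date,D\nOpen,O\nHigh,H\n\n")
column_map = load_column_map(sample)
print(column_map)  # {'Date': 'D', 'Open': 'O', 'High': 'H'}
```

With a real file, the same function could be called as `load_column_map(f)` inside a `with open(...) as f:` block.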

The next function is where my problem is:

def ChangeColumnNames(DataFrameFileLocation):
    x = DictKeyValuesFromText()
    df = pd.read_csv(DataFrameFileLocation)
    for y in df.columns:
        if y not in x.keys():
            i = input("The column " +  y +  " is not in the list, give a name:")
            df.rename(columns={y:i}) 
        else:
            df.rename(columns={y:x[y]})

    return df

df.rename is not working. This is the output I get from print(ChangeColumnNames("Tvix_data.csv")):

The column Low is not in the list, give a name:L
The column Close is not in the list, give a name:C
The column Volume is not in the list, give a name:V
The column Adj Close is not in the list, give a name:AC
            Date        Open        High         Low       Close    Volume  \
0     2010-11-30  106.269997  112.349997  104.389997  112.349997         0
1     2010-12-01   99.979997  100.689997   98.799998  100.689997         0
2     2010-12-02   98.309998   98.309998   86.499998   86.589998         0

The column names should be D, O, H, L, C, V. I am missing something; any help would be appreciated.

Answered by DeepSpace

df.rename works just fine, but it does not modify the DataFrame in place by default. Either re-assign its return value or use inplace=True. It expects a dictionary, passed as columns=, with old names as keys and new names as values.

df = df.rename(columns={'col_a': 'COL_A', 'col_b': 'COL_B'})

or

df.rename(columns={'col_a': 'COL_A', 'col_b': 'COL_B'}, inplace=True)
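
Applied to the function in the question, the re-assignment approach might look like the sketch below. This is an illustration under assumptions, not the asker's final code: `change_column_names` and its `ask` parameter are hypothetical names, and a small in-memory DataFrame replaces the CSV file so the example runs on its own.

```python
import pandas as pd

def change_column_names(df, name_map, ask=input):
    """Return a copy of df with columns renamed via name_map.

    Columns missing from name_map are resolved by calling ask(prompt).
    """
    new_names = {}
    for col in df.columns:
        if col in name_map:
            new_names[col] = name_map[col]
        else:
            new_names[col] = ask("The column " + col + " is not in the list, give a name:")
    # rename returns a new DataFrame, so its result has to be returned
    # (or re-assigned); calling it and discarding the result changes nothing.
    return df.rename(columns=new_names)

df = pd.DataFrame({"Date": ["2010-11-30"], "Open": [106.27], "Low": [104.39]})
name_map = {"Date": "D", "Open": "O", "High": "H"}

# A lambda stands in for interactive input so the example is repeatable.
renamed = change_column_names(df, name_map, ask=lambda prompt: "L")
print(list(renamed.columns))  # ['D', 'O', 'L']
print(list(df.columns))       # ['Date', 'Open', 'Low']
```

Collecting all new names first and calling rename once avoids building a new DataFrame for every column.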

Answered by Loochie

Well, since you already have the dictionary, store it in a variable, say:

DC = {'': '', 'Date': 'D', 'High': 'H', 'Open': 'O'}

DC can now be mapped to the DataFrame columns like this:

df.columns = df.columns.map(DC)

If you want to use the rename() method instead, you can simply go with:

df = df.rename(columns = DC)
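
The two approaches differ for columns that are missing from the dictionary, which matters for data like the question's, where Low, Close, and Volume have no entry. A small sketch with made-up sample values: mapping the column index yields NaN for a missing key, while rename leaves that column name unchanged.

```python
import pandas as pd

DC = {'Date': 'D', 'High': 'H', 'Open': 'O'}
df = pd.DataFrame({'Date': ['2010-11-30'], 'Open': [106.27], 'Low': [104.39]})

# Index.map looks every label up in DC; 'Low' has no entry and becomes NaN.
mapped = df.columns.map(DC)
print(mapped.isna().tolist())  # [False, False, True]

# rename only touches labels that appear in DC; 'Low' is kept as-is.
renamed = df.rename(columns=DC)
print(list(renamed.columns))  # ['D', 'O', 'Low']
```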