IPython Notebook and Pandas autocomplete

Question

Asked by metersk

I noticed that if I type df.column_name(), I can autocomplete column_name with Tab in the IPython notebook.

Now, the proper syntax for doing something to a column would be df['column_name'], where I am unable to autocomplete (I am assuming because it is a string?). Is there any other notation or way to simplify typing out column names? I am essentially looking for a solution that would allow me to tab-autocomplete the column name within df['column_name'].

Answer 1

Accepted answer by Maturin

I've found the following method useful. It creates a namedtuple whose fields contain the names of all the variables in the data frame as strings.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1': np.arange(5), 'variable_2': np.arange(5)})

The following code creates a namedtuple called "var":

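def ntuples():
    # Column names of the global data frame `df`.
    list_of_names = df.columns.values
    list_of_names_dict = {x: x for x in list_of_names}

    # Each field of the namedtuple holds its own name as a string.
    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)

var = ntuples()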

In a notebook, when I write var. and press Tab, the names of all the variables in the data frame df are displayed. Writing var.variable_1 is equivalent to writing 'variable_1', so the following works: df[var.variable_1].
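
As a quick check of that equivalence, here is a minimal sketch that rebuilds the same namedtuple inline and compares the two ways of selecting a column. The inline construction is an illustration only, not part of the original answer.

```python
from collections import namedtuple

import numpy as np
from pandas import DataFrame

df = DataFrame({'variable_1': np.arange(5), 'variable_2': np.arange(5)})

# Build a namedtuple whose fields hold the column names as strings.
names = list(df.columns)
Varnames = namedtuple('Varnames', names)
var = Varnames(*names)

# var.variable_1 is just the string 'variable_1' ...
same_key = var.variable_1 == 'variable_1'
# ... so both expressions select the same column.
same_column = df[var.variable_1].equals(df['variable_1'])
print(same_key, same_column)
```

One caveat: namedtuple field names must be valid Python identifiers, so a column name containing a space would make the namedtuple call raise a ValueError.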

The reason I define a function to do this is that you will often add new variables to a data frame. To pick up the new variables in your namedtuple "var", simply call the function again and reassign the result, var = ntuples(), and you are good to go.
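
The refresh step might look like the sketch below. It uses a hypothetical helper, column_names(frame), which is a variant of ntuples() that takes the data frame as an argument instead of reading a global. That name and signature are assumptions made for illustration.

```python
from collections import namedtuple

import numpy as np
from pandas import DataFrame


def column_names(frame):
    """Return a namedtuple whose fields are the column names of `frame`.

    Hypothetical helper: same idea as ntuples(), but the frame is a parameter.
    """
    names = [str(name) for name in frame.columns]
    Varnames = namedtuple('Varnames', names)
    return Varnames(*names)


df = DataFrame({'variable_1': np.arange(5), 'variable_2': np.arange(5)})
var = column_names(df)

# Add a column; the existing namedtuple does not know about it yet.
df['variable_3'] = df['variable_1'] * 2
stale = not hasattr(var, 'variable_3')

# Rebuild the namedtuple to pick up the new column.
var = column_names(df)
print(df[var.variable_3].tolist())  # [0, 2, 4, 6, 8]
```

Passing the frame explicitly also lets the same helper serve several data frames in one notebook.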

Answer 2

Answered by gobrewers14

I'm not sure how your data is set up, but when I import a csv/txt file, I specify the names of the columns in a list, such as:

names = ['col_1', 'col_2', 'col_3']

and so on, and then import my file like this:

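import pandas as pd
# header=0 assumes the file's first row is a header that `names` should replace;
# use header=None if the file has no header row (read_csv rejects header=True).
data = pd.read_csv('./some_file.txt', header=0, delimiter='\t', names=names)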

You could then use tab completion like this:

new_thing = data[names[1]]

where you would hit Tab as you start to type "names", and then all you have to do is specify which item of the list you want. I'm not sure whether this is any more efficient than simply typing out the column name.
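
For a version that runs without a file on disk, here is a minimal sketch that uses an in-memory buffer as a stand-in for './some_file.txt'. The sample rows are invented for illustration, and the sketch assumes the file has no header row (hence header=None).

```python
import io

import pandas as pd

names = ['col_1', 'col_2', 'col_3']

# Stand-in for './some_file.txt': tab-separated data without a header row.
raw = io.StringIO("1\ta\t0.5\n2\tb\t1.5\n")

data = pd.read_csv(raw, sep='\t', header=None, names=names)

# `names` tab-completes as a variable; the index picks the column.
new_thing = data[names[1]]
print(new_thing.tolist())  # ['a', 'b']
```

Note that Tab only completes the variable name here; you still have to remember which index corresponds to which column.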
