Python Pandas: NameError: name is not defined

Question

Asked by Jazzmine

Ok, this is my first Python Pandas program and I'm having a hard time figuring out what the column name is so I can reference it in a function call.

Below is my code. parseDeviceType is a function that parses a user agent string. But when I call it with what I think is the column name, I get an error saying the name is not defined:

df = pd.read_csv('user_agent_strings.txt',index_col=None, na_values=['NA'],sep=',')
dt=parseDeviceType(user_agent_string)
print df.columns

NameError: name 'user_agent_string' is not defined
Index([u'user_agent_string'], dtype='object')

And here's the header and first row of data from the input file containing the user agent strings:

"user_agent_string"
"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53"

Can you help me understand how to reference the column name in the dt=parseDeviceType(user_agent_string) call? I'd also like to know how to reference it by column number, if that is possible in a function call.

Thanks

Answer 1

Answered by ZMatrix

Removing .txt from your file name might help, like the following:

df = pd.read_csv('user_agent_strings', index_col=None, na_values=['NA'],sep=',')

Answer 2

Answered by Sunitha G

Import the pandas package to read the data:

import pandas as pd 

df = pd.read_csv('user_agent_strings', index_col=None, na_values=['NA'],sep=',')

Answer 3

Answered by chthonicdaemon

The first thing you need to understand is the error message you are seeing:

NameError is a Python exception and is not related to Pandas in this case. You can get exactly the same error by trying to use any name the interpreter doesn't know about:

>>> b = a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined

It is important to know that very few Python commands will "magically" create names. To create a name, you almost always need an assignment (name = ...). So as a general rule, if you haven't done this, name will not exist. In your code, the name you have created is df, so you will need to go through that to get to your data.
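As a rough illustration of that point, the sketch below reads a small inline copy of the data and checks which names exist afterwards. The inline csv_text and the shortened user agent string are assumptions made so the snippet runs without the original file.

```python
import io
import pandas as pd

# Inline stand-in for user_agent_strings.txt (an assumption, so this runs anywhere).
csv_text = '"user_agent_string"\n"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X)"\n'
df = pd.read_csv(io.StringIO(csv_text))

# read_csv only binds the name `df`; the column header is data, not a variable.
try:
    user_agent_string
    name_exists = True
except NameError:
    name_exists = False

print(name_exists)       # False
print(list(df.columns))  # ['user_agent_string']
```

The header shows up in df.columns, but no Python variable with that name was ever created.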

You can access the data in the dataframe in two ways, which are equivalent here: df['user_agent_string'] or df.user_agent_string. I recommend trying this out in an interactive environment so that you can see the results before passing them to a function.
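Here is a small sketch of the two access styles side by side, again using an inline stand-in for the file (an assumption). Attribute access only works when the column name is a valid Python identifier that doesn't clash with an existing DataFrame attribute, so bracket access is the safer habit.

```python
import io
import pandas as pd

# Inline stand-in for user_agent_strings.txt (an assumption).
csv_text = '"user_agent_string"\n"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X)"\n'
df = pd.read_csv(io.StringIO(csv_text))

by_key = df['user_agent_string']   # bracket access: works for any column name
by_attr = df.user_agent_string     # attribute access: needs an identifier-like name

print(by_key.equals(by_attr))      # True
print(type(by_key).__name__)       # Series
```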

I'm also going to guess that your function parseDeviceType only handles one string at a time (based on the comments), but you want to call it on every item in your file. To do this you would need apply:

parsed_types = df.user_agent_string.apply(parseDeviceType)
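Since the real parseDeviceType isn't shown in the question, the sketch below uses a hypothetical stand-in, parse_device_type, that does crude substring checks. The two user agent strings are shortened sample data, not taken from the actual file.

```python
import io
import pandas as pd

def parse_device_type(ua):
    """Hypothetical stand-in for parseDeviceType: crude substring checks only."""
    if 'iPad' in ua:
        return 'tablet'
    if 'Mobile' in ua or 'iPhone' in ua:
        return 'mobile'
    return 'desktop'

# Made-up sample rows in the same one-column layout as the question's file.
csv_text = (
    '"user_agent_string"\n'
    '"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) Mobile/11D201"\n'
    '"Mozilla/5.0 (Windows NT 6.1; WOW64) Gecko/20100101 Firefox/30.0"\n'
)
df = pd.read_csv(io.StringIO(csv_text))

# apply calls the function once per value and collects the results in a new Series.
df['device_type'] = df['user_agent_string'].apply(parse_device_type)
print(df['device_type'].tolist())  # ['tablet', 'desktop']
```

Note that the function itself is passed to apply, without parentheses; pandas does the calling for each row.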

To access columns by number instead of name (which I don't recommend), you can use iloc. This allows you to access all the rows (:) and the first column (0) from the dataframe object:

user_agent_string = df.iloc[:, 0]
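To check that positional access gives the same column as access by name, something like the following should work. It uses an inline stand-in for the file (an assumption), and first_col and first_name are just illustrative variable names.

```python
import io
import pandas as pd

# Inline stand-in for user_agent_strings.txt (an assumption).
csv_text = '"user_agent_string"\n"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X)"\n'
df = pd.read_csv(io.StringIO(csv_text))

first_col = df.iloc[:, 0]    # all rows, first column, selected by position
first_name = df.columns[0]   # the name of the column at position 0

print(first_name)                        # user_agent_string
print(first_col.equals(df[first_name]))  # True
```

Looking the name up via df.columns[0] and then indexing by name is another option if you only know the position.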
