Python Pandas: NameError: name is not defined
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/28534249/
Python Pandas: NameError: name is not defined
Asked by Jazzmine
Ok, this is my first Python Pandas program and I'm having a hard time figuring out what the column name is so I can reference it in a function call.
Below is my code. parseDeviceType is calling a function to parse useragentstring. But when I call it using what I think the column name is, I get an error that name is not defined:
df = pd.read_csv('user_agent_strings.txt',index_col=None, na_values=['NA'],sep=',')
dt=parseDeviceType(user_agent_string)
print df.columns
NameError: name 'user_agent_string' is not defined
Index([u'user_agent_string'], dtype='object')
And here's the header and first row of data from the input file containing the useragentstrings:
"user_agent_string"
"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53"
Can you help me understand how to reference the column name in the dt=parseDeviceType(user_agent_string) call? I'd also like to know how to reference it by column number, if that is possible in a call to a function.
Thanks
Answered by ZMatrix
Removing .txt from your file name might help, like the following:
df = pd.read_csv('user_agent_strings', index_col=None, na_values=['NA'],sep=',')
Answered by Sunitha G
Import the pandas package to read the data:
import pandas as pd
df = pd.read_csv('user_agent_strings', index_col=None, na_values=['NA'],sep=',')
Answered by chthonicdaemon
The first thing you need to understand is the error message you are seeing:
NameError is a Python exception and is not related to Pandas in this case. You can get exactly the same error by trying to use any name which the interpreter doesn't know about:
>>> b = a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
It is important to know that very few Python commands will "magically" create names. To create a name, you would almost always need an assignment (name = ...). So as a general rule, if you haven't done this, name will not exist. In your code, the name you have created is df, so you will need to go through that to get to your data.
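To make the rule about assignment concrete, here is a minimal sketch in plain Python (no Pandas involved). The variable names other than user_agent_string are made up for illustration.

```python
# A name that has never been assigned does not exist, so using it fails.
try:
    value = user_agent_string  # nothing has created this name
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)

# An assignment is what creates a name; after it, the name can be used.
first_agent = "Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X)"
agent_length = len(first_agent)
print(agent_length)
```

The same reasoning applies to the question's code: read_csv only creates the name df, and the column label lives inside df as a string rather than as a Python name.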
You can access the data in the dataframe in two ways, which are equivalent here: df['user_agent_string'] or df.user_agent_string. I recommend trying this out in an interactive environment so that you can see the results before passing them to a function.
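A small sketch of both access styles, assuming Python 3 and an in-memory copy of the header and first row from the question (io.StringIO stands in for user_agent_strings.txt, which I don't have):

```python
import io

import pandas as pd

# Assumed stand-in for the input file: the header and the first data row.
csv_text = (
    '"user_agent_string"\n'
    '"Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 '
    '(KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53"\n'
)
df = pd.read_csv(io.StringIO(csv_text), index_col=None, na_values=['NA'], sep=',')

by_key = df['user_agent_string']   # bracket access with the column label
by_attr = df.user_agent_string     # attribute access to the same column

print(list(df.columns))
print(by_key.equals(by_attr))
```

Attribute access only works when the column label is a valid Python identifier and does not clash with an existing DataFrame attribute or method, so the bracket form is the more general of the two.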
I'm also going to guess that your function parseDeviceType only does this for one string (based on the comments), but you want to call this function on every item in your file. To do this you would need apply:
parsed_types = df.user_agent_string.apply(parseDeviceType)
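Since the question doesn't show parseDeviceType, here is a sketch with a hypothetical stand-in that classifies one string by substring. The real function will differ, and the second user agent row is made up for illustration:

```python
import pandas as pd


def parseDeviceType(user_agent):
    """Hypothetical stand-in: classify a single user agent string."""
    if 'iPad' in user_agent:
        return 'tablet'
    if 'Mobile' in user_agent:
        return 'phone'
    return 'desktop'


df = pd.DataFrame({'user_agent_string': [
    'Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 '
    '(KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53',
    'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/40.0 Safari/537.36',  # illustrative row
]})

# apply calls the function once per value and collects the results in a Series.
parsed_types = df['user_agent_string'].apply(parseDeviceType)
print(parsed_types.tolist())
```

Note that the function itself is passed to apply without parentheses; Pandas does the calling, one row at a time.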
To access columns by number instead of name (which I don't recommend), you can use iloc. This allows you to access all the rows (:) and the first column (0) from the dataframe object:
user_agent_string = df.iloc[:, 0]
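For comparison, a short sketch showing that positional access returns the same column as access by name. The two-row dataframe uses placeholder values, not real data from the question:

```python
import pandas as pd

# Placeholder data; only the column label matches the question.
df = pd.DataFrame({'user_agent_string': ['ua-one', 'ua-two']})

first_column = df.iloc[:, 0]   # all rows, column at position 0
first_name = df.columns[0]     # the label stored at position 0

print(first_name)
print(first_column.equals(df[first_name]))
```

Looking up df.columns[0] first is one way to go from a column number to its name when the position is all you know.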