Python: How to open a file with the .data extension
Note: This page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/31797013/
How to open a .data file extension
Asked by Jason Donnald
I am working on a side project where the data is provided in a .data file. How do I open a .data file to see what the data looks like, and how do I read from a .data file programmatically in Python? I am on Mac OS X.
NOTE: The data I am working with is from one of the KDD Cup challenges.
Answered by user2539336
It depends heavily on what is in it. It could be a binary file or a text file.
If it is a text file, you can open it the same way you open any file: f = open(filename, "r")
If it is a binary file, just add a "b" to the mode in the open call: open(filename, "rb"). There is an example here:
Reading binary file in Python and looping over each byte
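As a minimal sketch of the two modes described above: the helper names (read_text_lines, read_binary_chunk, looks_binary), the UTF-8 default, and the NUL-byte check are assumptions made for illustration and are not part of the original answer. The NUL-byte test is only a rough heuristic for guessing whether a file is binary.

```python
from itertools import islice


def read_text_lines(path, limit=5, encoding="utf-8"):
    """Return up to `limit` lines of a text file as str objects."""
    # errors="replace" keeps decoding from raising on unexpected bytes.
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return [line.rstrip("\n") for line in islice(f, limit)]


def read_binary_chunk(path, size=64):
    """Return the first `size` bytes of a file as a bytes object."""
    with open(path, "rb") as f:
        return f.read(size)


def looks_binary(path, size=1024):
    """Rough heuristic (an assumption, not a rule): a NUL byte suggests binary."""
    return b"\x00" in read_binary_chunk(path, size)
```

Calling looks_binary("file.data") first and then choosing read_text_lines or read_binary_chunk is one way to take a first look without loading the whole file.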
Depending on the type of data in it, you might want to try passing it through a CSV reader (Python's csv module) or an XML parsing library (lxml is one example).
Given the additional information above, and after looking at the challenge page, the format is:
Data Format
The datasets use a format similar to that of the text export format from relational databases:
- One header line with the variable names
- One line per instance
- Tab separator between the values
- There are missing values (consecutive tabs)
Therefore, see this answer:
parsing a tab-separated file in Python
I would advise processing one line at a time rather than loading the whole file, but if you have the RAM, why not...
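A minimal sketch of the line-at-a-time approach for the tab-separated format described above, using the standard csv module. The function name iter_rows, the UTF-8 encoding, and the choice to map empty fields to None are assumptions for illustration; the actual KDD Cup files may need a different encoding or quoting setting.

```python
import csv


def iter_rows(path, encoding="utf-8"):
    """Yield one dict per instance from a tab-separated file with a header line."""
    # newline="" is what the csv module documentation recommends for reader input.
    with open(path, "r", encoding=encoding, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            # Consecutive tabs arrive as empty strings; treat them as missing.
            yield {name: (value if value != "" else None)
                   for name, value in row.items()}
```

Because iter_rows is a generator, a loop such as "for row in iter_rows('file.data')" holds only one row in memory at a time, while list(iter_rows('file.data')) would load everything if the RAM is available.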
I suspect it doesn't open in Sublime because the file is huge, but that is just a guess.
Answered by nbari
To get a quick overview of what the file may contain, you can inspect it in a terminal using strings or cat, for example:
$ strings file.data
or
$ cat -v file.data
If you forget to pass the -v option to cat and the file is binary, you could garble your terminal and need to reset it:
$ reset
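Since the question also asks about Python, a similar quick look could be sketched there as well. The function name preview and the 256-byte default are assumptions for illustration; this is not an equivalent of strings or cat -v, only a way to see the first bytes in escaped form.

```python
def preview(path, size=256):
    """Return an escaped preview of the first `size` bytes of a file."""
    with open(path, "rb") as f:
        chunk = f.read(size)
    # repr() of a bytes object shows non-printable bytes as escapes such as
    # \x00, so printing the result does not send raw control codes to the terminal.
    return repr(chunk)
```

For example, print(preview("file.data")) shows tabs as \t and newlines as \n, which makes a tab-separated text file easy to recognize.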