Python: load data from a txt file with pandas
Original question: http://stackoverflow.com/questions/21546739/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Load data from txt with pandas
Asked by albus_c
I am loading a txt file containing a mix of float and string data. I want to store the values in an array where I can access each element. Right now I am just doing
import pandas as pd
data = pd.read_csv('output_list.txt', header = None)
print(data)
This is the structure of the input file: 1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt. 
Now the data are imported as a single column. How can I split it so that the different elements are stored separately (so I can call data[i,j])? And how can I define a header?
Accepted answer by pietrovismara
You can use:
data = pd.read_csv('output_list.txt', sep=" ", header=None)
data.columns = ["a", "b", "c", "etc."]
Add sep=" " to your code, leaving a single blank space between the quotes, so that pandas splits each line on spaces and puts the values into separate columns. data.columns is used to name the columns; the list must contain exactly one name per column.
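A minimal sketch of the approach above, reading an in-memory copy of the sample line from the question instead of output_list.txt. The second data row and the column names are made up for illustration. Note that pandas addresses a single cell with data.iloc[i, j] (by position) or data.loc[i, name] (by label) rather than data[i, j].

```python
import io
import pandas as pd

# Stand-in for output_list.txt: the sample row from the question plus a
# second, made-up row (an assumption, for illustration only).
sample = io.StringIO(
    "1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt\n"
    "2 1 1500.0 12.5 300.25 /other_file.txt\n"
)

data = pd.read_csv(sample, sep=" ", header=None)
# One name per column; these names are hypothetical.
data.columns = ["id", "flag", "value", "x", "y", "path"]

first_value = data.iloc[0, 2]     # row 0, column 2, selected by position
first_path = data.loc[0, "path"]  # row 0, column "path", selected by label
print(data.dtypes)                # numeric and string columns are typed separately
```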
Answer by Sam Perry
@Pietrovismara's solution is correct but I'd just like to add: rather than having a separate line to add column names, it's possible to do this from pd.read_csv.
df = pd.read_csv('output_list.txt', sep=" ", header=None, names=["a", "b", "c"])
Answer by Meenakshi Ravisankar
To add to the above answers: you could directly use
df = pd.read_fwf('output_list.txt')
fwf stands for fixed width formatted lines.
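As a rough illustration of read_fwf, the sketch below parses two made-up rows whose fields are aligned at fixed character offsets. The data and the use of header=None are assumptions, not part of the original answer; without header=None, pandas would treat the first row as column names.

```python
import io
import pandas as pd

# Made-up fixed-width rows: every field starts at the same character offset.
sample = io.StringIO(
    "1  0  2000.0  70.28  /file_a.txt\n"
    "2  1  1500.0  12.50  /file_b.txt\n"
)

# Column boundaries are inferred from the whitespace gaps by default.
# header=None keeps the first row as data instead of using it as column names.
df = pd.read_fwf(sample, header=None)
print(df)
```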
Answer by ramakrishnareddy
You can use this:
import pandas as pd
dataset=pd.read_csv("filepath.txt",delimiter="\t")
Answer by tulsi kumar
You can do it like this:
import pandas as pd
df = pd.read_csv(r'file_location\filename.txt', delimiter="\t")
(for example, df = pd.read_csv(r'F:\Desktop\ds\text.txt', delimiter="\t"))
The r prefix makes the path a raw string, so that backslash sequences in a Windows path such as \f or \t are not read as escape characters.
Answer by bfree67
If the data have no index column and you are not sure what the spacing is, you can let pandas assign an index and split on runs of whitespace:
df = pd.read_csv('filename.txt', delimiter=r'\s+', index_col=False)
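A small sketch of the whitespace-regex separator, using made-up rows with an irregular mix of spaces and tabs. It passes header=None, which is an assumption here, because the sample has no header row.

```python
import io
import pandas as pd

# Made-up rows separated by an irregular mix of spaces and tabs.
sample = io.StringIO(
    "1   0 2000.0\t70.28   /file_a.txt\n"
    "2 1    1500.0  12.5 /file_b.txt\n"
)

# r"\s+" matches any run of whitespace, so the exact spacing does not matter.
df = pd.read_csv(sample, sep=r"\s+", header=None)
print(df)
```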
Answer by Kaustubh J
You can import the text file using the read_table function like so:
import pandas as pd
df=pd.read_table('output_list.txt',header=None)
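Preprocessing will need to be done after loading, since read_table splits on tabs by default and this call leaves each space-separated line in a single column.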
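The difference can be seen with the sample line from the question. This is only a sketch using an in-memory string rather than output_list.txt, and the sep=" " variant is an addition, not part of the original answer.

```python
import io
import pandas as pd

line = "1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt\n"

# The default separator is a tab, so the space-separated line stays in one column.
one_column = pd.read_table(io.StringIO(line), header=None)

# Passing sep explicitly splits the line into six columns.
six_columns = pd.read_table(io.StringIO(line), sep=" ", header=None)

print(one_column.shape, six_columns.shape)
```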
Answer by pari
Based on the latest changes in pandas, you can use read_csv instead of read_table. (read_table was deprecated in pandas 0.24; the deprecation was reverted in 0.25.)
import pandas as pd
pd.read_csv("file.txt", sep = "\t")

