Python: load data from txt with pandas

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/21546739/

Date: 2020-08-18 23:07:01  Source: igfitidea

Load data from txt with pandas

Tags: python, io, pandas

Asked by albus_c

I am loading a txt file containing a mix of float and string data. I want to store the values in an array where I can access each element. Right now I am just doing

import pandas as pd

data = pd.read_csv('output_list.txt', header=None)
print(data)

This is the structure of the input file: 1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt.

At the moment the data are imported as a single column. How can I split it so that the different elements are stored separately (so I can call data[i,j])? And how can I define a header?

Accepted answer by pietrovismara

You can use:

data = pd.read_csv('output_list.txt', sep=" ", header=None)
data.columns = ["a", "b", "c", "etc."]

Add sep=" " to your call, leaving a single space between the quotes, so that pandas splits each line on the spaces between values and puts them in separate columns. Assigning to data.columns names the columns; the list must contain exactly one name per column.
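As a minimal sketch of the approach above, the snippet below reads an in-memory stand-in for output_list.txt (the sample line from the question plus a second, made-up line), names the six columns, and then accesses single elements. The column names are hypothetical, and because a DataFrame does not support data[i,j] directly, the sketch uses .iloc and .loc instead.

```python
import io

import pandas as pd

# In-memory stand-in for output_list.txt; the second line is invented for illustration.
sample = io.StringIO(
    "1 0 2000.0 70.2836942112 1347.28369421 /file_address.txt\n"
    "2 1 1500.0 12.5 300.25 /other_file.txt\n"
)

data = pd.read_csv(sample, sep=" ", header=None)

# Hypothetical names, one per column in the file.
data.columns = ["id", "flag", "value", "x", "y", "path"]

# Element access: by position (row 0, column 2) and by label (row 0, column "path").
first_value = data.iloc[0, 2]
first_path = data.loc[0, "path"]
print(first_value, first_path)
```

Numeric columns are parsed as numbers and the last column stays a string, so mixed float and string data can live in the same DataFrame.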

Answer by Sam Perry

@Pietrovismara's solution is correct, but I'd just like to add that, rather than setting the column names on a separate line, you can pass them to pd.read_csv directly through the names argument.

df = pd.read_csv('output_list.txt', sep=" ", header=None, names=["a", "b", "c"])

Answer by Meenakshi Ravisankar

To add to the answers above, you could also use read_fwf directly:

df = pd.read_fwf('output_list.txt')

fwf stands for fixed-width formatted lines: read_fwf reads a table in which each column occupies a fixed range of character positions on every line.
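A small sketch of read_fwf on column-aligned text, assuming a file whose values line up in fixed character positions. The sample rows and column names are invented for illustration, and header=None is passed because the question's file has no header row (by default the first line would be taken as the header).

```python
import io

import pandas as pd

# Invented sample: values are aligned so that each column occupies fixed positions.
sample = io.StringIO(
    "1  0  2000.0  /file_a.txt\n"
    "12 1   150.5  /file_b.txt\n"
)

# With no colspecs given, read_fwf infers the column boundaries from the data.
df = pd.read_fwf(sample, header=None, names=["id", "flag", "value", "path"])
print(df)
```

If the columns are not aligned like this, splitting on a separator with read_csv is likely the better fit.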

Answer by ramakrishnareddy

You can use this:

import pandas as pd
dataset = pd.read_csv("filepath.txt", delimiter="\t")

Answer by tulsi kumar

You can do it like this:

import pandas as pd
# Raw string (r'...') so that backslashes in the Windows path are not read as escape sequences such as \f
df = pd.read_csv(r'file_location\filename.txt', delimiter="\t")

For example: df = pd.read_csv(r'F:\Desktop\ds\text.txt', delimiter="\t")
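One thing to watch with Windows paths like the one above: in an ordinary Python string literal, a backslash followed by certain letters is an escape sequence, so the \t in ds\text.txt becomes a tab character. The sketch below, using a shortened, hypothetical path fragment, shows the difference between a plain literal, a raw string, and doubled backslashes.

```python
# Plain literal: "\t" is interpreted as a tab, so the path is silently corrupted.
plain_path = "ds\text.txt"

# Raw string: backslashes are kept exactly as written.
raw_path = r"ds\text.txt"

# Doubled backslashes give the same result as the raw string.
escaped_path = "ds\\text.txt"

has_tab = "\t" in plain_path
print(repr(plain_path), repr(raw_path), has_tab)
```

Forward slashes (for example 'F:/Desktop/ds/text.txt') are another common way to sidestep the problem on Windows.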

Answer by bfree67

If the data has no index column and you are not sure how many spaces separate the values, you can let pandas assign a default index and split on runs of whitespace:

df = pd.read_csv('filename.txt', delimiter=r'\s+', index_col=False)
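To illustrate, here is a short sketch with invented sample lines whose values are separated by an irregular mix of spaces and tabs. The regular expression \s+ matches any run of whitespace, so every line is split into the same five fields. header=None is an assumption added here because the sample has no header row.

```python
import io

import pandas as pd

# Invented sample with inconsistent spacing (several spaces, a tab, a single space).
sample = io.StringIO(
    "1   0  2000.0\t70.28   /file_a.txt\n"
    "2 1    1500.0  12.5 /file_b.txt\n"
)

# r"\s+" is a raw string so the backslash reaches the regex engine unchanged.
df = pd.read_csv(sample, sep=r"\s+", header=None)
print(df)
```

pandas assigns a default integer index (0, 1, ...) to the rows, and the columns are numbered 0 to 4 until names are supplied.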

Answer by Kaustubh J

You can import the text file using the read_table function, like so:

import pandas as pd
df = pd.read_table('output_list.txt', header=None)

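read_table splits on tabs by default, so a space-separated file will still load as a single column and will need further processing after loading (or you can pass sep=" " to split it while reading).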

Answer by pari

Based on the latest changes in pandas at the time of writing, you can use read_csv instead: read_table was deprecated in pandas 0.24, although that deprecation was reverted in 0.25.

import pandas as pd
pd.read_csv("file.txt", sep="\t")
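As a quick check of how the two functions relate, the sketch below parses the same invented tab-separated text with read_csv(sep="\t") and with read_table, which uses a tab as its default separator. header=None is an assumption added because the sample has no header row.

```python
import io

import pandas as pd

# Invented tab-separated sample.
text = "1\t0\t2000.0\t/file_a.txt\n2\t1\t1500.0\t/file_b.txt\n"

via_csv = pd.read_csv(io.StringIO(text), sep="\t", header=None)
via_table = pd.read_table(io.StringIO(text), header=None)

# Both calls should produce the same DataFrame.
same = via_csv.equals(via_table)
print(same)
```

Either call works for tab-separated files; which to use is mostly a matter of preference.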