How to read a text file with Pandas in Python

Question

Asked by Bhanu Chander

I'm new to Pandas and have been trying to draw a scatter plot in Python 2.7. My dataset is in a comma-separated .txt file that looks like this:

6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483



import pandas as pd
import matplotlib.pyplot as mplt

# Load the dataset with pandas
input_data = pd.read_csv('data.txt')
#input_data.head(5)

How can I plot the above data as a scatter plot when the dataset has no headers?

I've seen in tutorials and examples that a scatter plot can be drawn when the dataset has column headings. I tried adding x and y as headers for the two columns in the .txt file and ran the code below.

input_data = pd.read_csv('data.txt')
#input_data.head(5)
x_value = input_data[['x']]
y_value = input_data[['y']]

mplt.scatter(x_value, y_value)

But I still get the error shown below:

Traceback (most recent call last):
  File "E:\IIT Madras\Research\Experiments\Machine Learning\Linear Regression\Linear_Regression.py", line 16, in <module>
    y_value = input_data[['y']]
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1791, in __getitem__
    return self._getitem_array(key)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1835, in _getitem_array
    indexer = self.ix._convert_to_indexer(key, axis=1)
  File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 1112, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "['y'] not in index"

Is there a better way to handle this, both with and without header names?

EDIT:

The following worked for me after going through Ishan's reply:

input_data = pd.read_csv('data.txt', header=None)
x_value = input_data[[0]]
y_value = input_data[[1]]
mplt.scatter(x_value, y_value)
mplt.show()
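
For context, here is a minimal sketch of why header=None matters for a file like this. It assumes a few rows of the dataset inlined through io.StringIO as a stand-in for data.txt (that stand-in and the variable names are illustrative, not part of the original code). By default read_csv treats the first line as the header, so the first data row ends up as column names and is lost from the data.

```python
import io

import pandas as pd

# Stand-in for data.txt (assumption): the first three rows from the question.
raw = "6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n"

# Default behaviour: the first line is consumed as the header row.
with_default = pd.read_csv(io.StringIO(raw))

# header=None: every line is data and columns get integer labels 0, 1, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.columns.tolist(), with_default.shape)
print(no_header.columns.tolist(), no_header.shape)
```

With integer labels, input_data[0] and input_data[1] select the two columns as Series, which is what the scatter call needs.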

Answer 1

Answered by Ishan

Try importing the data without column headers and then naming the columns yourself:

import pandas as pd
import matplotlib.pyplot as plt

# Read the file with no header row, then label the two columns
df = pd.read_csv(r'/home/ishan/Desktop/file', header=None)
df.columns = ['x', 'y']

plt.scatter(df['x'], df['y'])
plt.show()
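
A possible variation, sketched under the assumption that the file has no header row: read_csv also accepts a names argument, which assigns the column labels at read time so the separate df.columns step is not needed. The inlined data below stands in for the real file and is only for illustration.

```python
import io

import pandas as pd

# Stand-in for the real file (assumption): three rows from the question.
raw = "6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n"

# names= supplies the column labels; with no header argument given,
# pandas then treats every line of the file as data.
df = pd.read_csv(io.StringIO(raw), names=["x", "y"])

print(df.columns.tolist())
print(len(df))
```

The resulting df['x'] and df['y'] can be passed to plt.scatter in the same way as in the answer above.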
