Reading a CSV file in Python with a semicolon as the delimiter

Question

Asked by Abhijeet Mohanty

I have a NumPy 2D array of shape (4898,) where the elements in each row are separated by semicolons but are still stored in a single column rather than in multiple columns (the desired outcome). How do I split each row at every semicolon? I have written the following Python script to do so, but it throws errors.

stochastic_gradient_descent_winequality.py

import numpy
import pandas

if __name__ == '__main__' :

    with open('winequality-white.csv', 'r') as f_0 :
        with open('winequality-white-updated.csv', 'w') as f_1 :
            f_0.next()
            for line in f_0 :
                f_1.write(line)


    wine_data = pandas.read_csv('winequality-white-updated.csv', sep = ',', header = None)
    wine_data_ = wine_data
    wine_data = numpy.array([x.split(';') for x in wine_data_], dtype = numpy.float)

    print (numpy.shape(wine_data))

Errors

Traceback (most recent call last):
  File "stochastic_gradient_descent_winequality.py", line 16, in <module>
    wine_data = numpy.array([x.split(';') for x in wine_data_], dtype = numpy.float)
AttributeError: 'numpy.int64' object has no attribute 'split'

Answer 1

Answered by Arya McCarthy

If you're using semicolons (;) as your csv-file separator instead of commas (,), you can adjust that first line:

wine_data = pandas.read_csv('winequality-white-updated.csv', sep = ';', header = None)

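The problem with your list comprehension is that [x.split(';') for x in wine_data_] iterates over the column names, not the rows. With header = None those names are integers, which have no split method, hence the AttributeError.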

That being the case, you have no need for the line with the list comprehension. You can read in your data and be done.

wine_data = pandas.read_csv('winequality-white-updated.csv', sep = ';', header = None)
print (numpy.shape(wine_data))
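
As a rough illustration of why the separator matters, the sketch below reads the same text twice, once with each separator, and compares the resulting shapes. The two-row sample and the use of io.StringIO in place of the real file are assumptions made only to keep the example self-contained.

```python
import io

import pandas

# Hypothetical two-row sample standing in for the wine-quality file.
sample = "7.0;0.27;0.36\n6.3;0.30;0.34\n"

# Wrong separator: no commas are found, so each line becomes one string cell.
one_column = pandas.read_csv(io.StringIO(sample), sep=',', header=None)

# Matching separator: one column per field, parsed as numbers.
wine_data = pandas.read_csv(io.StringIO(sample), sep=';', header=None)

print(one_column.shape)
print(wine_data.shape)
```

With the matching separator there is nothing left to split by hand, which is why the list comprehension can be dropped.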

Answer 2

Answered by Kondiba

In this expression:

x.split(';') for x in wine_data_

whatever x you are getting is not a string. Only strings have split(); calling it on any other type raises this error:

object has no attribute 'split'

Check the value of x.
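
One possible way to carry out that check is to record what the loop variable actually is before calling split() on it. The snippet below is only a sketch: the sample text and the names seen and rows are made up for illustration, and io.StringIO stands in for the real file.

```python
import io

import pandas

# Hypothetical sample read with the wrong separator, as in the question.
sample = "2.12;5.12;3.12\n3.1233;4;2\n"
wine_data_ = pandas.read_csv(io.StringIO(sample), sep=',', header=None)

# What the original comprehension iterates over, and whether each item is a string.
seen = [(x, isinstance(x, str)) for x in wine_data_]

# Iterating over column 0 instead yields the raw text of each line.
rows = [x for x in wine_data_[0]]

print(seen)
print(rows)
```

Here the items in seen are column labels rather than strings, while the items in rows are strings that do support split(';').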

Answer 3

Answered by Tiny.D

Suppose your csv file is like this:

2.12;5.12;3.12
3.1233;4;2
4;4.9696;3
2;5.0344;3
3.59595;4;2
4;4;3.59595
...

Then change your code like this:

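import pandas, numpy
# With sep = ',' each line is read as a single string in column 0
wine_data = pandas.read_csv('test.csv', sep = ',', header = None)
wine_data_ = wine_data
# Split every string in column 0 on ';' and convert the pieces to floats
# (the numpy.float alias was removed in NumPy 1.24; the builtin float is equivalent)
wine_data = numpy.array([x.split(';') for x in wine_data_[0]], dtype = float)
wine_data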

wine_data will then be:

array([[ 2.12   ,  5.12   ,  3.12   ],
       [ 3.1233 ,  4.     ,  2.     ],
       [ 4.     ,  4.9696 ,  3.     ],
       [ 2.     ,  5.0344 ,  3.     ],
       [ 3.59595,  4.     ,  2.     ],
       [ 4.     ,  4.     ,  3.59595]])

A more efficient approach:

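import pandas, numpy
# Let pandas split on ';' directly, then convert the DataFrame to a float array
# (the numpy.float alias was removed in NumPy 1.24; the builtin float is equivalent)
wine_data = pandas.read_csv('test.csv', sep = ';', header = None)
wine_data = numpy.array(wine_data, dtype = float)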
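
If the file still has its header line, as the original one in the question appears to, a variation on the same idea might let pandas consume the header instead of copying the file without its first line. The sketch below assumes a made-up three-column header and sample rows, and uses io.StringIO in place of the real file; header = 0 and DataFrame.to_numpy are standard pandas, but whether this fits the actual data is an assumption.

```python
import io

import pandas

# Hypothetical sample: a quoted header line followed by two data rows.
sample = '"fixed acidity";"volatile acidity";"quality"\n7.0;0.27;6\n6.3;0.30;6\n'

# header=0 treats the first line as column names, so it is not parsed as data.
frame = pandas.read_csv(io.StringIO(sample), sep=';', header=0)

# Convert the numeric columns to a 2-D float array.
wine_data = frame.to_numpy(dtype=float)

print(list(frame.columns))
print(wine_data.shape)
```

This keeps the column names available on the DataFrame while the array holds only the numeric rows.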
