Python: converting a .data file to .csv

Question

Asked by Little

I found a data set named ecoli.data, available at:

https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/

I would like to open it in R for a classification task, but I would prefer to convert it to a CSV file first. When I open it in Word I notice that it is not tab-delimited: there are about three spaces between the values in each row. So the bottom-line question is how to convert this file to CSV using Excel or maybe Python.

Answer 1

Answered by cars10m

Rename the file to ecoli.txt, then open it in Excel. This way you will be using the "Text Import Wizard" of Microsoft Excel, which lets you choose options like "Fixed width". Just click "Next" a few times and then "Finish", and you will have the data in the Excel grid. Now save it again as CSV.

Answer 2

Answered by Sait

Using Python 2.7:

import csv

# Read the whitespace-delimited file and split each line into its fields.
with open('ecoli.data.txt') as input_file:
    rows = [line.split() for line in input_file]

# In Python 2 the csv module expects the output file in binary mode ('wb').
with open('output.csv', 'wb') as output_file:
    file_writer = csv.writer(output_file)
    file_writer.writerows(rows)
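
The snippet above targets Python 2.7. On Python 3 the csv module works on text-mode files instead, so the output file should be opened with "w" and newline="". Below is a minimal sketch of the same approach for Python 3; the function name data_to_csv and the file names in the usage comment are assumptions for illustration, not part of the original answer.

```python
import csv


def data_to_csv(src_path, dst_path):
    """Convert a whitespace-delimited text file to CSV (Python 3).

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # newline="" lets the csv module control line endings itself.
    with open(src_path) as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            # split() with no arguments splits on any run of whitespace.
            fields = line.split()
            if fields:  # skip blank lines instead of writing empty rows
                writer.writerow(fields)


# Example call, assuming ecoli.data is in the current directory:
# data_to_csv("ecoli.data", "ecoli.csv")
```

Unlike the Python 2.7 version, this sketch streams the file line by line and drops blank lines, which seemed the safer default for a data file.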

Answer 3

Answered by hrbrmstr

Here are two ways to actually do that in R (that work):

library(readr)

url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/ecoli.data"

With base R:

df <- read.table(url)
dplyr::glimpse(df)

## Observations: 336
## Variables:
## $ V1 (fctr) AAT_ECOLI, ACEA_ECOLI, ACEK_ECOLI, ACKA_ECOLI, ADI_ECOLI, ...
## $ V2 (dbl) 0.49, 0.07, 0.56, 0.59, 0.23, 0.67, 0.29, 0.21, 0.20, 0.42,...
## $ V3 (dbl) 0.29, 0.40, 0.40, 0.49, 0.32, 0.39, 0.28, 0.34, 0.44, 0.40,...
## $ V4 (dbl) 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48,...
## $ V5 (dbl) 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,...
## $ V6 (dbl) 0.56, 0.54, 0.49, 0.52, 0.55, 0.36, 0.44, 0.51, 0.46, 0.56,...
## $ V7 (dbl) 0.24, 0.35, 0.37, 0.45, 0.25, 0.38, 0.23, 0.28, 0.51, 0.18,...
## $ V8 (dbl) 0.35, 0.44, 0.46, 0.36, 0.35, 0.46, 0.34, 0.39, 0.57, 0.30,...
## $ V9 (fctr) cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp...

write.csv(df, "ecoli.csv", row.names=FALSE)

With readr functions:

df <- read_table(url, col_names=FALSE)
dplyr::glimpse(df)

## Observations: 336
## Variables:
## $ X1 (chr) "AAT_ECOLI", "ACEA_ECOLI", "ACEK_ECOLI", "ACKA_ECOLI", "ADI...
## $ X2 (dbl) 0.49, 0.07, 0.56, 0.59, 0.23, 0.67, 0.29, 0.21, 0.20, 0.42,...
## $ X3 (dbl) 0.29, 0.40, 0.40, 0.49, 0.32, 0.39, 0.28, 0.34, 0.44, 0.40,...
## $ X4 (dbl) 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48,...
## $ X5 (dbl) 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,...
## $ X6 (dbl) 0.56, 0.54, 0.49, 0.52, 0.55, 0.36, 0.44, 0.51, 0.46, 0.56,...
## $ X7 (dbl) 0.24, 0.35, 0.37, 0.45, 0.25, 0.38, 0.23, 0.28, 0.51, 0.18,...
## $ X8 (dbl) 0.35, 0.44, 0.46, 0.36, 0.35, 0.46, 0.34, 0.39, 0.57, 0.30,...
## $ X9 (chr) "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp",...

write_csv(df, "ecoli.csv")

Answer 4

Answered by Anurag Sharma

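Use pandas.read_table('https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/ecoli.data', delim_whitespace=True, header=None). Note that the URL must point at the ecoli.data file itself rather than the directory, and header=None is needed because the file has no header row.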
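
To get all the way to a CSV file, the frame still has to be written out. Here is a rough sketch of the full round trip with pandas; the function name whitespace_table_to_csv is made up for illustration, and it uses sep=r"\s+" instead of delim_whitespace=True because recent pandas releases deprecate the latter (both split on runs of whitespace).

```python
import pandas as pd


def whitespace_table_to_csv(source, dest):
    """Read a whitespace-delimited table and write it out as CSV.

    Hypothetical helper: `source` and `dest` may be paths, URLs, or
    file-like objects, as accepted by pandas.
    """
    # sep=r"\s+" treats any run of spaces or tabs as a single delimiter.
    # header=None because the data file is assumed to have no header row.
    df = pd.read_csv(source, sep=r"\s+", header=None)
    # index=False and header=False keep the output to the raw values only.
    df.to_csv(dest, index=False, header=False)
    return df


# Example call, assuming network access to the UCI archive:
# whitespace_table_to_csv(
#     "https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/ecoli.data",
#     "ecoli.csv",
# )
```

Dropping header=False from the to_csv call would write the default integer column labels (0, 1, 2, ...) as a header line, which may be handy if the file is later read back into R.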

Answer 5

Answered by Rathinavel Subramanian

It's very simple: click the actual dataset name (e.g. xyz.data) and rename it to XYZ.csv; it will then be treated as a CSV file.

Answer 6

Answered by T-rex

An alternative way to solve your problem is to read your .data file directly in R using the read.table command.

ecoli <- read.table("ecoli.data",header=F)
