Converting a space-delimited file to a comma-separated values file in Python

Question

Asked by muammar

I am very new to Python. I know that this has already been asked, and I apologise, but the difference in this new situation is that the number of spaces between strings is not constant. I have a file, named coord, that contains the following space-delimited strings:

   1  C       6.00    0.000000000    1.342650315    0.000000000
   2  C       6.00    0.000000000   -1.342650315    0.000000000
   3  C       6.00    2.325538562    2.685300630    0.000000000
   4  C       6.00    2.325538562   -2.685300630    0.000000000
   5  C       6.00    4.651077125    1.342650315    0.000000000
   6  C       6.00    4.651077125   -1.342650315    0.000000000
   7  C       6.00   -2.325538562    2.685300630    0.000000000
   8  C       6.00   -2.325538562   -2.685300630    0.000000000
   9  C       6.00   -4.651077125    1.342650315    0.000000000
  10  C       6.00   -4.651077125   -1.342650315    0.000000000
  11  H       1.00    2.325538562    4.733763602    0.000000000
  12  H       1.00    2.325538562   -4.733763602    0.000000000
  13  H       1.00   -2.325538562    4.733763602    0.000000000
  14  H       1.00   -2.325538562   -4.733763602    0.000000000
  15  H       1.00    6.425098097    2.366881801    0.000000000
  16  H       1.00    6.425098097   -2.366881801    0.000000000
  17  H       1.00   -6.425098097    2.366881801    0.000000000
  18  H       1.00   -6.425098097   -2.366881801    0.000000000

Please note the spaces before the start of each string in the first column. I have tried the following to convert it to CSV:

with open('coord') as infile, open('coordv', 'w') as outfile:
    outfile.write(infile.read().replace("  ", ", "))

# Unneeded columns are deleted from the csv

input = open('coordv', 'rb')
output = open('coordcsvout', 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    if row:
        writer.writerow(row)
input.close()
output.close()

with open("coordcsvout","rb") as source:
    rdr= csv.reader( source )
    with open("coordbarray","wb") as result:
        wtr= csv.writer(result)
        for r in rdr:
            wtr.writerow( (r[5], r[6], r[7]) )

When I run the script, I get the following in coordv from the very first part of the script, which is of course very wrong:

,  1, C, , ,  6.00, , 0.000000000, , 1.342650315, , 0.000000000
,  2, C, , ,  6.00, , 0.000000000,  -1.342650315, , 0.000000000
,  3, C, , ,  6.00, , 2.325538562, , 2.685300630, , 0.000000000
,  4, C, , ,  6.00, , 2.325538562,  -2.685300630, , 0.000000000
,  5, C, , ,  6.00, , 4.651077125, , 1.342650315, , 0.000000000
,  6, C, , ,  6.00, , 4.651077125,  -1.342650315, , 0.000000000
,  7, C, , ,  6.00,  -2.325538562, , 2.685300630, , 0.000000000
,  8, C, , ,  6.00,  -2.325538562,  -2.685300630, , 0.000000000
,  9, C, , ,  6.00,  -4.651077125, , 1.342650315, , 0.000000000
, 10, C, , ,  6.00,  -4.651077125,  -1.342650315, , 0.000000000
, 11, H, , ,  1.00, , 2.325538562, , 4.733763602, , 0.000000000
, 12, H, , ,  1.00, , 2.325538562,  -4.733763602, , 0.000000000
, 13, H, , ,  1.00,  -2.325538562, , 4.733763602, , 0.000000000
, 14, H, , ,  1.00,  -2.325538562,  -4.733763602, , 0.000000000
, 15, H, , ,  1.00, , 6.425098097, , 2.366881801, , 0.000000000
, 16, H, , ,  1.00, , 6.425098097,  -2.366881801, , 0.000000000
, 17, H, , ,  1.00,  -6.425098097, , 2.366881801, , 0.000000000
, 18, H, , ,  1.00,  -6.425098097,  -2.366881801, , 0.000000000

I have tried different arguments to .replace without any success, and so far I haven't found any information on how to do this. What would be the best way to get comma-separated values from this coord file? I then want to use the csv module in Python to choose columns 4:6, and finally use numpy to import them as follows:

from numpy import genfromtxt
cocmatrix = genfromtxt('input', delimiter=',')

I'd be very glad if somebody could help me with this problem.

Answer 1

Accepted answer by j011y

Replace the first part of your script with this. It's not super pretty, but it will give you CSV format.

with open('coord') as infile, open('coordv', 'w') as outfile:
    for line in infile:
        outfile.write(" ".join(line.split()).replace(' ', ','))
        outfile.write(",") # trailing comma shouldn't matter

If you want the outfile to have each record on its own line, you could add outfile.write("\n") at the end of the for loop, but I don't think the code that follows in your script will work with it like that.

Answer 2

Answer by the wolf

You can use csv:

import csv

with open(ur_infile) as fin, open(ur_outfile, 'w') as fout:
    o=csv.writer(fout)
    for line in fin:
        o.writerow(line.split())
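
The same csv.writer loop can also pick out the coordinate columns the question asks about, so no intermediate files are needed. The following is a minimal sketch rather than part of the original answer: the helper name convert and its columns parameter are made up for illustration, each row is assumed to hold six whitespace-separated fields as in the question, and the sample data lives in io.StringIO so the example is self-contained.

```python
import csv
import io


def convert(fin, fout, columns=slice(None)):
    """Copy whitespace-separated rows from fin to fout as CSV.

    `columns` (hypothetical parameter) is a slice selecting which fields to keep.
    """
    writer = csv.writer(fout, lineterminator="\n")
    for line in fin:
        fields = line.split()  # split() with no argument collapses runs of whitespace
        if fields:             # skip blank lines
            writer.writerow(fields[columns])


sample = (
    "   1  C       6.00    0.000000000    1.342650315    0.000000000\n"
    "   2  C       6.00    0.000000000   -1.342650315    0.000000000\n"
)
result = io.StringIO()
convert(io.StringIO(sample), result, columns=slice(3, 6))  # fields 4 to 6 (x, y, z)
print(result.getvalue(), end="")
```

With real files, the two StringIO objects would be replaced by open('coord') and open('coordcsvout', 'w', newline=''), and the resulting file could then be read with genfromtxt and delimiter=','.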

Answer 3

Answer by Daniel

You can use pandas. I have written your data to data.csv:

>>> import pandas as pd
>>> df = pd.read_csv('data.csv', sep=r'\s+', header=None)
>>> df
     0  1  2         3         4  5
0    1  C  6  0.000000  1.342650  0
1    2  C  6  0.000000 -1.342650  0
2    3  C  6  2.325539  2.685301  0
3    4  C  6  2.325539 -2.685301  0
4    5  C  6  4.651077  1.342650  0
5    6  C  6  4.651077 -1.342650  0
...

The great thing about this is that you can access the underlying numpy array with df.values:

>>> type(df.values)
<type 'numpy.ndarray'>

To save the data frame with comma delimiters:

>>> df.to_csv('data_out.csv',header=None)

Pandas is a great library for managing large amounts of data, and as a bonus it works well with numpy. There is also a very good chance that this will be much faster than using the csv module.

Answer 4

Answer by user1667218

Why not read the file line by line? Split each line into a list, then rejoin the list with ','.

Answer 5

Answer by user1667218

>>> a = 'cah  1  C       6.00    0.000000000    1.342650315    0.000000000'
=>  a = 'cah  1  C       6.00    0.000000000    1.342650315    0.000000000'

>>> a.split()
=>  ['cah', '1', 'C', '6.00', '0.000000000', '1.342650315', '0.000000000']

>>> ','.join(a.split())
=>  'cah,1,C,6.00,0.000000000,1.342650315,0.000000000'

>>> ['"' + x + '"' for x in a.split()]
=>  ['"cah"', '"1"', '"C"', '"6.00"', '"0.000000000"', '"1.342650315"', '"0.000000000"']

>>> ','.join(['"' + x + '"' for x in a.split()])
=>  '"cah","1","C","6.00","0.000000000","1.342650315","0.000000000"'

Answer 6

Answer by dstromberg

The csv module is good, or here's a way to do it without:

#!/usr/local/cpython-3.3/bin/python

with open('input-file.csv', 'r') as infile, open('output.csv', 'w') as outfile:
    for line in infile:
        fields = line.split()
        outfile.write('{}\n'.format(','.join(fields)))

Answer 7

Answer by Majid Hoseiny

To convert spaces to commas, just fill in the file names you want:

with open('filename') as infile, open('output', 'w') as outfile:
    outfile.write(infile.read().replace(" ", ","))

To convert commas to spaces:

with open('filename') as infile, open('output', 'w') as outfile:
    outfile.write(infile.read().replace(",", " "))
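
Note that str.replace(" ", ",") turns every single space into its own comma, so input with runs of spaces, such as the coord file in the question, ends up with empty fields. A small sketch of the difference, using a sample line shortened from the question's data:

```python
line = "   1  C       6.00    0.000000000"

# Every space becomes its own comma, so runs of spaces create empty fields.
naive = line.replace(" ", ",")

# split() with no argument treats any run of whitespace as one separator.
robust = ",".join(line.split())

print(naive.split(","))  # many empty strings between the real values
print(robust)            # 1,C,6.00,0.000000000
```

The plain replace is only safe when fields are separated by exactly one space and the line has no leading or trailing spaces.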

Answer 8

Answer by Ranjeet R Patil

To merge multiple text files into one CSV:

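import csv

n = 3  # placeholder: number of input files (input0.txt, input1.txt, ...); set to your own count
for x in range(n):
    # 'a' appends, so each input file's rows are added to the end of output.csv
    with open('input{}.txt'.format(x)) as fin, open('output.csv', 'a') as fout:
        csv_output = csv.writer(fout)
        for line in fin:
            csv_output.writerow(line.split())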
