Original question: http://stackoverflow.com/questions/45983286/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas Data Frame to_csv with more separator
Asked by Bhuvan Kumar
I have a file with 40 columns and 600,000 rows. After processing it in a pandas DataFrame, I would like to save the DataFrame to CSV with a different spacing width for each column. df.to_csv has a sep kwarg; I tried passing a regex, but I get this error:
TypeError: "delimiter" must be an 1-character string.
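The wording of this message matches the one raised by Python's built-in csv module. The sketch below assumes that to_csv hands sep to that module (an assumption, not something stated in the question), and shows that a regex such as '\s+' is rejected there while a single character is accepted.

```python
import csv
import io

buf = io.StringIO()

# A multi-character delimiter (here a regex-like string) is rejected
# by the csv module itself, before any row is written.
try:
    csv.writer(buf, delimiter=r'\s+')
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# A single-character delimiter such as a tab is accepted.
writer = csv.writer(buf, delimiter='\t')
writer.writerow(['A', 'B', 'C'])
print(repr(buf.getvalue()))
```

So variable-width spacing has to come from padding the values themselves, not from the separator.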
I want the output to have different column spacing, as shown below:
A B C D E F G
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
Using the code below I get tab-delimited output, where every column has the same spacing.
df.to_csv(r"D:\test.txt", sep="\t", encoding='utf-8')
A B C D E F G
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
1 3 5 8 8 9 8
I don't want to loop over the rows; that might take a long time for 600k lines.
Answered by Bhuvan Kumar
Thank you for the comments; they helped me. Below is the code.
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({'A': [0, 1, 2, 3], 'B': [0, 11, 2, 333], 'C': [0, 1, 22, 3], 'D': [0, 1, 2, 33]})

# Convert all columns to string
df[df.columns] = df[df.columns].astype(str)

# Width of each column, in characters
SepWidth = [5, 6, 3, 8]

# Temporary dict holding the padded columns
tempdf = {}

# Left-pad each column (one pass per column, not per row)
for i, eCol in enumerate(df):
    tempdf[i] = df[eCol].str.pad(width=SepWidth[i])

# Final DataFrame
Fdf = pd.concat(tempdf, axis=1)
# print(Fdf)

# Export to csv (raw string so that "\t" in the path is not read as a tab)
Fdf.to_csv(r"D:\test.txt", sep='\t', index=False, header=False, encoding='utf-8')
Output of test.txt:
0 0 0 0
1 11 1 1
2 2 22 2
3 333 3 33
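The listing above does not show the exact whitespace, so here is a small sketch that builds the same padded frame and inspects one row as a string instead of writing a file. The names widths and padded are illustrative and not from the answer; the dict comprehension is just a compact stand-in for the loop.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3], 'B': [0, 11, 2, 333],
                   'C': [0, 1, 22, 3], 'D': [0, 1, 2, 33]})
widths = [5, 6, 3, 8]  # same widths as SepWidth in the answer

# Left-pad every column to its own width (str.pad pads on the left by default).
padded = pd.DataFrame({
    col: df[col].astype(str).str.pad(width=w)
    for col, w in zip(df.columns, widths)
})

# With no path given, to_csv returns the CSV text as a string.
text = padded.to_csv(sep='\t', index=False, header=False)
rows = text.splitlines()

# repr() makes the padding and the tab characters visible.
print(repr(rows[3]))
```

Each field is right-aligned to its width, and the fields are still joined by a tab character.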
UPDATE
When using pandas.to_csv, the tab delimiter ('\t') was written in addition to the padding. So instead of pandas.to_csv I'm using the code below to save as txt.
numpy.savetxt(file, df.values, fmt='%s')
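As a possible variation (not part of the original answer), numpy.savetxt also accepts a single multi-field format string, so the per-column widths can go straight into fmt and no separate padding step is needed. The sketch below assumes the same example data and widths as above and writes to a temporary file; the names widths, fmt and path are illustrative.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3], 'B': [0, 11, 2, 333],
                   'C': [0, 1, 22, 3], 'D': [0, 1, 2, 33]})
widths = [5, 6, 3, 8]

# One format field per column, e.g. '%5s%6s%3s%8s'. When fmt holds
# one '%' per column, savetxt uses it for the whole row and ignores delimiter.
fmt = ''.join('%{}s'.format(w) for w in widths)

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
np.savetxt(path, df.astype(str).values, fmt=fmt)

with open(path) as fh:
    lines = fh.read().splitlines()

for line in lines:
    print(repr(line))
```

Every line comes out 22 characters wide (5 + 6 + 3 + 8), with no tab characters in between.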