Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/33524199/
How to change tab-delimited into comma-delimited in pandas
Asked by Same
I don't know if this is possible. I am trying to append 12 files into a single file. One of the files is tab-delimited and the rest are comma-delimited. I loaded all 12 files into dataframes and appended them to an empty dataframe one by one in a loop.
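import glob
import pandas as pd

list_of_files = glob.glob('./*.txt')
df = pd.DataFrame()
for filename in list_of_files:
    # read_csv assumes sep=',' by default
    file = pd.read_csv(filename)
    dfFilename = pd.DataFrame(file)
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    df = pd.concat([df, dfFilename], ignore_index=True)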
But the combined file is not in the format I wanted. I think the problem is the tab-delimited file: when I run the code without it, the format of the appended file is fine. So I am wondering whether it is possible to change the tab-delimited format into comma-delimited using pandas.
Thank you for your help and suggestions.
Answered by AustinC
You need to tell pandas that the file is tab-delimited when you import it. You can pass a delimiter to the read_csv method, but in your case, since the delimiter changes from file to file, you want to pass None; this will make pandas auto-detect the correct delimiter.
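To make the difference concrete, here is a minimal sketch that parses the same tab-delimited text three ways. The sample data and variable names are made up for illustration, and `engine="python"` is an addition not found in the answer: per the pandas documentation, separator detection is done by the Python parsing engine, so naming it explicitly avoids the fallback warning.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real files
tab_text = "name\tscore\ncid\t3\ndee\t4\n"
comma_text = "name,score\nann,1\nbob,2\n"

# Default sep=',' finds no commas, so each row lands in a single column
wrong = pd.read_csv(io.StringIO(tab_text))

# Explicit delimiter: correct, but only for the tab-delimited file
explicit = pd.read_csv(io.StringIO(tab_text), sep="\t")

# sep=None asks pandas to detect the delimiter from the first line
detected_tab = pd.read_csv(io.StringIO(tab_text), sep=None, engine="python")
detected_comma = pd.read_csv(io.StringIO(comma_text), sep=None, engine="python")

print(wrong.shape, explicit.shape, detected_tab.shape, detected_comma.shape)
```

Detection looks only at the first valid line of each file, so a header that happens to contain a comma inside a tab-delimited file could be misread; in that case passing `sep="\t"` for that one file is the safer choice.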
Change your read_csv line to:
file = pd.read_csv(filename, sep=None)
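Putting the pieces together, the whole task might look like the sketch below. It is one possible arrangement under stated assumptions, not the answerer's code: the helper name `combine_delimited_files`, the temporary demo files, and the output name `combined.csv` are all hypothetical, and `engine="python"` is added so that delimiter detection is requested explicitly. Writing the result with `to_csv` produces comma-delimited output, which is the conversion the question asks about.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_delimited_files(pattern):
    """Read every file matching pattern, detecting each delimiter, and stack the rows."""
    frames = [
        pd.read_csv(path, sep=None, engine="python")
        for path in sorted(glob.glob(pattern))
    ]
    # A single concat is cheaper than growing a DataFrame inside the loop
    return pd.concat(frames, ignore_index=True)


# Demo with two small made-up files: one comma-delimited, one tab-delimited
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "w") as f:
        f.write("id,value\n1,10\n2,20\n")
    with open(os.path.join(tmp, "b.txt"), "w") as f:
        f.write("id\tvalue\n3\t30\n")

    combined = combine_delimited_files(os.path.join(tmp, "*.txt"))

    # to_csv writes comma-delimited output by default
    out_path = os.path.join(tmp, "combined.csv")
    combined.to_csv(out_path, index=False)
    with open(out_path) as f:
        csv_text = f.read()

print(csv_text)
```

Collecting the frames in a list and concatenating once also sidesteps the empty starting DataFrame used in the question. Note that `pd.concat` raises an error if the pattern matches no files.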