pandas: saving a DataFrame to CSV and loading it back produces an unnamed column

Question

Asked by idoda

The problem is in the title. Example:

import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])
df.to_csv('test.csv')
df1 = pd.read_csv('test.csv')

   Unnamed: 0 col1 col2 col3
0           0    a    a    c
1           1    a    a    c
2           2    a    a    c
3           3    a    a    c
4           4    a    a    c

The reason seems to be that when a DataFrame is saved, the index is also written as a column, with no name in the header. When the CSV is loaded again, that index column comes back as an unnamed column. Is this a bug? How can I avoid writing the index to the CSV, or drop unnamed columns when reading?

Answer 1

Answered by Max

You can remove the row labels via the index and index_label parameters of to_csv.
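A minimal sketch of this approach, assuming pandas is imported as pd. An in-memory io.StringIO buffer stands in for test.csv here so that the example leaves no file behind; passing a file path works the same way.

```python
import io
import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])

# index=False leaves the row labels out of the output entirely.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()

# Reading the text back yields only the three original columns.
df1 = pd.read_csv(io.StringIO(csv_text))
print(df1.columns.tolist())
```

With the index omitted, the header row and every data row have the same number of fields, so no "Unnamed: 0" column appears on read-back.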

Answer 2

Answered by Jeff

Writing and reading are not symmetric, because the CSV format is ambiguous: from position alone there is no way to tell whether the first column is an index or data. You need to specify index_col when reading the file back.

In [1]: import pandas as pd
   ...: x = [('a', 'a', 'c') for i in range(5)]

In [2]: df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])

In [3]: df.to_csv('test.csv')

In [4]: !cat test.csv
,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c

In [5]: pd.read_csv('test.csv',index_col=0)
Out[5]: 
  col1 col2 col3
0    a    a    c
1    a    a    c
2    a    a    c
3    a    a    c
4    a    a    c

If the index is given a name, the file looks very similar to the one above, so is 'foo' a column or an index?

In [6]: df.index.name = 'foo'

In [7]: df.to_csv('test.csv')

In [8]: !cat test.csv
foo,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c
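The ambiguity can be seen by reading the same text back in two ways. This is a small sketch, assuming pandas is imported as pd and using an in-memory buffer in place of test.csv; the variable names as_column and as_index are illustrative only.

```python
import io
import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])
df.index.name = 'foo'

buf = io.StringIO()
df.to_csv(buf)  # header row is: foo,col1,col2,col3
csv_text = buf.getvalue()

# Without index_col, 'foo' comes back as an ordinary data column.
as_column = pd.read_csv(io.StringIO(csv_text))

# With index_col, 'foo' is restored as the index.
as_index = pd.read_csv(io.StringIO(csv_text), index_col='foo')

print(as_column.columns.tolist())
print(as_index.index.name, as_index.columns.tolist())
```

Nothing in the file itself marks 'foo' as an index, which is why the reader has to be told.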

Answer 3

Answered by Денис Волконский

This is how to use the index label parameter: df.to_csv('test.csv', index_label=False). In my case, though, when I tried to submit to Kaggle it returned the error "ERROR: Record 1 had 3 columns but expected 2", so I solved it using this code.
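One plausible reading of that error, offered as an assumption because the answer's final code is not shown: index_label=False drops only the index label from the header row, while the index values are still written in every data row. The sketch below compares field counts for index_label=False and index=False on a hypothetical two-column frame (the names id and label are invented for illustration).

```python
import io
import pandas as pd

# Hypothetical two-column frame, chosen to mirror the counts in the error.
df = pd.DataFrame({'id': [1, 2], 'label': ['a', 'b']})

buf = io.StringIO()
df.to_csv(buf, index_label=False)
with_label_false = buf.getvalue().splitlines()

buf = io.StringIO()
df.to_csv(buf, index=False)
with_index_false = buf.getvalue().splitlines()

# Count fields in the header and in the first data row of each output.
counts_label_false = [len(line.split(',')) for line in with_label_false[:2]]
counts_index_false = [len(line.split(',')) for line in with_index_false[:2]]
print(counts_label_false)
print(counts_index_false)
```

With index_label=False the header has two fields but each record has three, which matches the shape of the reported error. With index=False both have two.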

Answer 4

Answered by piokuc

You can explicitly specify which columns you want to write using the cols parameter (spelled columns in current pandas releases).
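A short sketch of column selection, assuming a current pandas release where the parameter is spelled columns. Selecting columns does not by itself drop the index, so index=False is passed as well; the buffer again stands in for a file path.

```python
import io
import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])

buf = io.StringIO()
# columns= selects which data columns are written; index=False is still
# needed to keep the row labels out of the output.
df.to_csv(buf, columns=['col1', 'col3'], index=False)
header = buf.getvalue().splitlines()[0]
print(header)
```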
