Python pandas: difference between read_csv index_col = None / 0 / False

Question

Asked by markov zain

I used the following read_csv command:

    In [20]:
    dataframe = pd.read_csv('D:/UserInterest/output/ENFP_0719/Bookmark.csv', index_col=None)
    dataframe.head()
    Out[20]:
    Unnamed: 0  timestamp   url visits
    0   0   1.404028e+09    http://m.blog.naver.com/PostView.nhn?blogId=mi...   2
    1   1   1.404028e+09    http://m.facebook.com/l.php?u=http%3A%2F%2Fblo...   1
    2   2   1.404028e+09    market://details?id=com.kakao.story 1
    3   3   1.404028e+09    https://story-api.kakao.com/upgrade/install 4
    4   4   1.403889e+09    http://m.cafe.daum.net/WorldcupLove/Knj/173424...   1

The result shows a column named Unnamed: 0, and it is similar when I use index_col=False. But when I use index_col=0, the result is as follows:

    In [21]:
    dataframe = pd.read_csv('D:/UserInterest/output/ENFP_0719/Bookmark.csv', index_col=0)
    dataframe.head()
    Out[21]:
    timestamp   url visits
    0   1.404028e+09    http://m.blog.naver.com/PostView.nhn?blogId=mi...   2
    1   1.404028e+09    http://m.facebook.com/l.php?u=http%3A%2F%2Fblo...   1
    2   1.404028e+09    market://details?id=com.kakao.story 1
    3   1.404028e+09    https://story-api.kakao.com/upgrade/install 4
    4   1.403889e+09    http://m.cafe.daum.net/WorldcupLove/Knj/173424...   1

This time the result did not show the Unnamed: 0 column. What is the difference between index_col=None, index_col=0, and index_col=False? I have read the documentation, but I still don't get the idea.

Answer 1

Accepted answer by EdChum

UPDATE

I think that since version 0.16.1, passing True for index_col raises an error to avoid this ambiguity.

ORIGINAL

A lot of people get confused by this. To specify the ordinal position of the column to use as the index, pass its int position, in this case 0.

In [3]:

import io
import pandas as pd
t="""index,a,b
0,hello,pandas"""
pd.read_csv(io.StringIO(t))
Out[3]:
   index      a       b
0      0  hello  pandas

The default value is index_col=None, as shown above.

If we set index_col=0 we're explicitly stating to treat the first column as the index:

In [4]:

pd.read_csv(io.StringIO(t), index_col=0)
Out[4]:
           a       b
index               
0      hello  pandas
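
As a small aside, index_col also accepts a column label instead of a position. The sketch below reuses the sample string t from above and checks that passing the label "index" gives the same frame as passing 0. The names by_position and by_label are only illustrative.

```python
import io
import pandas as pd

# Same sample data as in the cells above.
t = """index,a,b
0,hello,pandas"""

# Select the index column by its position ...
by_position = pd.read_csv(io.StringIO(t), index_col=0)
# ... or by its header label.
by_label = pd.read_csv(io.StringIO(t), index_col="index")

# Both calls should produce the same DataFrame.
print(by_position.equals(by_label))
```

Using the label can be more robust than the position if the column order of the file might change.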

If we pass index_col=False we get the same result as with None for this data:

In [5]:

pd.read_csv(io.StringIO(t), index_col=False)
Out[5]:
   index      a       b
0      0  hello  pandas

If we now explicitly pass index_col=None we get the same behaviour as when we didn't pass this param:

In [6]:

pd.read_csv(io.StringIO(t), index_col=None)
Out[6]:
   index      a       b
0      0  hello  pandas
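
None and False give the same result on the sample above, but as far as I know they differ when the data rows contain one more field than the header, for example when every row ends with a trailing delimiter. The sketch below uses made-up data (not the asker's file) to show the difference: with the default, pandas infers that the first field is the index, while index_col=False tells it never to do so.

```python
import io
import pandas as pd

# Hypothetical malformed data: every data row ends with an extra delimiter,
# so each row has one more field than the header.
data = "a,b,c\n4,apple,bat,\n8,orange,cow,"

# Default (index_col=None): pandas infers that the first field is the index,
# which shifts the remaining values one column to the left.
inferred = pd.read_csv(io.StringIO(data))

# index_col=False: never use the first column as the index.
forced = pd.read_csv(io.StringIO(data), index_col=False)

print(inferred)
print(forced)
```

So False is mainly useful as an explicit "do not infer an index" switch, whereas None leaves the decision to pandas.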

There was a bug: if you passed True, it was erroneously treated as index_col=1, because True was converted to 1:

In [6]:

pd.read_csv(io.StringIO(t), index_col=True)
Out[6]:
       index       b
a                   
hello      0  pandas
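
On a recent pandas version the call above should no longer return a frame. A minimal check, assuming the error is raised as a ValueError (which is what I see in current releases), could look like this; the flag name raised is only illustrative.

```python
import io
import pandas as pd

t = """index,a,b
0,hello,pandas"""

try:
    pd.read_csv(io.StringIO(t), index_col=True)
    raised = False
except ValueError as exc:
    # Current pandas rejects index_col=True instead of treating it as 1.
    raised = True
    print(exc)

print(raised)
```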

EDIT

For the case where the index column has a blank header, which is what you have:

In [7]:

import io
import pandas as pd
t=""",a,b
0,hello,pandas"""
pd.read_csv(io.StringIO(t))
Out[7]:
   Unnamed: 0      a       b
0           0  hello  pandas

In [8]:

pd.read_csv(io.StringIO(t), index_col=0)
Out[8]:
       a       b
0  hello  pandas

In [9]:

pd.read_csv(io.StringIO(t), index_col=False)
Out[9]:
   Unnamed: 0      a       b
0           0  hello  pandas

In [10]:

pd.read_csv(io.StringIO(t), index_col=None)
Out[10]:
   Unnamed: 0      a       b
0           0  hello  pandas
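
A blank header cell like this is typically what DataFrame.to_csv produces when it writes the index with its default settings. Assuming the file in the question was created that way (the question does not say), the round trip can be sketched as follows; the variable names are illustrative only.

```python
import io
import pandas as pd

df = pd.DataFrame({"a": ["hello"], "b": ["pandas"]})

# to_csv writes the index by default, under an empty header cell.
csv_text = df.to_csv()

# Reading it back with the defaults turns that column into "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 restores it as the index instead.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Alternatively, skip writing the index in the first place.
no_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(list(raw.columns))
print(list(restored.columns))
print(list(no_index.columns))
```

Either reading with index_col=0 or writing with index=False avoids the extra Unnamed: 0 column; which one fits depends on whether the index carries information worth keeping.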
