Python: disabling the index of a pandas DataFrame

Question

Asked by GeauxEric

How can I drop or disable the indices in a pandas Data Frame?

I am learning pandas from the book "Python for Data Analysis", and I already know I can use DataFrame.drop to drop a single column or row. But I did not find anything about disabling all the indices in place.

Answer 1

Accepted answer by Viktor Kerkez

df.values gives you the raw NumPy ndarray without the indexes.

>>> df
   x   y
0  4  GE
1  1  RE
2  1  AE
3  4  CD
>>> df.values
array([[4, 'GE'],
       [1, 'RE'],
       [1, 'AE'],
       [4, 'CD']], dtype=object)

You cannot have a DataFrame without the indexes; they are the whole point of the DataFrame :)

But just to be clear, df.values is not an in-place operation. Each access returns a new object:

>>> df.values is df.values
False

A DataFrame keeps its data in two-dimensional arrays grouped by dtype, so when you ask for the whole frame as a single array it has to find the lowest common denominator (LCD) of all the dtypes and construct a 2D array of that type.
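
The "lowest common denominator" step can be seen by converting small frames with mixed column types. This is only a sketch: the two frames below are made up for illustration, and to_numpy() is used as the method form of df.values.

```python
import numpy as np
import pandas as pd

# Hypothetical frames, used only to illustrate dtype promotion.
ints_and_floats = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
ints_and_strings = pd.DataFrame({"x": [4, 1], "y": ["GE", "RE"]})

# Inside the DataFrame, each column keeps its own dtype.
print(ints_and_floats.dtypes)

# Converting the whole frame needs a single dtype for every cell.
numeric = ints_and_floats.to_numpy()   # integers + floats -> float64
mixed = ints_and_strings.to_numpy()    # integers + strings -> object

print(numeric.dtype)
print(mixed.dtype)
```

The object dtype in the second case matches the dtype=object shown in the df.values output earlier in this answer.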

To instantiate a new DataFrame with the values from the old one, just pass the old DataFrame to the new one's constructor. No data will be copied; the same data structures will be reused. (This describes pandas without Copy-on-Write; with Copy-on-Write enabled, the assignment to df2 below would not show up in df1.)

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1)
>>> df2.iloc[0,0] = 42
>>> df1
    0  1
0  42  2
1   3  4

But you can explicitly specify the copy parameter:

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1, copy=True)
>>> df2.iloc[0,0] = 42
>>> df1
   0  1
0  1  2
1  3  4
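
If the aim is only to hide the index when displaying the data, rather than to remove it from the object, one option is to render the frame as text without it. A minimal sketch, assuming the same example data as above; DataFrame.to_string accepts index=False for this purpose:

```python
import pandas as pd

df = pd.DataFrame({"x": [4, 1, 1, 4], "y": ["GE", "RE", "AE", "CD"]})

# The index still exists on df; it is only left out of the rendered text.
text = df.to_string(index=False)
print(text)
```

The DataFrame itself is unchanged, so label-based operations on df keep working afterwards.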

Answer 2

Answer by Sudipta Basak

I have a function that may help some people. I combine CSV files that share a header in the following way in Python:

    def combine_csvs(filedict, combined_file):
        files = filedict['files']
        df = pd.read_csv(files[0])
        for file in files[1:]:
            df = pd.concat([df, pd.read_csv(file)])
        df.to_csv(combined_file, index=False)
        return df

It can take as many files as you need. Call it as:

    combine_csvs(dict(files=["file1.csv","file2.csv", "file3.csv"]), 'output.csv')

Or, if you also want the combined DataFrame back in Python:

    df = combine_csvs(dict(files=["file1.csv","file2.csv"]), 'output.csv')

The combine_csvs function does not save the indices to the output file. If you need the indices, use index=True instead.
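
One detail worth noting: pd.concat keeps each file's original row labels by default, so the DataFrame returned by a loop like the one above can contain repeated labels (0, 1, 0, 1, ...). A possible variant, sketched here with a hypothetical helper named combine_frames and in-memory stand-ins for the CSV files, passes ignore_index=True so the combined rows are renumbered:

```python
import io

import pandas as pd

def combine_frames(sources):
    """Read each CSV source and stack the rows under a fresh 0..n-1 index."""
    frames = [pd.read_csv(src) for src in sources]
    return pd.concat(frames, ignore_index=True)

# In-memory stand-ins for CSV files on disk (made-up data).
first = io.StringIO("x,y\n4,GE\n1,RE\n")
second = io.StringIO("x,y\n1,AE\n4,CD\n")

combined = combine_frames([first, second])
print(combined)
```

Reading all the frames first and concatenating once also avoids rebuilding the growing DataFrame on every iteration.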

Answer 3

Answer by naught101

d.index = range(len(d))

does a simple in-place index reset, i.e. it removes all of the existing indices and adds a basic integer one, which is the most basic index type a pandas DataFrame can have.
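
A closely related option is DataFrame.reset_index with drop=True, which returns a new frame with a default integer index rather than modifying the existing one. A small sketch, using a made-up frame d with string labels:

```python
import pandas as pd

d = pd.DataFrame({"x": [4, 1, 1]}, index=["a", "b", "c"])

# drop=True discards the old labels instead of moving them into a column.
renumbered = d.reset_index(drop=True)

print(renumbered.index.tolist())
print(d.index.tolist())
```

Without drop=True, the old labels would be kept as an extra column named "index".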

Answer 4

Answer by Matt

I was having a similar issue trying to take a DataFrame from an index-less CSV and write it back to another file.

我在尝试从无索引的 CSV 中获取 DataFrame 并将其写回另一个文件时遇到了类似的问题。

I came up with the following:

我想出了以下内容：

import os

import pandas as pd

def csv_to_df(csv_filepath):
    # index_col=False stops pandas from using the first column as the index.
    # The input file is tab-separated, hence sep='\t'.
    return pd.read_table(csv_filepath, sep='\t', header=0, index_col=False)

def df_to_excel(df):
    # Output file name, including the .xlsx extension
    file_name = 'foo.xlsx'
    file_path = os.path.join('some/directory/', file_name)
    # index_label + index are set to `False` so that the column labels
    # (called headers by pandas) are all on row index 0 and the data starts
    # on row index 1. The context manager saves and closes the file on exit.
    with pd.ExcelWriter(file_path) as writer:
        df.to_excel(writer, sheet_name='Attributions Detail',
                    index_label=False, index=False, header=True)

Answer 5

Answer by Jason Sprong

Additionally, if you are using the df.to_excel method with a pd.ExcelWriter to write to an Excel worksheet, you can specify index=False in its parameters.

Create the Excel writer:

writer = pd.ExcelWriter(type_box + '-rules_output-' + date_string + '.xlsx',engine='xlsxwriter')

We have a list called lines:

我们有一个名为的列表lines：

# create a DataFrame called 'df'
df = pd.DataFrame([sub.split(",") for sub in lines], columns=["Rule", "Device", "Status"])

# convert df to an Excel worksheet, leaving out the index
df.to_excel(writer, sheet_name='all_status', index=False)
writer.close()
