pandas: accessing columns by position or index

Question

Asked by jezrael

I have a list as follows, and I look up each of its values in a CSV file to get the item code associated with it. E.g., for 0 the item code is 11nm.

L = [0, 2]

CSV file:
0, 11nm
1, 22nm
2, 33nm
3, 44nm

I am currently doing it as follows.

df = pd.read_csv('item_code.csv', sep = ',')
item_codes= df[df["No"].isin(L)]["item_code"].tolist()

However, I now want to know how to do the same thing for a CSV file in which the header row (No, item_code) is missing.

Please help me.

Answer 1

Answered by cs95

When the column names are unavailable, you can refer to them by index using df.iloc:

item_codes = df[df.iloc[:, 0].isin(L)].iloc[:, 1].tolist()

MCVE:

import io

import pandas as pd

text = '''0, 11nm
1, 22nm
2, 33nm
3, 44nm'''

buf = io.StringIO(text)
# raw string keeps \s intact for the regex; header=None: the data has no column names
df = pd.read_csv(buf, sep=r',\s*', header=None, engine='python')
print(df)

   0     1
0  0  11nm
1  1  22nm
2  2  33nm
3  3  44nm

L = [0, 2]
item_codes = df[df.iloc[:, 0].isin(L)].iloc[:, 1]
print(item_codes)

0    11nm
2    33nm
Name: 1, dtype: object

print(item_codes.tolist())
['11nm', '33nm']

Notes:

sep=',\s*' is a regular-expression pattern that matches a comma followed by optional whitespace (the column delimiter)
header=None tells pandas not to use the first row as column names, so the columns get the default integer labels 0, 1, ...
engine='python' selects the Python parsing engine, which supports regex separators
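The notes above can be checked with a small sketch. It also shows read_csv's skipinitialspace=True option as a possible alternative to the regex separator; that option is not part of the original answer and is included here only as an assumption about what may suit this data.

```python
import io

import pandas as pd

text = "0, 11nm\n1, 22nm\n2, 33nm\n3, 44nm"

# Regex separator: requires the Python parsing engine.
df_regex = pd.read_csv(io.StringIO(text), sep=r",\s*", header=None, engine="python")

# Plain comma separator plus skipinitialspace (alternative, not from the answer).
df_skip = pd.read_csv(io.StringIO(text), header=None, skipinitialspace=True)

# With neither option, the second column keeps the space that follows the comma.
df_raw = pd.read_csv(io.StringIO(text), header=None)

print(df_regex.iloc[:, 1].tolist())
print(df_skip.iloc[:, 1].tolist())
print(repr(df_raw.iloc[0, 1]))
```

If the stray space were left in place, comparisons against values such as '11nm' would silently fail, which is why one of the two whitespace-handling options is worth using.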

Answer 2

Answered by jezrael

You can use the names parameter to specify the column names, and loc to select the column:

df = pd.read_csv('item_code.csv', names=['No','item_code'])
print (df)
   No item_code
0   0      11nm
1   1      22nm
2   2      33nm
3   3      44nm


item_codes= df.loc[df["No"].isin(L), "item_code"].tolist()
print (item_codes)
['11nm', '33nm']

Or use the parameter header=None to get the default column names 0, 1, ...:

df = pd.read_csv('item_code.csv', header=None)

print (df)
   0     1
0  0  11nm
1  1  22nm
2  2  33nm
3  3  44nm

#first column selected by position with iloc
item_codes= df.loc[df.iloc[:,0].isin(L), 1].tolist()
print (item_codes)
['11nm', '33nm']

#first column selected by column name
item_codes= df.loc[df[0].isin(L), 1].tolist()
print (item_codes)
['11nm', '33nm']
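For readers without item_code.csv at hand, here is a minimal runnable sketch of the variants in this answer. It reads the rows from an in-memory string standing in for the file, and it passes skipinitialspace=True because the sample rows have a space after each comma; both choices are assumptions made for illustration.

```python
import io

import pandas as pd

# Stand-in for item_code.csv (assumption: same rows as in the question).
csv_text = "0, 11nm\n1, 22nm\n2, 33nm\n3, 44nm"
L = [0, 2]

# Variant 1: supply column names while reading.
named = pd.read_csv(io.StringIO(csv_text), names=["No", "item_code"],
                    skipinitialspace=True)
by_name = named.loc[named["No"].isin(L), "item_code"].tolist()

# Variant 2: keep the default integer column labels 0 and 1.
unnamed = pd.read_csv(io.StringIO(csv_text), header=None, skipinitialspace=True)
by_position = unnamed.loc[unnamed.iloc[:, 0].isin(L), 1].tolist()
by_label = unnamed.loc[unnamed[0].isin(L), 1].tolist()

print(by_name, by_position, by_label)
```

All three selections should return the same list, so the choice between them is mostly about whether meaningful column names are worth adding.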

Answer 3

Answered by Mohamed Ali JAMAOUI

First read the CSV file with header=None, which tells pandas that the file has no header row:

df = pd.read_csv('item_code.csv', sep = ',', header=None)

You can use the column index instead of the column name.

Like this:

df[df[0].isin(L)][1].tolist()

Or this:

df[df.iloc[:,0].isin(L)][1].tolist()

Explanation:

If you print the DataFrame with print(df) after reading it without a header, you will notice that pandas assigns the integers [0, 1] as column names in place of the ["No", "item_code"] header that is missing from the file. Thus, you can reference each column by its index, like this: df[0] or df.iloc[:, 0].

The latter, df.iloc[:, 0], tells pandas to take all rows and only column 0.
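df[0] and df.iloc[:, 0] agree here only because the default labels happen to match the positions. The sketch below uses a hypothetical frame, not taken from the question, whose integer labels are in a different order, to show that df[0] looks up a label while df.iloc[:, 0] looks up a position.

```python
import pandas as pd

# Hypothetical frame whose integer column labels are not in positional order.
df = pd.DataFrame({1: ["11nm", "22nm"], 0: [0, 1]})

first_by_position = df.iloc[:, 0]  # leftmost column, whatever its label is
column_labelled_0 = df[0]          # the column whose label is 0

print(first_by_position.tolist())
print(column_labelled_0.tolist())
```

When a file is read with header=None the two forms are interchangeable, but iloc stays correct if the columns are later renamed or reordered.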
