Pandas: how to read specific rows from a CSV file

Question

Asked by kev

I have a CSV file, example.csv, like this:

    name  |  hits
   ---------------
     A    |  34
     B    |  30
     C    |  25
     D    |  20

Using pandas in Python, how do I read only the rows with hits > 20? I am looking for something like:

my_df = pd.read_csv('example.csv', where col('hits') > 20)

Answer 1

Read the entire CSV, then filter it as shown below:

import pandas as pd

my_df = pd.read_csv("example.csv")
my_df = my_df[my_df['hits'] > 20]

If you run into memory issues while reading, you can set the chunksize parameter to read the file in chunks.
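
A minimal sketch of the chunked approach, filtering each chunk as it is read so that only the matching rows stay in memory. It assumes the file is comma-separated with a header row; the in-memory csv_text stand-in and the read_filtered helper are hypothetical names used only for illustration.

```python
import io
import pandas as pd

# Stand-in for example.csv (assumed to be comma-separated with a header row)
csv_text = "name,hits\nA,34\nB,30\nC,25\nD,20\n"

def read_filtered(source, min_hits, chunksize=2):
    """Read `source` in chunks, keeping only rows with hits > min_hits."""
    kept = [
        chunk[chunk["hits"] > min_hits]
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    # Stitch the filtered chunks back into a single dataframe
    return pd.concat(kept, ignore_index=True)

my_df = read_filtered(io.StringIO(csv_text), 20)
```

With a real file, pass the path (for example "example.csv") instead of the StringIO object and pick a chunksize that fits comfortably in memory.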

Answer 2

Read the entire CSV, then use the query() method to select the rows you need:

required_df = my_df.query("hits > 20")

or,

required_df = my_df.loc[my_df['hits'] > 20]
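
To check that both forms select the same rows, here is a small self-contained sketch. It assumes example.csv is comma-separated and uses an in-memory stand-in for the file; the threshold variable is introduced only for illustration, and is referenced inside query() with the @ prefix.

```python
import io
import pandas as pd

# Stand-in for example.csv (assumed to be comma-separated with a header row)
csv_text = "name,hits\nA,34\nB,30\nC,25\nD,20\n"
my_df = pd.read_csv(io.StringIO(csv_text))

threshold = 20  # hypothetical variable, referenced in query() via '@'

via_query = my_df.query("hits > @threshold")
via_loc = my_df.loc[my_df["hits"] > threshold]
```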

Answer 3

Once you have created a dataframe from any source, you can simply use

dataframe_name['column_name'] <comparison operator> <value>

for example:

dataframe['score'] > 200
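
Note that the comparison on its own evaluates to a boolean Series (a mask), not to the filtered rows; indexing the dataframe with that mask is what selects the matching rows. A minimal sketch, using made-up data with a score column purely for illustration:

```python
import pandas as pd

# Hypothetical data; the 'score' column exists only for this example
dataframe = pd.DataFrame({"name": ["A", "B", "C"], "score": [250, 150, 300]})

mask = dataframe["score"] > 200   # boolean Series, one entry per row
filtered = dataframe[mask]        # keeps only the rows where mask is True
```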