Note: This page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/43986126/

Time: 2020-08-19 23:36:43  Source: igfitidea

How to search for a 'text' or 'number' in a csv file with Python AND if exists print only first and second column values to a new csv file

Tags: python, python-2.7, csv

Asked by rcubefather

I want to do the following using Python.

Step-1: Read the third column of a csv file.
Step-2: Create a list from the values obtained in step 1.
Step-3: Take the value at index [0] and search for it in a csv file. If it is present, write only the values of columns 1 and 2 to a new csv file (there are 6 columns). If it is not present, ignore it and move on to the next search value.

file1.csv:

Country,Location,number,letter,name,pup-name,null
a,ab,1,qw,abcd,test1,3
b,cd,1,df,efgh,test2,4
c,ef,2,er,fgh,test3,5
d,gh,3,sd,sds,test4,
e,ij,5,we,sdrt,test5,
f,kl,6,sc,asdf,test6,
g,mn,7,df,xcxc,test7,
h,op,8,gb,eretet,test8,
i,qr,8,df,hjjh,test9,

Python script written for this:

import csv
import time
from collections import defaultdict

columns = defaultdict(list)

with open('file1.csv') as f:
    reader = csv.reader(f)
    reader.next()
    for row in reader:
        for (i,v) in enumerate(row):
            columns[i].append(v)
#print(columns[2])

b=(columns[2])
for x in b[:]:
    time.sleep(1)
    print x

Output of the above script:

MacBook-Pro:test_usr$ python csv_file.py 
1
1
2
3
5
6
7
8
8
MacBook-Pro:test_usr$

I am able to do steps 1 and 2.

Please guide me on step 3: how do I search for a text/string in a csv file and, if it is present, extract only specific column values to a new csv file?

Output file should look like:

a,ab
b,cd
c,ef
d,gh
e,ij
f,kl
g,mn
h,op
i,qr

Note: The search strings will come from another csv file, so please don't suggest simply printing the values of columns 1 and 2 directly.

The final code looks like this:

import csv
import time
from collections import defaultdict

columns = defaultdict(list)

with open('file1.csv') as f:
    reader = csv.reader(f)
    reader.next()
    for row in reader:
        for (i,v) in enumerate(row):
            columns[i].append(v)
            b=(columns[2])

            for x in b[:]:
                with open('file2.csv') as f, open('file3.csv', 'a') as g:
                    reader = csv.reader(f)
                    #next(reader, None) # discard the header
                    writer = csv.writer(g)
                    for row in reader:
                        if row[2] == x:
                            writer.writerow(row[:2])

file1.csv:

Country,Location,number,letter,name,pup-name,null
a,ab,1,qw,abcd,test1,3
b,cd,1,df,efgh,test2,4
c,ef,2,er,fgh,test3,5
d,gh,3,sd,sds,test4,
e,ij,5,we,sdrt,test5,
f,kl,6,sc,asdf,test6,
g,mn,7,df,xcxc,test7,
h,op,8,gb,eretet,test8,
i,qr,8,df,hjjh,test9,

file2.csv:

count,name,number,Type,status,Config Version,,IP1,port
1,bob,1,TRAFFIC,end,1.2,,1.1.1.1,1
2,john,1,TRAFFIC,end,2.1,,1.1.1.2,2
4,foo,2,TRAFFIC,end,1.1,,1.1.1.3,3
5.333333333,test,3,TRAFFIC,end,3.1,,1.1.1.4,4
6.833333333,raa,5,TRAFFIC,end,5.1,,1.1.1.5,5
8.333333333,kaa,6,TRAFFIC,end,7.1,,1.1.1.6,6
9.833333333,thaa,7,TRAFFIC,end,9.1,,1.1.1.7,7
11.33333333,paa,8,TRAFFIC,end,11.1,,1.1.1.8,8
12.83333333,maa,8,TRAFFIC,end,13.1,,1.1.1.9,9

If I run the above script, file3.csv contains:

1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
1,bob
2,john
.
.
.

It keeps going like this in a loop.

But the output should look like this:

count,name
1,bob,
2,john,
4,foo,
5.333333333,test,
6.833333333,raa,
8.333333333,kaa,
9.833333333,thaa,
11.33333333,paa,
12.83333333,maa,

Accepted answer by juanpa.arrivillaga

I think you should reconsider your approach. You can achieve your goal simply by iterating over the CSV file, without creating intermediate dicts and lists. And since you want to work with specific columns, you'll make your life easier and your code more readable by using DictReader and DictWriter:

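import csv

search_string = "whatever"

# 'rb'/'wb' are the modes the csv module expects on Python 2;
# on Python 3, open both files in text mode with newline='' instead.
with open('file1.csv', 'rb') as f, open('file2.csv', 'wb') as g:
    reader = csv.DictReader(f)
    # names of the first three columns (slicing also works on Python 2)
    c1, c2, c3 = reader.fieldnames[:3]
    writer = csv.DictWriter(g, fieldnames=(c1, c2))
    for row in reader:
        if row[c3] == search_string:
            writer.writerow({c1: row[c1], c2: row[c2]})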

Keep in mind that the csv module will always return strings. You have to handle data-type conversions yourself if you need them (I've left that out above).
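
As a small illustration of that point, here is a Python 3 sketch that reads a hypothetical in-memory sample (standing in for file1.csv) and converts the "number" column explicitly before comparing it with an integer. The sample data and variable names are assumptions made for the example.

```python
import csv
import io

# Hypothetical in-memory sample standing in for file1.csv.
sample = io.StringIO("Country,Location,number\na,ab,1\nc,ef,2\n")

rows = list(csv.DictReader(sample))

# Every field arrives as a string, so comparing it to the int 1 would fail.
raw_value = rows[0]["number"]

# Convert explicitly before comparing against an int.
matches = [row["Country"] for row in rows if int(row["number"]) == 2]
print(repr(raw_value), matches)
```

If a column may contain empty or non-numeric cells, the int() call would raise ValueError, so real data may need a guard around the conversion.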

If you don't want to use DictReader/DictWriter (I suppose it is a little more verbose) and don't want a header in your output file:

with open('file1.csv') as f, open('file2.csv', 'w') as g:
    reader = csv.reader(f)
    next(reader, None) # discard the header
    writer = csv.writer(g)
    for row in reader:
        if row[2] == search_string:
            writer.writerow(row[:2])
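
The question says the search values come from the third column of another csv file. One possible way to handle that, sketched here for Python 3, is to collect those values into a set first and then make a single pass over the file being searched, which avoids reopening and re-appending to the output inside nested loops. The function name extract_matching_rows and the in-memory samples are hypothetical; with real files, open them in text mode with newline=''.

```python
import csv
import io


def extract_matching_rows(search_file, data_file, out_file, column=2):
    """Copy the first two columns of data_file rows whose `column` value
    also appears in the same column of search_file."""
    search_reader = csv.reader(search_file)
    next(search_reader, None)  # discard the header
    # A set drops repeated search values and makes membership tests cheap.
    wanted = {row[column] for row in search_reader if len(row) > column}

    data_reader = csv.reader(data_file)
    writer = csv.writer(out_file)
    header = next(data_reader, None)
    if header is not None:
        writer.writerow(header[:2])  # keep the header of the two columns
    for row in data_reader:
        if len(row) > column and row[column] in wanted:
            writer.writerow(row[:2])


# Small in-memory samples standing in for file1.csv and file2.csv.
file1 = io.StringIO("Country,Location,number\na,ab,1\nb,cd,1\nc,ef,2\n")
file2 = io.StringIO("count,name,number\n1,bob,1\n2,john,1\n4,foo,2\n9,zed,7\n")
file3 = io.StringIO()

extract_matching_rows(file1, file2, file3)
print(file3.getvalue())
```

Each matching row is written once, in the order it appears in the searched file, no matter how many times its value occurs in the search column.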

Answer by Jon Kiparsky

That is how to search for text/string in csv file and if present how to extract only specific column values to a new csv file?

This is two questions.

First question: to search for text in a file, the simplest answer would be to read the file text into memory and look for the text. If you want to look for the text in a specific column of the csv you're reading in, you can use a DictReader to make life easy:

for row in reader:  # reader is a csv.DictReader
    if search_target in row[header]:
        pass  # found it! handle the matching row here
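
A self-contained Python 3 version of that search might look like the sketch below. The helper name find_rows and the sample data are assumptions for illustration. Note that `in` performs a substring test, so searching for "fgh" also matches "efgh"; use `==` if an exact match is wanted.

```python
import csv
import io


def find_rows(csv_file, header, search_target):
    """Return the rows whose `header` column contains search_target."""
    reader = csv.DictReader(csv_file)
    # row.get() returns None for short rows, so fall back to an empty string.
    return [row for row in reader if search_target in (row.get(header) or "")]


# Hypothetical in-memory sample standing in for a real csv file.
sample = io.StringIO("Country,Location,name\na,ab,abcd\nb,cd,efgh\nc,ef,fgh\n")
found = find_rows(sample, "name", "fgh")
print([row["Country"] for row in found])
```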

Second question: One way to write specific columns to a new csv would be as follows:

keys = ["Country", "Location"]
new_rows = [{key: row[key] for key in keys} for row in reader]
writer = csv.DictWriter(somefile, keys)
writer.writerows(new_rows)
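
For completeness, here is a runnable Python 3 sketch of the same idea. It is a variation rather than the exact code above: instead of building new_rows, it passes extrasaction="ignore" so that DictWriter itself skips the columns not listed in keys. The in-memory source and target objects are stand-ins for real files.

```python
import csv
import io

keys = ["Country", "Location"]

# Hypothetical in-memory stand-ins for the input and output files.
source = io.StringIO("Country,Location,number\na,ab,1\nb,cd,1\n")
target = io.StringIO()

reader = csv.DictReader(source)
# extrasaction="ignore" makes DictWriter drop keys not listed in fieldnames.
writer = csv.DictWriter(target, fieldnames=keys, extrasaction="ignore")
writer.writeheader()
writer.writerows(reader)  # streams rows without building an intermediate list
print(target.getvalue())
```

Leave out writer.writeheader() if the output file should not start with a header row.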