Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/16283799/

Date: 2020-08-18 22:12:16  Source: igfitidea

How to read a CSV file from a URL with Python?

Tags: python, csv, curl, output, python-2.x

Asked by mongotop

When I run curl against an API link such as http://example.com/passkey=wedsmdjsjmdd:

curl 'http://example.com/passkey=wedsmdjsjmdd'

I get the employee data back in CSV format, like:

"Steve","421","0","421","2","","","","","","","","","421","0","421","2"

How can I parse this using Python?

I tried:

import csv 
cr = csv.reader(open('http://example.com/passkey=wedsmdjsjmdd',"rb"))
for row in cr:
    print row

But it didn't work, and I got this error:

http://example.com/passkey=wedsmdjsjmdd No such file or directory:

Thanks!

Accepted answer by eandersson

You need to replace open with urllib.urlopen or urllib2.urlopen.

For example:

import csv
import urllib2

url = 'http://winterolympicsmedals.com/medals.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)

for row in cr:
    print row

This reads the following CSV data (each row is printed as a list of fields):

Year,City,Sport,Discipline,NOC,Event,Event gender,Medal
1924,Chamonix,Skating,Figure skating,AUT,individual,M,Silver
1924,Chamonix,Skating,Figure skating,AUT,individual,W,Gold
...
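
The example above targets Python 2. On Python 3, urllib2 no longer exists and the closest equivalent is urllib.request.urlopen, which returns bytes, so the response needs decoding before csv.reader sees it. The following is a minimal sketch, assuming UTF-8 data; rows_from_binary and read_csv_from_url are hypothetical helper names, and the in-memory bytes stand in for a real response so the demo runs offline:

```python
import csv
import io
from urllib.request import urlopen


def rows_from_binary(stream, encoding="utf-8"):
    """Decode a binary file-like object and yield parsed CSV rows."""
    # newline="" leaves line-ending handling to the csv module.
    text = io.TextIOWrapper(stream, encoding=encoding, newline="")
    yield from csv.reader(text)


def read_csv_from_url(url, encoding="utf-8"):
    """Fetch url and return all CSV rows as a list of lists."""
    with urlopen(url) as response:
        return list(rows_from_binary(response, encoding))


# Offline demonstration: in-memory bytes stand in for a response body.
sample = io.BytesIO(b"Year,City\r\n1924,Chamonix\r\n")
demo_rows = list(rows_from_binary(sample))
print(demo_rows)
```

With network access, read_csv_from_url('http://winterolympicsmedals.com/medals.csv') would be the corresponding call, provided the server still hosts that file.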

Answer by Kathirmani Sukumar

With pandas, it is very simple to read a CSV file directly from a URL:

import pandas as pd
data = pd.read_csv('https://example.com/passkey=wedsmdjsjmdd')

This reads your data into a tabular DataFrame, which is easy to process.
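
One detail worth checking: the sample row in the question has no header line, and read_csv treats the first row as column names by default. A small sketch of how header=None avoids that, assuming pandas is installed; the in-memory string stands in for the URL so the example runs offline:

```python
import io

import pandas as pd

# A shortened copy of the question's sample row; it has no header line.
sample = io.StringIO('"Steve","421","0","421","2"\n')

# header=None tells pandas to number the columns instead of
# consuming the first data row as column names.
data = pd.read_csv(sample, header=None)
print(data)
```

The same header=None argument can be passed when the first argument is a URL string.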

Answer by Rodo

You could also do it with the requests module:

import csv
import requests

url = 'http://winterolympicsmedals.com/medals.csv'
r = requests.get(url)
text = r.iter_lines()  # iterate over the response body line by line
reader = csv.reader(text, delimiter=',')
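
On Python 3, r.iter_lines() yields bytes by default, while csv.reader expects strings. The following sketch shows one way to bridge that gap, assuming UTF-8 data; decode_lines is a hypothetical helper name, and the list of byte strings stands in for a real response:

```python
import csv


def decode_lines(lines, encoding="utf-8"):
    """Yield each bytes line as str so csv.reader can parse it."""
    for line in lines:
        yield line.decode(encoding)


# Stand-in for r.iter_lines() on a requests response (assumption for the demo).
raw_lines = [b"Year,City,Sport", b"1924,Chamonix,Skating"]

rows = list(csv.reader(decode_lines(raw_lines), delimiter=','))
print(rows)
```

With a real response, decode_lines(r.iter_lines()) would take the place of decode_lines(raw_lines).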

Answer by The Aelfinn

When downloading a large file, the following may be more efficient:

import requests
from contextlib import closing
import csv

url = "http://download-and-process-csv-efficiently/python.csv"

with closing(requests.get(url, stream=True)) as r:
    reader = csv.reader(r.iter_lines(), delimiter=',', quotechar='"')
    for row in reader:
        # Handle each row here...
        print row   

By setting stream=True in the GET request, r.iter_lines() becomes a generator over the response body as it arrives, and that generator is what we pass to csv.reader(). This lets csv.reader() iterate lazily over each line of the response in the for row in reader loop.

This avoids loading the entire file into memory before we start processing it, drastically reducing memory overhead for large files.
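
The lazy behaviour described above can be observed without any network access. In this sketch, a hypothetical generator line_source stands in for r.iter_lines() and records each line as csv.reader asks for it; the sample lines are shortened from the medals data shown earlier:

```python
import csv

pulled = []  # every line the reader has requested so far


def line_source():
    """Yield CSV lines one at a time, noting each one as it is consumed."""
    sample_lines = [
        "Year,City,Sport",
        "1924,Chamonix,Skating",
        "1924,Chamonix,Skating",
    ]
    for line in sample_lines:
        pulled.append(line)
        yield line


reader = csv.reader(line_source())

first_row = next(reader)           # parses only the first line
pulled_after_first = list(pulled)  # snapshot of what has been consumed
remaining_rows = list(reader)      # consumes the rest on demand

print(first_row, pulled_after_first, len(remaining_rows))
```

After the first next() call only one line has been pulled from the generator, which is why a streamed response never has to sit in memory in full.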

Answer by user2458922

import pandas as pd
url='https://raw.githubusercontent.com/juliencohensolal/BankMarketing/master/rawData/bank-additional-full.csv'
data = pd.read_csv(url, sep=";")  # use sep="," for comma-separated data
data.describe()

Answer by Adeshina Otayo

What you were trying to do with the curl command was download the file to your local hard drive (HD). However, you need to specify a path on the HD:

# In the shell, download the file first:
curl http://example.com/passkey=wedsmdjsjmdd -o ./example.csv

# Then, in Python, read the local copy:
import csv
cr = csv.reader(open('./example.csv',"r"))
for row in cr:
    print row
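
For Python 3, reading the downloaded file might look like the sketch below. read_local_csv is a hypothetical helper name, and a temporary file holding a shortened copy of the question's sample row stands in for ./example.csv so the example runs without curl:

```python
import csv
import os
import tempfile


def read_local_csv(path):
    """Return all rows of the CSV file at path as a list of lists."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="") as f:
        return list(csv.reader(f))


# A temporary file stands in for ./example.csv (assumption for the demo).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w", newline="") as f:
        f.write('"Steve","421","0","421","2"\n')
    local_rows = read_local_csv(path)

print(local_rows)
```

Using a with block closes the file as soon as the rows are read, which the bare open() call above leaves to the interpreter.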