How to read a CSV file from a URL with Python?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/16283799/
Asked by mongotop
When I run curl against an API call link, http://example.com/passkey=wedsmdjsjmdd:
curl 'http://example.com/passkey=wedsmdjsjmdd'
I get the employee output data in CSV format, like:
"Steve","421","0","421","2","","","","","","","","","421","0","421","2"
How can I parse this using Python?
I tried:
import csv
cr = csv.reader(open('http://example.com/passkey=wedsmdjsjmdd',"rb"))
for row in cr:
    print row
But it didn't work, and I got an error:
http://example.com/passkey=wedsmdjsjmdd No such file or directory:
Thanks!
Accepted answer by eandersson
You need to replace open with urllib.urlopen or urllib2.urlopen.
For example:
import csv
import urllib2
url = 'http://winterolympicsmedals.com/medals.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)
for row in cr:
    print row
This would output the following:
Year,City,Sport,Discipline,NOC,Event,Event gender,Medal
1924,Chamonix,Skating,Figure skating,AUT,individual,M,Silver
1924,Chamonix,Skating,Figure skating,AUT,individual,W,Gold
...
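A note for Python 3 readers (a sketch, not part of the original answer): urllib2 no longer exists there, and its urlopen lives in urllib.request. The response yields bytes, while csv.reader expects text, so the lines need decoding first. The helper names below (rows_from_byte_lines, read_csv_from_url) and the UTF-8 default are assumptions made for illustration.

```python
import codecs
import csv
import urllib.request


def rows_from_byte_lines(byte_lines, encoding="utf-8"):
    """Decode an iterable of byte lines and parse them as CSV rows."""
    # codecs.iterdecode decodes each chunk lazily, one line at a time.
    text_lines = codecs.iterdecode(byte_lines, encoding)
    return csv.reader(text_lines)


def read_csv_from_url(url, encoding="utf-8"):
    """Fetch url and return its CSV rows as a list of lists of strings."""
    # The response object is iterable line by line, like a file opened in binary mode.
    with urllib.request.urlopen(url) as response:
        return list(rows_from_byte_lines(response, encoding))
```

Calling read_csv_from_url('http://winterolympicsmedals.com/medals.csv') should then return the rows as lists of strings, assuming the server is reachable and the file is UTF-8 encoded.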
Answer by Kathirmani Sukumar
Using pandas, it is very simple to read a CSV file directly from a URL:
import pandas as pd
data = pd.read_csv('https://example.com/passkey=wedsmdjsjmdd')
This reads your data into a pandas DataFrame, a tabular format that is very easy to process.
Answer by Rodo
You could do it with the requests module as well:
import csv
import requests
url = 'http://winterolympicsmedals.com/medals.csv'
r = requests.get(url)
# iter_lines() yields the response body one line at a time
text = r.iter_lines()
reader = csv.reader(text, delimiter=',')
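A note for Python 3 readers (a sketch, not part of the original answer): there, iter_lines() yields bytes by default, which csv.reader rejects. One simple alternative, assuming the file is small enough to hold in memory, is to let requests decode the body through r.text and parse that. The function names rows_from_text and fetch_rows are hypothetical.

```python
import csv
import io


def rows_from_text(text):
    """Parse an already-decoded CSV document into a list of rows."""
    # newline="" leaves line endings untouched so csv.reader handles them itself.
    return list(csv.reader(io.StringIO(text, newline="")))


def fetch_rows(url):
    """Download url with requests and parse the body as CSV."""
    import requests  # third-party; imported here so rows_from_text works without it

    r = requests.get(url)
    r.raise_for_status()  # raise on HTTP errors instead of parsing an error page
    return rows_from_text(r.text)
```

The parsing step is kept separate from the download so it can be checked against a sample row, such as the one in the question, without any network access.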
Answer by The Aelfinn
To increase performance when downloading a large file, the code below may work a bit more efficiently:
import requests
from contextlib import closing
import csv
url = "http://download-and-process-csv-efficiently/python.csv"
with closing(requests.get(url, stream=True)) as r:
    reader = csv.reader(r.iter_lines(), delimiter=',', quotechar='"')
    for row in reader:
        # Handle each row here...
        print row
By setting stream=True in the GET request, when we pass r.iter_lines() to csv.reader(), we are passing a generator to csv.reader(). By doing so, we enable csv.reader() to lazily iterate over each line in the response with for row in reader.
This avoids loading the entire file into memory before we start processing it, drastically reducing memory overhead for large files.
Answer by user2458922
Answer by Adeshina Otayo
What you were trying to do with the curl command was download the file to your local hard drive (HD). You need to specify an output path on the HD, however, and then open that file from Python:
curl http://example.com/passkey=wedsmdjsjmdd -o ./example.csv
import csv
cr = csv.reader(open('./example.csv',"r"))
for row in cr:
    print row
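For Python 3, reading the downloaded file might look like the sketch below (not part of the original answer). The csv module documentation recommends opening the file with newline='' so the reader handles line endings itself; the function name read_local_csv and the UTF-8 default are assumptions.

```python
import csv


def read_local_csv(path, encoding="utf-8"):
    """Return the rows of a local CSV file as a list of lists of strings."""
    # The with block closes the file even if parsing raises an exception.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))
```

After running the curl command above, read_local_csv('./example.csv') would return the saved rows.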


