Notice: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/24162478/

Date: 2020-10-22 06:19:59  Source: igfitidea

How to download and save a file from the internet using Scala?

Tags: scala, download

Asked by slizorn

Basically I have a url/link to a text file online and I am trying to download it locally. For some reason, the text file that gets created/downloaded is blank. Open to any suggestions. Thanks!

    import java.io.{BufferedOutputStream, FileOutputStream, InputStream, OutputStream}
    import java.net.{HttpURLConnection, URL}

    def downloadFile(token: String, fileToDownload: String) {
      val url = new URL("http://randomwebsite.com/docs?t=" + token + "&p=tsr%2F" + fileToDownload)
      val connection = url.openConnection().asInstanceOf[HttpURLConnection]
      connection.setRequestMethod("GET")
      val in: InputStream = connection.getInputStream
      val fileToDownloadAs = new java.io.File("src/test/resources/testingUpload1.txt")
      val out: OutputStream = new BufferedOutputStream(new FileOutputStream(fileToDownloadAs))
      val byteArray = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
      out.write(byteArray)
    }

Accepted answer by Eric Hydrick

Flush the buffer and then close your output stream.
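
As a minimal sketch of that fix: the helper below copies an input stream to a file and always flushes and closes the output. The name saveStream and the 8192-byte buffer are illustrative assumptions, not part of the question's code; in the question, `in` would be connection.getInputStream.

```scala
import java.io.{BufferedOutputStream, File, FileOutputStream, InputStream}

// Hypothetical helper (name and buffer size are assumptions for illustration).
// Copies everything from `in` to `target`, then flushes and closes the streams.
def saveStream(in: InputStream, target: File): Unit = {
  val out = new BufferedOutputStream(new FileOutputStream(target))
  try {
    val buffer = new Array[Byte](8192)
    var n = in.read(buffer)
    while (n != -1) {
      out.write(buffer, 0, n) // write only the bytes that were actually read
      n = in.read(buffer)
    }
    out.flush()               // push buffered bytes to the file
  } finally {
    out.close()               // without this, buffered data may never reach disk
    in.close()
  }
}
```

Without the flush/close step, the bytes can stay in the BufferedOutputStream's internal buffer, which would explain the empty file.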

Answer by Chetan Bhasin

I know this is an old question, but I just came across a really nice way of doing this:

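import sys.process._
import java.net.URL
import java.io.File

// Streams the URL's content into the file; `.!!` runs the pipeline and blocks until it finishes.
// Dot notation is used because the postfix form `... !!` needs scala.language.postfixOps.
def fileDownloader(url: String, filename: String): String = {
  (new URL(url) #> new File(filename)).!!
}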

Hope this helps. Source.

You can now simply use the fileDownloader function to download files.

fileDownloader("http://ir.dcs.gla.ac.uk/resources/linguistic_utils/stop_words", "stop-words-en.txt")

Answer by Herrington Darkholme

Here is a naive implementation using scala.io.Source.fromURL and java.io.FileWriter:

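def downloadFile(token: String, fileToDownload: String): Unit = {
  try {
    // Reads the whole response into memory as text, so this only suits small text files
    val src = scala.io.Source.fromURL("http://randomwebsite.com/docs?t=" + token + "&p=tsr%2F" + fileToDownload)
    val out = new java.io.FileWriter("src/test/resources/testingUpload1.txt")
    try out.write(src.mkString)
    finally {
      out.close() // closing the writer also flushes it
      src.close()
    }
  } catch {
    case e: java.io.IOException => println("error occurred: " + e.getMessage)
  }
}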

Your code works for me... There may be other causes for the empty file.

Answer by Xavier Guihot

Here is a safer alternative to new URL(url) #> new File(filename) !!:

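import sys.process._
import java.net.{HttpURLConnection, URL}
import java.io.File

val url = new URL(urlOfFileToDownload)

val connection = url.openConnection().asInstanceOf[HttpURLConnection]
connection.setConnectTimeout(5000) // milliseconds
connection.setReadTimeout(5000)
connection.connect()

// Check the status first: a failure inside #> cannot be caught by the caller
if (connection.getResponseCode >= 400)
  println("error")
else
  (url #> new File(fileName)).!!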


Two things:

  • When downloading from a URL object, if an error (404 for instance) is returned, then the URL object will throw a FileNotFoundException. And since this exception is generated from another thread (as URL happens to run on a separate thread), a simple Try or try/catch won't be able to catch the exception. Thus the preliminary check for the response code: if (connection.getResponseCode >= 400).
  • As a consequence of checking the response code, the connection might sometimes remain open indefinitely for improper pages. This can be avoided by setting a timeout on the connection: connection.setReadTimeout(5000).
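
The two points above can be folded into one function. This is only a sketch under stated assumptions: the name download, the timeoutMs parameter and the Either return type are hypothetical. It also swaps the #> operator for reading the connection's own input stream with java.nio.file.Files.copy, so that any failure is raised on the calling thread and can be caught by Try.

```scala
import java.io.IOException
import java.net.{HttpURLConnection, URL}
import java.nio.file.{Files, Path, StandardCopyOption}
import scala.util.Try

// Hypothetical helper: returns Right(bytesCopied) or Left(error description).
def download(url: URL, target: Path, timeoutMs: Int = 5000): Either[String, Long] =
  Try {
    val connection = url.openConnection()
    connection.setConnectTimeout(timeoutMs) // avoid hanging while connecting
    connection.setReadTimeout(timeoutMs)    // avoid hanging on a stalled response
    connection match {
      case http: HttpURLConnection if http.getResponseCode >= 400 =>
        throw new IOException(s"HTTP ${http.getResponseCode} for $url")
      case _ => // success status, or a non-HTTP URL (e.g. file:) with no status code
    }
    val in = connection.getInputStream
    try Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
  }.toEither.left.map(_.toString)
```

A call such as download(new URL(urlOfFileToDownload), java.nio.file.Paths.get(fileName)) then yields a value the caller can pattern match on instead of printing "error".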