Error: XML content does not seem to be XML | R 3.1.0
Source: http://stackoverflow.com/questions/23584514/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Question
I am trying to get this XML file, but am unable to. I checked the other solutions on the same topic, but I couldn't understand them. I am an R newbie.
> library(XML)
> fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
> doc <- xmlTreeParse(fileURL,useInternal=TRUE)
Error: XML content does not seem to be XML: 'https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml'
Can you please help?
Answer by Rich Scriven
Remove the s from https, so that the file is requested over plain http.
library(XML)
fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
## Anchor the pattern so only the leading "https" is rewritten to "http"
doc <- xmlTreeParse(sub("^https", "http", fileURL), useInternalNodes = TRUE)
class(doc)
## [1] "XMLInternalDocument" "XMLAbstractDocument"
Answer by jdharrison
You can use RCurl to fetch the content first; XML then seems to be able to handle it.
library(XML)
library(RCurl)
fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
xData <- getURL(fileURL)
doc <- xmlParse(xData)
Answer by kaarefc
xmlTreeParse does not support https.
You can load the data with getURL (from RCurl) and then parse it.
Answer by Atul Kumar
The answer is at http://www.omegahat.net/RCurl/installed/RCurl/html/getURL.html. The key point is to use ssl.verifyPeer=FALSE with getURL if a certificate error is shown.
library(RCurl)
library(XML)
curlVersion()$features
curlVersion()$protocols
## These should show ssl (under features) and https (under protocols).
## I can see these on Windows 8.1 at least. It may differ on other OSes.
temp <- getURL("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml", ssl.verifyPeer = FALSE)
DFX <- xmlTreeParse(temp, useInternalNodes = TRUE)
If the libcurl functions do not show ssl or https capability, check using RCurl with HTTPS.
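The capability check from this answer could be done programmatically rather than by reading the printed output. The following is a minimal sketch, assuming RCurl is installed and that curlVersion() returns a protocols character vector and a named features vector, as inspected in the code above. The helper name supports_https is hypothetical and is not part of RCurl.

```r
library(RCurl)

# Hypothetical helper (not part of RCurl): reports whether the libcurl
# build that RCurl is linked against advertises both https and ssl.
supports_https <- function(info = curlVersion()) {
  has_protocol <- "https" %in% info$protocols
  has_ssl <- "ssl" %in% names(info$features)
  has_protocol && has_ssl
}

if (supports_https()) {
  message("libcurl reports https and ssl support; getURL() should be able to fetch https URLs.")
} else {
  message("libcurl does not report https/ssl support; consider requesting the http URL instead.")
}
```

Passing a list to the info argument makes it possible to try the helper without depending on the local libcurl build.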

