Original question: http://stackoverflow.com/questions/7639512/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Save the source of an open webpage from Safari with AppleScript
Question
How could I write a script that saves a webpage open in Safari to some path?
(The code will be used in a more complicated script later, so a kludgy solution using System Events won't do.) A lot of googling for a script that uses the save source function left me pretty uninformed, so an answer to this might be the first on the internet. I've pasted some stuff that might be useful below.
Potentially useful stuff
These two entries from the AppleScript dictionary for Safari look useful:
document n [see also Standard Suite] : A Safari document representing the active tab in a window.
properties:
- source (text, r/o) : The HTML source of the web page currently loaded in the document.
- text (text, r/o) : The text of the web page currently loaded in the document. Modifications to text aren't reflected on the web page.
- URL (text) : The current URL of the document.
and later:
save v : Save an object.
save specifier : the object for the command
- [as text] : The file type in which to save the data.
- [in alias] : The file in which to save the object.
A script that almost does what I want
This script does save an HTML document, but the output looks broken compared to files saved using Safari's “Export as Page Source” function manually:
tell application "Safari"
    (* Get a reference to the document *)
    set myDoc to document of front window
    (* Get the source of the page *)
    set mySrc to source of myDoc
    (* Get a file name *)
    set myName to "Message_" & "0001" & ".html" -- the # will be modified later
    tell application "Finder"
        (* Get a path to the front window *)
        set myPath to (target of front window) as string
        (* Get a file path *)
        set filePath to myPath & myName
        (* Create a brand new file *)
        set openRef to open for access (myPath & myName) with write permission
        (* Save the document source *)
        write mySrc to openRef
        (* Close the file *)
        close access openRef
    end tell
end tell
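One possible reason for the broken-looking output is text encoding: unless told otherwise, `write` may not store the text as UTF-8, so non-ASCII characters can come out garbled. The sketch below is an assumption about the cause, not a confirmed diagnosis. It writes the source explicitly as UTF-8 and truncates the file first; the desktop folder and the file name are placeholders chosen for illustration.

```applescript
-- Sketch: write Safari's page source as UTF-8.
-- The target folder and file name are assumptions for illustration.
tell application "Safari"
    set mySrc to source of document 1
end tell

set filePath to (path to desktop folder as text) & "Message_0001.html"
set openRef to open for access file filePath with write permission
try
    set eof of openRef to 0 -- discard any earlier contents of the file
    write mySrc to openRef as «class utf8»
    close access openRef
on error errMsg number errNum
    -- make sure the file is closed before passing the error on
    close access openRef
    error errMsg number errNum
end try
```

The file handling is done outside any `tell application` block, since `open for access`, `write` and `close access` are Standard Additions commands and do not need Finder.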
This is what I've written so far:
Scripts I've written so far
This is my first attempt:
tell application "Safari"
    set pageToSaveSafariWindowIn to "Q:?:"
    set pageToBeSaved to front window
    save document pageToBeSaved as source in alias pageToSaveSafariWindowIn
end tell
Here are the resulting logs:
tell application "Safari"
    get window 1
        --> window id 6017
    save document (window id 6017) as source in alias "Q:?:"
        --> error number -1700 from window id 6017 to integer
and
error "Safari got an error: Can't make window id 6017 into type integer." number -1700 from window id 6017 to integer
And another attempt:
tell application "Safari"
    save source of document in "Q:?:"
end tell
which gives the result log:
error "Can't get source of document." number -1728 from «class conT» of document
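Both errors look like object-specifier problems rather than problems with writing the file. In the first attempt a window reference is used where `document` expects an index, and in the second `document` names the class without saying which document is meant. A minimal sketch of a specifier that should resolve, assuming Safari has at least one window open (the variable names are illustrative):

```applescript
tell application "Safari"
    -- "document of front window" refers to one concrete document object,
    -- unlike a bare "document" or "document <window reference>"
    set frontDoc to document of front window
    set pageSource to source of frontDoc
    set pageName to name of frontDoc
end tell

-- pageSource now holds the HTML as plain text
return {pageName, length of pageSource}
```

This only fetches the source; it does not use Safari's `save` command at all.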
Accepted answer by McUsr
This is a way to save a window full of tabs. The original UI handler was written by StefanK, a.k.a. Stefan Klieme of MacScripter fame. It takes web archive file endings into account when Safari is in doubt, and you can choose whether to overwrite or skip files that have already been written. It doesn't save duplicate tabs, and you can set a property to decide whether it should close each tab once it is saved.
Please check MacScripter for any updates; a direct link is in the script.
You can of course use wget, but I settled for UI scripting, as wget has to download stuff that is already in your browser, and it is a kludge to program as well.
property tlvl : me
# Release 1.0.1
# © 2012 McUsr and put in Public Domain under GPL 1.0
# Please refer to this post: http://macscripter.net/post.php?tid=30892
property shallClose : false # set this to false if you don't want to close the windows, just saving them
property dontOverWriteSavedTabs : false # set this to true if you don't want to overwrite already saved tabs in the folder

script saveTabsInSafariWindowsToFolder
    property parent : AppleScript
    property scripttitle : "SafariSaveTabs"

    on run
        if downloadWindowInFront() then return 0 # activates Safari
        local script_cache
        set script_cache to my storage's scriptCache()
        set saveFolder to POSIX path of (getHFSFolder({theMessage:"Choose or create folder to save Safari-tabs in.", hfsPath:DefaultLocation of script_cache as alias}))
        if saveFolder = false then return 0 -- we were obviously mistaken, about what we wanted to do.
        my storage's saveParenFolderInScriptCache(saveFolder, script_cache)
        tell application "Safari"
            tell its window 1
                local tabc, oldidx
                set tabc to count tabs of it
                if not tlvl's shallClose then
                    set oldidx to index of current tab
                    tell tab tabc to do JavaScript "self.focus()"
                end if
                local saveCounter
                set saveCounter to 1 -- regulates setting of save folder to only first time in Safari.
                repeat while tabc > 0
                    local theUrl, theIdx, theProtocol, alreadyClosed
                    set {theUrl, theIdx, alreadyClosed} to {URL of its current tab, index of its current tab, false}
                    if my isntAduplicateTab(theIdx, it) then
                        set theProtocol to my urlprotocol(theUrl)
                        if theProtocol is in {"http", "https"} then
                            # save it
                            set saveCounter to my saveCurrentTab(saveFolder, saveCounter)
                        else if theProtocol is "file" then
                            # make an alias of it
                            my makeAliasForAFurl(saveFolder, theUrl)
                        end if
                    else
                        if tlvl's shallClose then
                            close current tab
                            set alreadyClosed to true
                        end if
                    end if
                    if not alreadyClosed and tlvl's shallClose then
                        close current tab of it
                        set tabc to tabc - 1
                    else if not tlvl's shallClose then
                        set tabc to tabc - 1
                        if tabc > 0 then tell tab tabc to do JavaScript "self.focus()"
                    end if
                end repeat
                # move forwards
                if not tlvl's shallClose then
                    tell tab oldidx to do JavaScript "self.focus()"
                end if
            end tell
        end tell
    end run

    to makeAliasForAFurl(destinationFolder, furl)
        local ti, tids, thefilePath
        set ti to "file://"
        set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, ti}
        set thefilePath to text item 2 of furl
        set AppleScript's text item delimiters to tids
        set theFile to POSIX file thefilePath as alias
        set theFolder to POSIX file destinationFolder
        tell application "Finder"
            make alias at theFolder to theFile
            # I don't care if there was one there from before, as it could equally
            # be a file with the same name.
        end tell
    end makeAliasForAFurl

    to saveCurrentTab(destinationFolder, timeNumber)
        tell application id "sfri" to activate
        tell application "System Events"
            set UI elements enabled to true
            tell process "Safari"
                keystroke "s" using {command down}
                tell window 1
                    repeat until exists sheet 1
                        delay 0.2
                    end repeat
                    tell sheet 1
                        if timeNumber = 1 then -- We'll set the savepath upon first call
                            keystroke "g" using {command down, shift down}
                            repeat until exists sheet 1
                                delay 0.2
                            end repeat
                            tell sheet 1
                                set value of text field 1 to destinationFolder
                                click button 1
                                delay 0.1
                            end tell
                        end if
                        keystroke return
                        delay 0.2
                        if exists sheet 1 then -- We are being asked if we want to overwrite already saved tab
                            if dontOverWriteSavedTabs then
                                keystroke return # if it was already saved. We don't overwrite it
                                click button 3
                            else
                                keystroke tab
                                keystroke space # we are to overwrite
                            end if
                        else
                            try
                                set dummy to focused of sheet 1
                            on error
                                # click button 1 of panel of application "Safari"
                                keystroke return
                                delay 0.2
                                if exists sheet 1 then -- We are being asked if we want to overwrite already saved tab
                                    if dontOverWriteSavedTabs then
                                        keystroke return # if it was already saved. We don't overwrite it
                                        click button 3
                                    else
                                        keystroke tab
                                        keystroke space # we are to overwrite
                                    end if
                                end if
                            end try
                        end if
                    end tell
                end tell
            end tell
        end tell
        set timeNumber to timeNumber + 1
        return timeNumber
    end saveCurrentTab

    on downloadWindowInFront()
        tell application "Safari"
            activate
            set tabCount to count tabs of its window 1
            if tabCount […] 0 then set colons to true
        if (offset of "/" in aPath) > 0 then set slashes to true
        if colons and slashes then
            return null
        else if colons then
            set origDelims to ":"
        else if slashes then
            set origDelims to "/"
        else
            return null
        end if
        local tids
        set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, origDelims}
        if aPath = "/" then
            -- we return root when we get root
            set AppleScript's text item delimiters to tids
            return "/"
        end if
        local theParentFolder
        if text -1 of aPath is in {":", "/"} then
            set theParentFolder to text items 1 thru -2 of text 1 thru -2 of aPath
        else
            set theParentFolder to text items 1 thru -2 of aPath
        end if
        set theParentFolder to theParentFolder as text
        if slashes and theParentFolder = "" then set theParentFolder to "/"
        -- sets the root path if we got a folder one level below it
        if colons and (":" is not in theParentFolder) then set theParentFolder to theParentFolder & ":"
        -- we return volumename, if we are given volumename
        set AppleScript's text item delimiters to tids
        return theParentFolder
    end parentfolder

    script storage
        property cachespath : ((path to library folder from user domain as text) & "caches:" & "net.mcusr." & scripttitle)

        on scriptCache()
            local script_cache
            try
                set script_cache to load script alias (my cachespath)
            on error
                script newScriptCache
                    property DefaultLocation : (path to desktop folder as text)
                    # edit any of those with default values
                end script
                set script_cache to newScriptCache
            end try
            return script_cache
        end scriptCache

        to saveScriptCache(theCache)
            store script theCache in my cachespath replacing yes
        end saveScriptCache

        to saveParenFolderInScriptCache(theFolderToSaveIn, script_cache)
            local containingFolder
            set containingFolder to (parentfolder of saveTabsInSafariWindowsToFolder for theFolderToSaveIn) & "/"
            local theLoc
            set theLoc to POSIX file containingFolder as alias
            set DefaultLocation of script_cache to theLoc
            my saveScriptCache(script_cache)
        end saveParenFolderInScriptCache
    end script
end script

tell saveTabsInSafariWindowsToFolder to run
Enjoy
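If only the HTML source of each tab is needed (not a web archive), a much shorter variant without UI scripting may be enough. This is a sketch and not part of McUsr's script: it assumes every tab in the front window is loaded and returns its source as text, and the desktop folder and the `tab_N.html` naming are placeholders.

```applescript
-- Sketch: save the source of every tab in the front Safari window.
-- Folder and file-name pattern are assumptions for illustration.
set saveFolder to (path to desktop folder as text)

tell application "Safari"
    set tabSources to source of every tab of window 1
end tell

set n to 0
repeat with oneSource in tabSources
    set n to n + 1
    set filePath to saveFolder & "tab_" & n & ".html"
    set openRef to open for access file filePath with write permission
    try
        set eof of openRef to 0 -- overwrite instead of appending
        write (contents of oneSource) to openRef as «class utf8»
        close access openRef
    on error errMsg number errNum
        close access openRef
        error errMsg number errNum
    end try
end repeat
```

Unlike the script above, this neither skips duplicate tabs nor closes them, and it does not handle `file://` tabs specially.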
Answer by DonCristobal
I have found what I believe to be a better / easier solution:
tell application "Safari"
    activate
    set URL of document 1 to "http://www.apple.com"
    delay 5
    set myString to source of document 1
end tell
set newFile to POSIX file "/Users/myUsername/test.html"
open for access newFile with write permission
write myString to newFile
close access newFile
Notes:
- "source of document 1" seems to be filled with the correct source text only AFTER the web page is fully loaded. Thus the need for the delay. Maybe you can use a shorter delay.
- There are some solutions which recommend the use of curl. I haven't tried this, but I assume that for dynamically generated pages this could be problematic.
- The above works on OS X 10.8.4. Not tested on other versions.
Answer by double_j
set hyperlink to "http://www.google.com/"
set sourceCode to (do shell script "curl " & quoted form of hyperlink)
do shell script "echo " & quoted form of sourceCode & " >> /Users/name/Desktop/test.csv"
You can put this in a repeat loop, and it will append the source code of every listed site to the end of the document you created, e.g.:
set hyperlink to "http://www.aRepetitivePageSite.com/2014?page="
set your_count to 1
repeat until your_count = 10
    set sourceCode to (do shell script "curl " & quoted form of (hyperlink & your_count))
    do shell script "echo " & quoted form of sourceCode & " >> /Users/name/Desktop/test.csv"
    set your_count to your_count + 1
end repeat
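Passing the page through an AppleScript variable and `echo` can alter it: by default `do shell script` converts line endings in its result to carriage returns, and `echo` may interpret backslashes. A possible alternative, sketched here with a placeholder output path, is to let curl write the file itself with its `-o` option (`-s` silences the progress meter, `-L` follows redirects):

```applescript
set hyperlink to "http://www.google.com/"
-- the output location is an assumption for illustration
set outPath to (POSIX path of (path to desktop folder)) & "test.html"
do shell script "curl -sL " & quoted form of hyperlink & " -o " & quoted form of outPath
```

This overwrites the file on each run rather than appending, so a loop over several pages would need a different file name per page.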
Answer by gadgetmo
Automator will do that. Here is the workflow - http://cl.ly/450m0Q21463p16322P1i.
Automator -> Actions -> Internet -> Get Current Webpage from Safari
-> Download URLs
Answer by Simon White
If you were to do this task manually, you would View Source in Safari, Copy the source to the clipboard, go into an HTML source code editor and make a new document, Paste the source code in, choose Save and navigate to the Documents folder, name the document, and then Save it.
So when you want to write an AppleScript to do this task, a key thing is that you still want to use those same apps, but instead of running them manually, you will run them with AppleScript. A great AppleScriptable HTML source code editor is TextWrangler, which is free from the Mac App Store.
Once you have both a Web browser (Safari) to get the HTML source from the network and an HTML source code editor (TextWrangler) to create and Save the HTML document, you can write a very small, very easy to write, very easy to read, very easy to maintain AppleScript like this one:
tell application "Safari"
    activate
    if document 1 exists then
        set theDocumentTitle to the name of document 1
        set theDocumentSource to the source of document 1
        tell application "TextWrangler"
            activate
            set theNewDocument to make new document with properties {name:theDocumentTitle, text:theDocumentSource}
            set theDocumentsFolderPath to the path to the documents folder as text
            set theSaveFilePath to theDocumentsFolderPath & theDocumentTitle & ".html"
            save theNewDocument to file theSaveFilePath
        end tell
    end if
end tell
… which will simply ask Safari to provide the name and source code of its frontmost document, and then ask TextWrangler to use that information to create and Save a matching HTML document in your Documents folder. Those are tasks that those 2 apps are each very good at. You sort of don't have to ask twice or do a lot of explaining.
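One thing to watch when the page title becomes the file name: titles often contain ":" or "/", which are path separators and can make the save fail or land in the wrong place. A small sketch of a clean-up step follows; `safeFileName` is a hypothetical helper introduced here, not part of the script above, and replacing the characters with "-" is an arbitrary choice.

```applescript
-- Hypothetical helper: replace path separators in a page title
on safeFileName(theTitle)
    set oldDelims to AppleScript's text item delimiters
    -- split the title at every ":" or "/"
    set AppleScript's text item delimiters to {":", "/"}
    set theParts to text items of theTitle
    -- join the pieces again with "-"
    set AppleScript's text item delimiters to "-"
    set cleanName to theParts as text
    set AppleScript's text item delimiters to oldDelims
    return cleanName
end safeFileName

safeFileName("Apple: Mac/iPad") --> "Apple- Mac-iPad"
```

In the script above, the result of such a helper could be used in place of `theDocumentTitle` when building `theSaveFilePath`.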