vba 带有标记的 HTML 文本,用于 Excel 单元格中的格式化文本
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/9999713/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
HTML Text with tags to formatted text in an Excel cell
提问by Kevin McGovern
Is there a way to take HTML and import it to excel so that it is formatted as rich text (preferably by using VBA)? Basically, when I paste to an Excel cell, I'm looking to turn this:
有没有办法将 HTML 导入到 Excel 中,以便将其格式化为富文本(最好使用 VBA)?基本上,当我粘贴到 Excel 单元格时,我希望将其转换为:
<html><p>This is a test. Will this text be <b>bold</b> or <i>italic</i></p></html>
into this:
进入这个:
This is a test. Will this text be boldor italic
这是一个测试。这段文字是粗体还是斜体
采纳答案by Siddharth Rout
Yes it is possible :) In fact let Internet Explorer do the dirty work for you ;)
是的,这是可能的 :) 事实上,让 Internet Explorer 为你做那些肮脏的工作 ;)
TRIED AND TESTED
久经考验
MY ASSUMPTIONS
我的假设
- I am assuming that the html text is in Cell A1 of Sheet1. You can also use a variable instead.
- If you have a column full of html values, then simply put the below code in a loop
- 我假设 html 文本位于 Sheet1 的单元格 A1 中。您也可以改用变量。
- 如果您有一列充满 html 值,那么只需将以下代码放入循环中
CODE
代码
Sub Sample()
Dim Ie As Object
Set Ie = CreateObject("InternetExplorer.Application")
With Ie
.Visible = False
.Navigate "about:blank"
.document.body.InnerHTML = Sheets("Sheet1").Range("A1").Value
.document.body.createtextrange.execCommand "Copy"
ActiveSheet.Paste Destination:=Sheets("Sheet1").Range("A1")
.Quit
End With
End Sub
SNAPSHOT
快照
HTH
HTH
Sid
锡德
回答by Dick Kusleika
You can copy the HTML code to the clipboard and paste special it back as Unicode text. Excel will render the HTML in the cell. Check out this post http://www.dailydoseofexcel.com/archives/2005/02/23/html-in-cells-ii/
您可以将 HTML 代码复制到剪贴板,然后将其作为 Unicode 文本粘贴回去。Excel 将在单元格中呈现 HTML。查看这篇文章http://www.dailydoseofexcel.com/archives/2005/02/23/html-in-cells-ii/
The relevant macro code from the post:
帖子中的相关宏代码:
Private Sub Worksheet_Change(ByVal Target As Range)
Dim objData As DataObject
Dim sHTML As String
Dim sSelAdd As String
Application.EnableEvents = False
If Target.Cells.Count = 1 Then
If LCase(Left(Target.Text, 6)) = "<html>" Then
Set objData = New DataObject
sHTML = Target.Text
objData.SetText sHTML
objData.PutInClipboard
sSelAdd = Selection.Address
Target.Select
Me.PasteSpecial "Unicode Text"
Me.Range(sSelAdd).Select
End If
End If
Application.EnableEvents = True
End Sub
回答by hehret
I ran into the same error that BornToCode first identified in the comments of the original solution. Being unfamiliar with Excel and VBA it took me a second to figure out how to implement tiQU's solution. So I'm posting it as a "For Dummies" solution below
我遇到了与 BornToCode 最初在原始解决方案的注释中发现的相同的错误。由于不熟悉 Excel 和 VBA,我花了一秒钟才弄清楚如何实施 tiQU 的解决方案。所以我将它作为“傻瓜”解决方案发布在下面
- First enable developer mode in Excel: Link
- Select the Developer Tab > Visual Basic
- Click View > Code
- Paste the code below updating the lines that require cell references to be correct.
- Click the Green Run Arrow or press F5
- 首先在 Excel 中启用开发人员模式:链接
- 选择开发人员选项卡 > Visual Basic
- 单击查看 > 代码
- 粘贴下面的代码更新需要单元格引用正确的行。
- 单击绿色运行箭头或按 F5
Sub Sample()
Dim Ie As Object
Set Ie = CreateObject("InternetExplorer.Application")
With Ie
.Visible = False
.Navigate "about:blank"
.document.body.InnerHTML = Sheets("Sheet1").Range("I2").Value
'update to the cell that contains HTML you want converted
.ExecWB 17, 0
'Select all contents in browser
.ExecWB 12, 2
'Copy them
ActiveSheet.Paste Destination:=Sheets("Sheet1").Range("J2")
'update to cell you want converted HTML pasted in
.Quit
End With
End Sub
回答by ozmike
If the IE example doesn't work use this one. Anyway this should be faster than starting up an instance of IE.
如果 IE 示例不起作用,请使用此示例。无论如何,这应该比启动 IE 实例更快。
Here is a complete solution based on
http://www.dailydoseofexcel.com/archives/2005/02/23/html-in-cells-ii/
这里是一个完整的解决方案基于
http://www.dailydoseofexcel.com/archives/2005/02/23/html-in-cells-ii/
Note, if your innerHTML is all numbers eg '12345', HTML formatting dosen't fully work in excel as it treats number differently? but add a character eg a trailing space at the end eg. 12345 + "& nbsp;" formats ok.
请注意,如果您的 innerHTML 都是数字,例如“12345”,HTML 格式在 excel 中不能完全工作,因为它对数字的处理方式不同?但在末尾添加一个字符,例如尾随空格,例如。12345+" 格式正常。
Sub test()
Cells(1, 1).Value = "<HTML>1<font color=blue>a</font>" & _
"23<font color=red>4</font></HTML>"
Dim rng As Range
Set rng = ActiveSheet.Cells(1, 1)
Worksheet_Change rng, ActiveSheet
End Sub
Private Sub Worksheet_Change(ByVal Target As Range, ByVal sht As Worksheet)
Dim objData As DataObject ' Set a reference to MS Forms 2.0
Dim sHTML As String
Dim sSelAdd As String
Application.EnableEvents = False
If Target.Cells.Count = 1 Then
Set objData = New DataObject
sHTML = Target.Text
objData.SetText sHTML
objData.PutInClipboard
Target.Select
sht.PasteSpecial Format:="Unicode Text"
End If
Application.EnableEvents = True
End Sub
回答by tiQu
I know this thread is ancient, but after assigning the innerHTML, ExecWB worked for me:
我知道这个线程很古老,但是在分配了 innerHTML 之后,ExecWB 对我来说有效:
.ExecWB 17, 0
'Select all contents in browser
.ExecWB 12, 2
'Copy them
And then just paste the contents into Excel. Since these methods are prone to runtime errors, but work fine after one or two tries in debug mode, you might have to tell Excel to try again if it runs into an error. I solved this by adding this error handler to the sub, and it works fine:
然后只需将内容粘贴到 Excel 中即可。由于这些方法容易出现运行时错误,但在调试模式下尝试一两次后工作正常,如果遇到错误,您可能必须告诉 Excel 再试一次。我通过将这个错误处理程序添加到 sub 来解决这个问题,它工作正常:
Sub ApplyHTML()
On Error GoTo ErrorHandler
...
Exit Sub
ErrorHandler:
Resume
'I.e. re-run the line of code that caused the error
Exit Sub
End Sub
回答by E. Ledding
Nice! Very slick.
好的!很滑。
I was disappointed that Excel doesn't let us paste to a merged cell and also pastes results containing a break into successive rows below the "target" cell though, as that meant it simply doesn't work for me. I tried a few tweaks (unmerge/remerge, etc.) but then Excel dropped anything below a break, so that was a dead end.
我很失望 Excel 不允许我们粘贴到合并的单元格,并且还将包含中断的结果粘贴到“目标”单元格下方的连续行中,因为这意味着它根本不适用于我。我尝试了一些调整(取消合并/重新合并等),但随后 Excel 将任何内容都放在了休息时间以下,所以这是一个死胡同。
Ultimately, I came up with a routine that'll handle simple tags and not use the "native" Unicode converter that is causing the issue with merged fields. Hope others find this useful:
最终,我想出了一个处理简单标签的例程,而不使用导致合并字段问题的“本机”Unicode 转换器。希望其他人觉得这很有用:
Public Sub AddHTMLFormattedText(rngA As Range, strHTML As String, Optional blnShowBadHTMLWarning As Boolean = False)
' Adds converts text formatted with basic HTML tags to formatted text in an Excel cell
' NOTE: Font Sizes not handled perfectly per HTML standard, but I find this method more useful!
Dim strActualText As String, intSrcPos As Integer, intDestPos As Integer, intDestSrcEquiv() As Integer
Dim varyTags As Variant, varTag As Variant, varEndTag As Variant, blnTagMatch As Boolean
Dim intCtr As Integer
Dim intStartPos As Integer, intEndPos As Integer, intActualStartPos As Integer, intActualEndPos As Integer
Dim intFontSizeStartPos As Integer, intFontSizeEndPos As Integer, intFontSize As Integer
varyTags = Array("<b>", "</b>", "<i>", "</i>", "<u>", "</u>", "<sub>", "</sub>", "<sup>", "</sup>")
' Remove unhandled/unneeded tags, convert <br> and <p> tags to line feeds
strHTML = Trim(strHTML)
strHTML = Replace(strHTML, "<html>", "")
strHTML = Replace(strHTML, "</html>", "")
strHTML = Replace(strHTML, "<p>", "")
While LCase(Right$(strHTML, 4)) = "</p>" Or LCase(Right$(strHTML, 4)) = "<br>"
strHTML = Left$(strHTML, Len(strHTML) - 4)
strHTML = Trim(strHTML)
Wend
strHTML = Replace(strHTML, "<br>", vbLf)
strHTML = Replace(strHTML, "</p>", vbLf)
strHTML = Trim(strHTML)
ReDim intDestSrcEquiv(1 To Len(strHTML))
strActualText = ""
intSrcPos = 1
intDestPos = 1
Do While intSrcPos <= Len(strHTML)
blnTagMatch = False
For Each varTag In varyTags
If LCase(Mid$(strHTML, intSrcPos, Len(varTag))) = varTag Then
blnTagMatch = True
intSrcPos = intSrcPos + Len(varTag)
If intSrcPos > Len(strHTML) Then Exit Do
Exit For
End If
Next
If blnTagMatch = False Then
varTag = "<font size"
If LCase(Mid$(strHTML, intSrcPos, Len(varTag))) = varTag Then
blnTagMatch = True
intEndPos = InStr(intSrcPos, strHTML, ">")
intSrcPos = intEndPos + 1
If intSrcPos > Len(strHTML) Then Exit Do
Else
varTag = "</font>"
If LCase(Mid$(strHTML, intSrcPos, Len(varTag))) = varTag Then
blnTagMatch = True
intSrcPos = intSrcPos + Len(varTag)
If intSrcPos > Len(strHTML) Then Exit Do
End If
End If
End If
If blnTagMatch = False Then
strActualText = strActualText & Mid$(strHTML, intSrcPos, 1)
intDestSrcEquiv(intSrcPos) = intDestPos
intDestPos = intDestPos + 1
intSrcPos = intSrcPos + 1
End If
Loop
' Clear any bold/underline/italic/superscript/subscript formatting from cell
rngA.Font.Bold = False
rngA.Font.Underline = False
rngA.Font.Italic = False
rngA.Font.Subscript = False
rngA.Font.Superscript = False
rngA.Value = strActualText
' Now start applying Formats!"
' Start with Font Size first
intSrcPos = 1
intDestPos = 1
Do While intSrcPos <= Len(strHTML)
varTag = "<font size"
If LCase(Mid$(strHTML, intSrcPos, Len(varTag))) = varTag Then
intFontSizeStartPos = InStr(intSrcPos, strHTML, """") + 1
intFontSizeEndPos = InStr(intFontSizeStartPos, strHTML, """") - 1
If intFontSizeEndPos - intFontSizeStartPos <= 3 And intFontSizeEndPos - intFontSizeStartPos > 0 Then
Debug.Print Mid$(strHTML, intFontSizeStartPos, intFontSizeEndPos - intFontSizeStartPos + 1)
If Mid$(strHTML, intFontSizeStartPos, 1) = "+" Then
intFontSizeStartPos = intFontSizeStartPos + 1
intFontSize = 11 + 2 * Mid$(strHTML, intFontSizeStartPos, intFontSizeEndPos - intFontSizeStartPos + 1)
ElseIf Mid$(strHTML, intFontSizeStartPos, 1) = "-" Then
intFontSizeStartPos = intFontSizeStartPos + 1
intFontSize = 11 - 2 * Mid$(strHTML, intFontSizeStartPos, intFontSizeEndPos - intFontSizeStartPos + 1)
Else
intFontSize = Mid$(strHTML, intFontSizeStartPos, intFontSizeEndPos - intFontSizeStartPos + 1)
End If
Else
' Error!
GoTo HTML_Err
End If
intEndPos = InStr(intSrcPos, strHTML, ">")
intSrcPos = intEndPos + 1
intStartPos = intSrcPos
If intSrcPos > Len(strHTML) Then Exit Do
While intDestSrcEquiv(intStartPos) = 0 And intStartPos < Len(strHTML)
intStartPos = intStartPos + 1
Wend
If intStartPos >= Len(strHTML) Then GoTo HTML_Err ' HTML is bad!
varEndTag = "</font>"
intEndPos = InStr(intSrcPos, LCase(strHTML), varEndTag)
If intEndPos = 0 Then GoTo HTML_Err ' HTML is bad!
While intDestSrcEquiv(intEndPos) = 0 And intEndPos > intSrcPos
intEndPos = intEndPos - 1
Wend
If intEndPos > intSrcPos Then
intActualStartPos = intDestSrcEquiv(intStartPos)
intActualEndPos = intDestSrcEquiv(intEndPos)
rngA.Characters(intActualStartPos, intActualEndPos - intActualStartPos + 1) _
.Font.Size = intFontSize
End If
End If
intSrcPos = intSrcPos + 1
Loop
'Now do remaining tags
intSrcPos = 1
intDestPos = 1
Do While intSrcPos <= Len(strHTML)
If intDestSrcEquiv(intSrcPos) = 0 Then
' This must be a Tag!
For intCtr = 0 To UBound(varyTags) Step 2
varTag = varyTags(intCtr)
intStartPos = intSrcPos + Len(varTag)
While intDestSrcEquiv(intStartPos) = 0 And intStartPos < Len(strHTML)
intStartPos = intStartPos + 1
Wend
If intStartPos >= Len(strHTML) Then GoTo HTML_Err ' HTML is bad!
If LCase(Mid$(strHTML, intSrcPos, Len(varTag))) = varTag Then
varEndTag = varyTags(intCtr + 1)
intEndPos = InStr(intSrcPos, LCase(strHTML), varEndTag)
If intEndPos = 0 Then GoTo HTML_Err ' HTML is bad!
While intDestSrcEquiv(intEndPos) = 0 And intEndPos > intSrcPos
intEndPos = intEndPos - 1
Wend
If intEndPos > intSrcPos Then
intActualStartPos = intDestSrcEquiv(intStartPos)
intActualEndPos = intDestSrcEquiv(intEndPos)
With rngA.Characters(intActualStartPos, intActualEndPos - intActualStartPos + 1).Font
If varTag = "<b>" Then
.Bold = True
ElseIf varTag = "<i>" Then
.Italic = True
ElseIf varTag = "<u>" Then
.Underline = True
ElseIf varTag = "<sup>" Then
.Superscript = True
ElseIf varTag = "<sub>" Then
.Subscript = True
End If
End With
End If
intSrcPos = intSrcPos + Len(varTag) - 1
Exit For
End If
Next
End If
intSrcPos = intSrcPos + 1
intDestPos = intDestPos + 1
Loop
Exit_Sub:
Exit Sub
HTML_Err:
' There was an error with the Tags. Show warning if requested.
If blnShowBadHTMLWarning Then
MsgBox "There was an error with the Tags in the HTML file. Could not apply formatting."
End If
End Sub
Note this doesn't care about tag nesting, instead only requiring a close tag for every open tag, and assuming the close tag nearest the opening tag applies to the opening tag. Properly nested tags will work fine, while improperly nested tags will not be rejected and may or may not work.
请注意,这并不关心标签嵌套,而是只需要为每个打开标签提供一个结束标签,并假设最接近开始标签的结束标签适用于开始标签。正确嵌套的标签可以正常工作,而不正确嵌套的标签不会被拒绝,并且可能会也可能不会起作用。
回答by John Kurtz
To put HTML/Word in an Excel Shape and locate it on an Excel Cell:
将 HTML/Word 放入 Excel 形状并将其定位在 Excel 单元格中:
- Write my HTML to a temp file.
- Open temp file via Word Interop.
- Copy it from Word to clipboard.
- Open Excel via Interop.
- Set and Select a cell to a range.
- PasteSpecial as a "Microsoft Word Document Object"
- Adjust the excel row to the Shape height.
- 将我的 HTML 写入临时文件。
- 通过 Word Interop 打开临时文件。
- 将其从 Word 复制到剪贴板。
- 通过 Interop 打开 Excel。
- 设置并选择一个单元格到一个范围。
- PasteSpecial 作为“Microsoft Word 文档对象”
- 将 excel 行调整为形状高度。
In this way, even HTML with tables and other stuff does not get split over multiple cells.
这样,即使是带有表格和其他内容的 HTML 也不会拆分到多个单元格中。
private void btnPutHTMLIntoExcelShape_Click(object sender, EventArgs e)
{
var fFile = new FileInfo(@"C:\Temp\temp.html");
StreamWriter SW = fFile.CreateText();
SW.Write(hecNote.DocumentHtml);
SW.Close();
Word.Application wrdApplication;
Word.Document wrdDocument;
wrdApplication = new Word.Application();
wrdApplication.Visible = true;
wrdDocument = wrdApplication.Documents.Add(@"C:\Temp\temp.html");
wrdDocument.ActiveWindow.Selection.WholeStory();
wrdDocument.ActiveWindow.Selection.Copy();
Excel.Application excApplication;
Excel.Workbook excWorkbook;
Excel._Worksheet excWorksheet;
Excel.Range excRange = null;
excApplication = new Excel.Application();
excApplication.Visible = true;
excWorkbook = excApplication.Workbooks.Add(Type.Missing);
excWorksheet = (Excel.Worksheet)excWorkbook.Worksheets.get_Item(1);
excWorksheet.Name = "Work";
excRange = excWorksheet.get_Range("A1");
excRange.Select();
excWorksheet.PasteSpecial("Microsoft Word Document Object");
Excel.Shape O = excWorksheet.Shapes.Item(1);
this.Text = $"{O.Height} x {O.Width}";
((Excel.Range)excWorksheet.Rows[1, Type.Missing]).RowHeight = O.Height;
}