Converting a text file to an HTML file using Python

Question

Asked by user3832061

I have a text file that contains:

JavaScript              0
/AA                     0
OpenAction              1
AcroForm                0
JBIG2Decode             0
RichMedia               0
Launch                  0
Colors>2^24             0
uri                     0

I wrote this code to convert the text file to HTML:

contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    for lines in contents.readlines():
        e.write(lines + "<br>\n")

The problem is that in the resulting HTML file, each line shows only a single space between the two columns:

JavaScript 0
/AA 0
OpenAction 1
AcroForm 0
JBIG2Decode 0
RichMedia 0
Launch 0
Colors>2^24 0
uri 0

What should I do to keep the same content, with the two columns aligned as they are in the text file?

Answer 1

Accepted answer by u3311882

Just change your code to include <pre> and </pre> tags, so that the text keeps the formatting it has in the original text file.

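# Raw string, so the backslashes in the Windows path are not read as escapes
contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    for lines in contents.readlines():
        # <pre> preserves the run of spaces between the two columns
        e.write("<pre>" + lines + "</pre> <br>\n")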
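A variation on the same idea, as a minimal sketch: wrap the whole text in one <pre> element instead of one per line, and escape HTML-special characters first. The function name text_to_pre_html and the sample string are illustrative assumptions, not part of the original answer.

```python
import html


def text_to_pre_html(text):
    """Wrap raw text in a single <pre> element (hypothetical helper)."""
    # html.escape converts &, < and > (and quotes) to entities, so a line
    # such as "Colors>2^24" cannot be misread as markup.
    return "<pre>\n" + html.escape(text) + "</pre>\n"


# Sample input taken from the question's text file
sample = "JavaScript              0\nColors>2^24             0\n"
page = text_to_pre_html(sample)
print(page)
```

To use it on a file, read the input with open(...).read() and write the returned string to the output file.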

Answer 2

Answer by neil

That is because HTML rendering collapses runs of whitespace into a single space. There are at least two ways to handle it (and probably many more).

One would be to flag it as "preformatted text" by putting it in <pre>...</pre> tags.

The other would be a table (and this is what a table is made for):

<table>
  <tr><td>Javascript</td><td>0</td></tr>
  ...
</table>

This is fairly tedious to type out by hand, but easy to generate from your script. Something like this should work:

contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    e.write("<table>\n")
    for lines in contents.readlines():
        # % formatting needs a tuple, and split() returns a list
        e.write("<tr><td>%s</td><td>%s</td></tr>\n" % tuple(lines.split()))
    e.write("</table>\n")
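The table approach can also be sketched as a small function that skips blank lines and escapes cell text. This is only an illustration: the name rows_to_table is hypothetical, and it assumes every non-blank line holds a label followed by a value, so a line with a single field would still raise a ValueError.

```python
import html


def rows_to_table(lines):
    """Build an HTML table from 'label <whitespace> value' lines (hypothetical helper)."""
    out = ["<table>"]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Split once from the right, so a label containing spaces stays in one cell
        label, value = line.strip().rsplit(None, 1)
        out.append("  <tr><td>%s</td><td>%s</td></tr>"
                   % (html.escape(label), html.escape(value)))
    out.append("</table>")
    return "\n".join(out) + "\n"


# Sample lines taken from the question's text file, plus one blank line
table = rows_to_table(["JavaScript              0\n",
                       "\n",
                       "Colors>2^24             0\n"])
print(table)
```

Since a file object iterates over its lines, rows_to_table(open(path)) would work on the input file directly.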

Answer 3

Answer by Burhan Khalid

You can use a standalone template library like mako or jinja. Here is an example with jinja:

from jinja2 import Template
c = '''<!doctype html>
<html>
<head>
    <title>My Title</title>
</head>
<body>
<table>
   <thead>
       <tr><th>Col 1</th><th>Col 2</th></tr>
   </thead>
   <tbody>
       {% for col1, col2 in lines %}
       <tr><td>{{ col1 }}</td><td>{{ col2 }}</td></tr>
       {% endfor %}
   </tbody>
</table>
</body>
</html>'''

t = Template(c)

lines = []

with open('yourfile.txt', 'r') as f:
    for line in f:
        lines.append(line.split())

with open('results.html', 'w') as f:
    f.write(t.render(lines=lines))

If you can't install jinja, then here is an alternative:

header = '<!doctype html><html><head><title>My Title</title></head><body>'
body = '<table><thead><tr><th>Col 1</th><th>Col 2</th></tr></thead><tbody>'
footer = '</tbody></table></body></html>'

# infile/outfile avoid shadowing the built-in input()
with open('input.txt', 'r') as infile, open('output.html', 'w') as outfile:
    # File objects have no writeln(); use write() with an explicit newline
    outfile.write(header + '\n')
    outfile.write(body + '\n')
    for line in infile:
        col1, col2 = line.rstrip().split()
        outfile.write('<tr><td>{}</td><td>{}</td></tr>\n'.format(col1, col2))
    outfile.write(footer)

Answer 4

Answer by Adam Smith

This is HTML, so use BeautifulSoup:

from bs4 import BeautifulSoup

soup = BeautifulSoup('', 'html.parser')  # name the parser explicitly to avoid a parser-guessing warning
body = soup.new_tag('body')
soup.insert(0, body)
table = soup.new_tag('table')
body.insert(0, table)

with open('path/to/input/file.txt') as infile:
    for line in infile:
        row = soup.new_tag('tr')
        col1, col2 = line.split()
        for coltext in (col2, col1): # important that you reverse order
            col = soup.new_tag('td')
            col.string = coltext
            row.insert(0, col)
        table.insert(len(table.contents), row)

with open('path/to/output/file.html', 'w') as outfile:
    outfile.write(soup.prettify())
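If BeautifulSoup is not installed, the same tree-building idea can be sketched with the standard library's xml.etree.ElementTree. This swaps in a different library from the one the answer uses; the function name build_table and the rule of skipping lines that do not have exactly two fields are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_table(lines):
    """Return '<body><table>...</table></body>' markup (hypothetical helper)."""
    body = ET.Element("body")
    table = ET.SubElement(body, "table")
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: ignore lines that are not two columns
        row = ET.SubElement(table, "tr")
        for text in fields:
            # SubElement appends in order, so no reversed insertion is needed
            ET.SubElement(row, "td").text = text
    # method="html" serializes as HTML and escapes &, < and > in cell text
    return ET.tostring(body, encoding="unicode", method="html")


# Sample lines taken from the question's text file
markup = build_table(["JavaScript 0\n", "Colors>2^24 0\n"])
print(markup)
```

The result is a single unindented string; writing it to the output file works the same way as writing soup.prettify().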

Answer 5

Answer by muthukumar

I added a title, then loop over the file line by line and wrap each line in <tr> and <td> tags, so the result is a table with a single column. Separate <td> cells for col1 and col2 are not needed.

Log snippet:

MUTHU PAGE
2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118 INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48 INFO | ***** User SIGNUP page start *****
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49 INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

HTML source of the output page:

'''

<?xml version="1.0" encoding="utf-8"?>
<body>
 <table>
  <p>
   MUTHU PAGE
  </p>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118     INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
   </td>
  </tr>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48     INFO | ***** User SIGNUP page start *****
   </td>
  </tr>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49     INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

'''

Code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(features='xml')
body = soup.new_tag('body')
soup.insert(0, body)
table = soup.new_tag('table')
body.insert(0, table)

with open(r'C:\Users\xxxxx\Documents\Latest_24_may_2019\New_27_jun_2019\DB\log\input.txt') as infile:  # raw string: backslashes are not escapes
    title_s = soup.new_tag('p')
    title_s.string = " MUTHU PAGE "
    table.insert(0, title_s)
    for line in infile:
        row = soup.new_tag('tr')
        col1 = list(line.split('\n'))
        col1 = [ each for each in col1 if each != '']
        for coltext in col1:
            col = soup.new_tag('td')
            col.string = coltext
            row.insert(0, col)
        table.insert(len(table.contents), row)

with open(r'C:\Users\xxxx\Documents\Latest_24_may_2019\New_27_jun_2019\DB\log\output.html', 'w') as outfile:  # raw string: backslashes are not escapes
    outfile.write(soup.prettify())
