Converting a text file to an HTML file with Python

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/24715027/

Date: 2020-08-19 05:03:02  Source: igfitidea

Tags: python, html, text

Asked by user3832061

I have a text file that contains:

JavaScript              0
/AA                     0
OpenAction              1
AcroForm                0
JBIG2Decode             0
RichMedia               0
Launch                  0
Colors>2^24             0
uri                     0

I wrote this code to convert the text file to HTML:

contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    for lines in contents.readlines():
        e.write(lines + "<br>\n")

The problem is that in the HTML file, the spacing between the two columns is lost on every line:

JavaScript 0
/AA 0
OpenAction 1
AcroForm 0
JBIG2Decode 0
RichMedia 0
Launch 0
Colors>2^24 0
uri 0 

What should I do to keep the same content, with the two columns laid out as in the text file?

Accepted answer by u3311882

Just change your code to include <pre> and </pre> tags, so that your text keeps the formatting it has in the original text file.

contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    for lines in contents.readlines():
        e.write("<pre>" + lines + "</pre> <br>\n")
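
A variation on the same idea, offered as a sketch rather than the answerer's own code: wrap the whole file in a single <pre> block instead of one block per line, and escape characters such as > (which appears in Colors>2^24) with html.escape. The function name text_to_pre and the sample string are made up for illustration.

```python
import html

def text_to_pre(text):
    """Wrap text in one <pre> block, escaping HTML special characters."""
    return "<pre>\n" + html.escape(text) + "</pre>\n"

# Hypothetical sample data standing in for the contents of the input file
sample = "JavaScript              0\nColors>2^24             0\n"
result = text_to_pre(sample)
print(result)
```

With a single <pre> block the trailing <br> tags are no longer needed, because line breaks inside <pre> are preserved as well.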

Answer by neil

That is because HTML rendering collapses each run of whitespace into a single space. There are two ways you could fix it (well, probably many more).
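
To see why the columns run together, the collapsing can be approximated in Python. This is only a rough model of what a browser does with normal (non-preformatted) text, and the helper name collapse_whitespace is hypothetical:

```python
import re

def collapse_whitespace(text):
    """Roughly mimic HTML rendering: any run of whitespace becomes one space."""
    return re.sub(r"\s+", " ", text).strip()

line = "JavaScript              0"
collapsed = collapse_whitespace(line)
print(collapsed)  # JavaScript 0
```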

One would be to flag it as "preformatted text" by putting it in <pre>...</pre> tags.

The other would be a table (and this is what a table is made for):

<table>
  <tr><td>Javascript</td><td>0</td></tr>
  ...
</table>

Fairly tedious to type out by hand, but easy to generate from your script. Something like this should work:

contents = open(r"C:\Users\Suleiman JK\Desktop\Static_hash\test", "r")
with open("suleiman.html", "w") as e:
    e.write("<table>\n")
    for lines in contents.readlines():
        # split() returns a list, but the % operator needs a tuple
        e.write("<tr><td>%s</td><td>%s</td></tr>\n" % tuple(lines.split()))
    e.write("</table>\n")
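
A slightly more defensive sketch of the same table approach, under two assumptions that are not in the original answer: lines that do not split into exactly two fields are skipped, and cell text is escaped with html.escape so that a value like Colors>2^24 stays valid HTML. The function name lines_to_table is hypothetical.

```python
import html

def lines_to_table(lines):
    """Build an HTML table from lines of the form 'name value'."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: skip blank or malformed lines
        name, value = (html.escape(p) for p in parts)
        rows.append("<tr><td>%s</td><td>%s</td></tr>" % (name, value))
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"

# Hypothetical sample lines, including a blank one that gets skipped
sample = ["JavaScript              0\n", "\n", "Colors>2^24             0\n"]
table_html = lines_to_table(sample)
print(table_html)
```

Passing an open file object instead of the sample list should work the same way, since iterating over a file yields its lines.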

Answer by Burhan Khalid

You can use a standalone template library like mako or jinja. Here is an example with jinja:

from jinja2 import Template
c = '''<!doctype html>
<html>
<head>
    <title>My Title</title>
</head>
<body>
<table>
   <thead>
       <tr><th>Col 1</th><th>Col 2</th></tr>
   </thead>
   <tbody>
       {% for col1, col2 in lines %}
       <tr><td>{{ col1 }}</td><td>{{ col2 }}</td></tr>
       {% endfor %}
   </tbody>
</table>
</body>
</html>'''

t = Template(c)

lines = []

with open('yourfile.txt', 'r') as f:
    for line in f:
        lines.append(line.split())

with open('results.html', 'w') as f:
    f.write(t.render(lines=lines))

If you can't install jinja, then here is an alternative:

header = '<!doctype html><html><head><title>My Title</title></head><body>'
body = '<table><thead><tr><th>Col 1</th><th>Col 2</th></tr></thead><tbody>'
footer = '</tbody></table></body></html>'

with open('input.txt', 'r') as infile, open('output.html', 'w') as outfile:
    # File objects have no writeln(); add the newline explicitly
    outfile.write(header + '\n')
    outfile.write(body + '\n')
    for line in infile:
        col1, col2 = line.rstrip().split()
        outfile.write('<tr><td>{}</td><td>{}</td></tr>\n'.format(col1, col2))
    outfile.write(footer)

Answer by Adam Smith

This is HTML -- use BeautifulSoup

from bs4 import BeautifulSoup

# Start from an empty document; naming the parser explicitly avoids a bs4 warning
soup = BeautifulSoup('', 'html.parser')
body = soup.new_tag('body')
soup.append(body)
table = soup.new_tag('table')
body.append(table)

with open('path/to/input/file.txt') as infile:
    for line in infile:
        row = soup.new_tag('tr')
        col1, col2 = line.split()
        for coltext in (col1, col2):  # append() keeps the cells in order
            col = soup.new_tag('td')
            col.string = coltext
            row.append(col)
        table.append(row)

with open('path/to/output/file.html', 'w') as outfile:
    outfile.write(soup.prettify())
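
If BeautifulSoup is not available, the same build-a-tree-then-serialize idea can be sketched with the standard library's xml.etree.ElementTree. This swaps in a different library from the one the answer uses; the function name build_table and the sample lines are illustrative only, and skipping lines without exactly two fields is an added assumption.

```python
import xml.etree.ElementTree as ET

def build_table(lines):
    """Build a <table> element with one two-cell row per input line."""
    table = ET.Element("table")
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: ignore lines that are not 'name value'
        row = ET.SubElement(table, "tr")
        for text in parts:
            ET.SubElement(row, "td").text = text
    return table

# Hypothetical sample lines standing in for the input file
sample = ["OpenAction              1\n", "Colors>2^24             0\n"]
markup = ET.tostring(build_table(sample), encoding="unicode", method="html")
print(markup)
```

Because the cell text is set on element objects rather than pasted into a string, the serializer takes care of escaping the > character.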

Answer by muthukumar

I have added a title, and the code loops over the input line by line, wrapping each line in <tr> and <td> tags, so it should work as a single-column table. There is no need for separate <td></td> cells for col1 and col2.

Log snippet:

MUTHU PAGE

2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118 INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48 INFO | ***** User SIGNUP page start *****
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49 INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

HTML source page:

<?xml version="1.0" encoding="utf-8"?>
<body>
 <table>
  <p>
   MUTHU PAGE
  </p>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118     INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
   </td>
  </tr>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48     INFO | ***** User SIGNUP page start *****
   </td>
  </tr>
  <tr>
   <td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49     INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

Code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(features='xml')
body = soup.new_tag('body')
soup.insert(0, body)
table = soup.new_tag('table')
body.insert(0, table)

with open(r'C:\Users\xxxxx\Documents\Latest_24_may_2019\New_27_jun_2019\DB\log\input.txt') as infile:
    # Title paragraph placed at the top of the table
    title_s = soup.new_tag('p')
    title_s.string = " MUTHU PAGE "
    table.insert(0, title_s)
    for line in infile:
        row = soup.new_tag('tr')
        # Drop the trailing newline; a blank line yields no cells
        col1 = [each for each in line.split('\n') if each != '']
        for coltext in col1:
            col = soup.new_tag('td')
            col.string = coltext
            row.insert(0, col)
        table.insert(len(table.contents), row)

with open(r'C:\Users\xxxx\Documents\Latest_24_may_2019\New_27_jun_2019\DB\log\output.html', 'w') as outfile:
    outfile.write(soup.prettify())