Making a Python script object-oriented

Question

Asked by greenie

I'm writing an application in Python that is going to have a lot of different functions, so logically I thought it would be best to split up my script into different modules. Currently my script reads in a text file that contains code which has been converted into tokens and spellings. The script then reconstructs the code into a string, with blank lines where comments would have been in the original code.

I'm having a problem making the script object-oriented though. Whatever I try I can't seem to get the program running the same way it would as if it was just a single script file. Ideally I'd like to have two script files, one that contains a class and function that cleans and reconstructs the file. The second script would simply call the function from the class in the other file on a file given as an argument from the command line. This is my current script:

import sys

tokenList = open(sys.argv[1], 'r')
cleanedInput = ''
prevLine = 0

for line in tokenList:

    if line.startswith('LINE:'):
        lineNo = int(line.split(':', 1)[1].strip())
        diff = lineNo - prevLine - 1

        if diff == 0:
            cleanedInput += '\n'
        if diff == 1:
            cleanedInput += '\n\n'
        else:
            cleanedInput += '\n' * diff

        prevLine = lineNo
        continue

    cleanedLine = line.split(':', 1)[1].strip()
    cleanedInput += cleanedLine + ' '

print cleanedInput

After following Alex Martelli's advice below, I now have the following code, which gives me the same output as my original code.

import sys

def main():
    tokenList = open(sys.argv[1], 'r')
    cleanedInput = []
    prevLine = 0

    for line in tokenList:

        if line.startswith('LINE:'):
            lineNo = int(line.split(':', 1)[1].strip())
            diff = lineNo - prevLine - 1

            if diff == 0:
                cleanedInput.append('\n')
            if diff == 1:
                cleanedInput.append('\n\n')
            else:
                cleanedInput.append('\n' * diff)

            prevLine = lineNo
            continue

        cleanedLine = line.split(':', 1)[1].strip()
        cleanedInput.append(cleanedLine + ' ')

    # Join the pieces so the output is a string, as in the original script.
    print ''.join(cleanedInput)

if __name__ == '__main__':
    main()

I would still like to split my code into multiple modules though. A 'cleaned file' in my program will have other functions performed on it so naturally a cleaned file should be a class in its own right?

Answer 1

Answered by Alex Martelli

To speed up your existing code measurably, add def main(): before the assignment to tokenList, indent everything after that by 4 spaces, and at the end put the usual idiom:

if __name__ == '__main__':
  main()

(The guard is not actually necessary, but it's a good habit to have nevertheless since, for scripts with reusable functions, it makes them importable from other modules).

This has little to do with "object oriented" anything: it's simply faster, in Python, to keep all your substantial code in functions, not as top-level module code.
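A rough way to see this effect for yourself is sketched below. It is an illustration written in Python 3 syntax (the thread's code is Python 2), and the loop body, the loop size, and the names TOP_LEVEL_CODE, run_top_level and run_in_function are all made up for the demonstration. The same loop runs once as module-style code executed against a namespace dictionary and once inside a function, where names are fast local variables. Actual timings depend on your interpreter and machine.

```python
import timeit

# The loop as it would appear at module top level (hypothetical example code).
TOP_LEVEL_CODE = compile(
    "total = 0\n"
    "for i in range(100000):\n"
    "    total += i\n",
    "<top-level>",
    "exec",
)


def run_top_level():
    # Module-style execution: every name lives in a dictionary.
    namespace = {}
    exec(TOP_LEVEL_CODE, namespace)
    return namespace["total"]


def run_in_function():
    # Same loop, but total and i are local variables of the function.
    total = 0
    for i in range(100000):
        total += i
    return total


if __name__ == "__main__":
    print("top-level:", timeit.timeit(run_top_level, number=10))
    print("function: ", timeit.timeit(run_in_function, number=10))
```

Both variants compute the same value; only the time taken is expected to differ.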

Second speedup: change cleanedInput into a list, i.e., its first assignment should be = [], and wherever you now have +=, use .append instead. At the end, use ''.join(cleanedInput) to get the final resulting string. This makes your code take linear time as a function of input size (O(N) is the normal way of expressing this), while it currently takes quadratic time (O(N squared)).
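As a minimal sketch of the two approaches (Python 3 syntax; the function names build_with_concat and build_with_join are made up for illustration): the first function repeatedly extends a string, the second collects pieces in a list and joins once. Note that CPython can sometimes extend a string in place, so the quadratic cost of += is a worst case rather than a guarantee, but the join version does not depend on that optimization.

```python
def build_with_concat(pieces):
    # Each += may copy everything accumulated so far.
    result = ''
    for piece in pieces:
        result += piece
    return result


def build_with_join(pieces):
    # Appending to a list is cheap; the string is built once at the end.
    parts = []
    for piece in pieces:
        parts.append(piece)
    return ''.join(parts)


if __name__ == '__main__':
    sample = ['\n', 'x ', '= ', '1 ']
    print(repr(build_with_join(sample)))
```

Both functions return the same string for the same input.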

Then, correctness: the two statements right after continue never execute. Do you need them or not? Remove them (and the continue) if they are not needed; remove only the continue if those two statements are actually needed. And the tests starting with if diff will fail dramatically unless the previous if was executed, because diff would be undefined then. Does your code as posted perhaps have indentation errors, i.e., is the indentation of what you posted different from that of your actual code?

Considering these important needed enhancements, and the fact that it's hard to see what advantage you are pursuing in making this tiny code OO (and/or modular), I suggest clarifying the indenting / correctness situation, applying the enhancements I've proposed, and leaving it at that;-).

Edit: as the OP has now applied most of my suggestions, let me follow up with one reasonable way to hive off most functionality to a class in a separate module. In a new file, for example foobar.py, in the same directory as the original script (or in site-packages, or elsewhere on sys.path), place this code:

def token_of(line):
    # Return the text after the first ':' with surrounding whitespace removed.
    return line.partition(':')[-1].strip()

class FileParser(object):
    def __init__(self, filename):
        self.tokenList = open(filename, 'r')

    def cleaned_input(self):
        cleanedInput = []
        prevLine = 0

        for line in self.tokenList:
            if line.startswith('LINE:'):
                lineNo = int(token_of(line))
                diff = lineNo - prevLine - 1
                # Same newline counts as the original if/else chain.
                cleanedInput.append('\n' * (diff if diff > 1 else diff + 1))
                prevLine = lineNo
            else:
                cleanedLine = token_of(line)
                cleanedInput.append(cleanedLine + ' ')

        # Join once at the end to get the final string.
        return ''.join(cleanedInput)

Your main script then becomes just:

import sys
import foobar

def main():
    thefile = foobar.FileParser(sys.argv[1])
    print thefile.cleaned_input()

if __name__ == '__main__':
  main()
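For readers on Python 3, the same idea might look like the sketch below. It is an adaptation, not the code from this answer: it stores the filename rather than an open file object, uses a with-block so the file is closed, and returns a joined string. The sample input in demo() is made up, since the thread does not show the real token file; only the 'LINE:' prefix and the "label: spelling" shape are taken from the code above.

```python
import os
import tempfile


def token_of(line):
    # Text after the first ':' with surrounding whitespace stripped.
    return line.partition(':')[-1].strip()


class FileParser:
    def __init__(self, filename):
        self.filename = filename

    def cleaned_input(self):
        pieces = []
        prev_line = 0
        # The with-block closes the file even if parsing raises.
        with open(self.filename) as token_list:
            for line in token_list:
                if line.startswith('LINE:'):
                    line_no = int(token_of(line))
                    diff = line_no - prev_line - 1
                    pieces.append('\n' * (diff if diff > 1 else diff + 1))
                    prev_line = line_no
                else:
                    pieces.append(token_of(line) + ' ')
        return ''.join(pieces)


def demo():
    # Made-up sample input; the real token file format is an assumption.
    sample = "LINE: 1\nID: x\nOP: =\nNUM: 1\nLINE: 3\nID: y\n"
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write(sample)
    try:
        return FileParser(tmp.name).cleaned_input()
    finally:
        os.remove(tmp.name)


if __name__ == '__main__':
    print(repr(demo()))
```

Keeping the parsing in a method that returns a string leaves the calling script free to print it, write it to a file, or hand it to further processing.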

Answer 2

Answered by CBFraser

When I do this particular refactoring, I usually start with an initial transformation within the first file. Step 1: move the functionality into a method in a new class. Step 2: add the magic invocation below to get the file to run like a script again:

import sys

class LineCleaner:

    # Instance methods need self as their first parameter.
    def cleanFile(self, filename):
        cleanInput = ""
        prevLine = 0
        for line in open(filename, 'r'):
            <... as in original script ..>

if __name__ == '__main__':
    cleaner = LineCleaner()
    cleaner.cleanFile(sys.argv[1])

Answer 3

Answered by Richard Levasseur

You can get away with creating a function and putting all your logic in it. For full "object orientedness" though, you can do something like this:

P.S. Your posted code has a bug on the continue line: it is always executed, so the last 2 lines will never execute.

import sys

class Cleaner:
  def __init__(...):
    ...init logic...
  def Clean(self):
    for line in open(self.tokenList):
      ...cleaning logic...
    return cleanedInput

def main(argv):
  cleaner = Cleaner(argv[1])
  print cleaner.Clean()
  return 0

if '__main__' == __name__:
  sys.exit(main(sys.argv))

Answer 4

Answered by przemo_li

If the code you posted is all the code there is, just don't add any class!

Your code is too simple for that! An OOP approach would add unnecessary complexity.

But if you still want to, put all the code into a function, e.g.:

def parse_tokenized_input(file):
    tokenList = open(file, 'r')
    cleanedInput = ''
    prevLine = 0
    #rest of code

At the end, add:

if __name__ == '__main__':
    parse_tokenized_input(sys.argv[1])

If the code works correctly, move the function definition (and all needed imports!) to a new file, e.g. mymodule.py

Your script will now be:

import sys
# Import by module name, without the .py extension.
from mymodule import parse_tokenized_input

if __name__ == '__main__':
    parse_tokenized_input(sys.argv[1])

Oh, and think up a better name for your function and module (a module should have a general name).
