How to call a module written with argparse in an iPython notebook

Question

Asked by Niels

I am trying to pass BioPython sequences to Ilya Stepanov's implementation of Ukkonen's suffix tree algorithm in iPython's notebook environment. I am stumbling on the argparse component.

I have never had to deal directly with argparse before. How can I use this without rewriting main()?

By the by, this writeup of Ukkonen's algorithm is fantastic.

Answer 1

Accepted answer by BioGeek

I've had a similar problem before, but using optparse instead of argparse.

You don't need to change anything in the original script; just assign a new list to sys.argv, like so:

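import sys
from Bio import SeqIO

if __name__ == "__main__":
    path = '/path/to/sequences.txt'
    # One plain string per FASTA record.
    sequences = [str(record.seq) for record in SeqIO.parse(path, 'fasta')]
    # argparse skips sys.argv[0] (the program name), so '-f' here is only a placeholder.
    sys.argv = ['-f'] + sequences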
    main()
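
The reason this works is that parse_args(), when called without arguments, reads sys.argv[1:], so the first element of the list only stands in for the program name. The sketch below illustrates this with a hypothetical main() and parser (stand-ins, not Ilya Stepanov's script) and restores the original sys.argv afterwards so the notebook is left untouched.

```python
import argparse
import sys


def main():
    # Hypothetical stand-in for a script's main(); not the original script.
    parser = argparse.ArgumentParser()
    parser.add_argument('strings', nargs='*')
    parser.add_argument('-f', '--file')
    # Called without arguments, parse_args() reads sys.argv[1:].
    return parser.parse_args()


saved_argv = sys.argv
try:
    # Element 0 stands in for the program name and is skipped by argparse.
    sys.argv = ['prog', 'GATTACA', 'TTACAG']
    args = main()
finally:
    # Put the notebook's own argv back.
    sys.argv = saved_argv

print(args.strings, args.file)
```

Restoring sys.argv in a finally block is optional, but it avoids surprising other cells that read it later.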

Answer 2

Answered by Niels

I ended up using BioPython to extract the sequences and then editing Ilya Stepanov's implementation to remove the argparse methods.

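import imp  # deprecated; removed in Python 3.12 (importlib is the replacement)
from Bio import SeqIO

seqs = []
# Load the edited script from its path as a module named lcsm.
lcsm = imp.load_source('lcsm', '/path/to/ukkonen.py')
for record in SeqIO.parse('/path/to/sequences.txt', 'fasta'):
    seqs.append(record)
lcsm.main(seqs)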

For the algorithm, I had main() take one argument, his strings variable, but this sends the algorithm a list of BioPython SeqRecord objects, which the re module doesn't like. So I had to extract the sequence string by changing

suffix_tree.append_string(s)

to

suffix_tree.append_string(str(s.seq))

which seems kind of brittle, but that's all I've got for now.
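
The imp module used above is no longer available from Python 3.12 onward. A rough equivalent of imp.load_source can be sketched with importlib.util; the helper name load_source and the throwaway demo file below are illustrative assumptions, not part of the original answer or of ukkonen.py.

```python
import importlib.util
import os
import tempfile


def load_source(name, path):
    """Load a Python file from an explicit path as a module object."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file; in the notebook the path would point at the real script.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo_script.py')
    with open(demo_path, 'w') as fh:
        fh.write('def main(strings):\n    return len(strings)\n')
    demo = load_source('demo_script', demo_path)
    result = demo.main(['AAA', 'CCC'])

print(result)
```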

Answer 3

Answered by tbrittoborges

An alternative way to use argparse in IPython notebooks is to pass the arguments explicitly, as a list of strings, to:

args = parser.parse_args() (line 303 of the git repo you referenced.)

It would be something like:

import argparse

parser = argparse.ArgumentParser(
    description='Searching longest common substring. '
                'Uses Ukkonen\'s suffix tree algorithm and generalized suffix tree. '
                'Written by Ilya Stepanov (c) 2013')

parser.add_argument(
    'strings',
    metavar='STRING',
    nargs='*',
    help='String for searching',
)

parser.add_argument(
    '-f',
    '--file',
    help='Path for input file. First line should contain number of lines to search in'
)

and

args = parser.parse_args("AAA --file /path/to/sequences.txt".split())

Edit: It works
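
One caveat with str.split() is that it breaks arguments at every space, including spaces inside a quoted path. A possible refinement, sketched here with a cut-down copy of the parser and a made-up file path, is to tokenize the string with shlex.split, which follows shell-style quoting.

```python
import argparse
import shlex

# Cut-down version of the parser shown above.
parser = argparse.ArgumentParser()
parser.add_argument('strings', metavar='STRING', nargs='*')
parser.add_argument('-f', '--file')

# shlex.split keeps the quoted path together; str.split would break it at the space.
command_line = 'AAA CCC --file "/path/to/my sequences.txt"'
args = parser.parse_args(shlex.split(command_line))

print(args.strings, args.file)
```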

Answer 4

Answered by sngjuk

If you use iPython for testing, converting the argparse definitions into a plain class can be a quick stand-in solution.

class Args:
  data = './data/penn'
  model = 'LSTM'
  emsize = 200
  nhid = 200

args=Args()
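
If the goal is only a stand-in object with attribute access, the standard library's argparse.Namespace (the type that parse_args() itself returns) can play the same role as a hand-written class. A minimal sketch, reusing the field values from the class above:

```python
import argparse

# Same fields as the Args class, built as the type parse_args() returns.
args = argparse.Namespace(data='./data/penn', model='LSTM', emsize=200, nhid=200)

# Attributes can be overridden per experiment, just like class attributes.
args.nhid = 256

print(args.model, vars(args)['nhid'])
```

This keeps downstream code that calls vars(args) or prints the namespace working unchanged, which a bare class instance would not.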

The GitHub page/repo offers a transformation web service: http://35.192.144.192:8000/arg2cls.html
I hope it is helpful for your testing. As of Jan 9, 2019, many bugs have been fixed.

The transformation script requires Python 3:

python3 [arg2cls.py] [argparse_script.py]

Then copy and paste the generated class to replace the argparse calls.

#!/usr/bin/env python3
from collections import OrderedDict
import sys
import re
DBG = False

# Only add_argument() and set_defaults() are supported.
ListStartPatt = re.compile(r'\s*\[.*')
ListStartPatt2 = re.compile(r'\).*\[.*') # list out of function scope.
ListPatt = re.compile(r'(\[.*?\])')
GbgPatt = re.compile(r'(.*?)\)[^\)]+') # for float('inf') cmplx.
GbgPatt2 = re.compile(r'(.*?)\).*') # general gbg, ? for non greedy.
LpRegex = re.compile(r'\({1,}\s{0,}')
RpRegex = re.compile(r'\s{0,}\){1,}')
PrRegex = re.compile(r'\((.*)(\))(?!.*\))') # from \( to last \).
CmRegex = re.compile(r'\s{0,},\s{0,}')
StrRegex = re.compile(r'\'(.*?)\'')

# Argument dict : {arg_name : value}
argDct=OrderedDict()

# process 'default=' value.
def default_value(tval, dtype=''):
  # string pattern.
  regres = StrRegex.match(tval) 
  if regres and not re.search('int|float|long|bool|complex', dtype):
    if DBG:
      print('default_value: str patt found')
    tval = regres.group(0)
    return tval

  # typed pattern.
  CommaSeparated = CmRegex.split(tval)[0]
  if DBG:
    print('comma separated value:', CommaSeparated)

  if ListStartPatt.match(CommaSeparated) and not ListStartPatt2.match(CommaSeparated):
    lres = ListPatt.search(tval)
    if lres:
      tval = lres.group(1)
    if DBG:
      print('list patt exist tval: ', tval)
  else :
    tval = CmRegex.split(tval)[0]
    if DBG:
      print('no list format tval: ', tval)

  # if default value is not like - int('inf') , remove characters after ')' garbage chars.
  ires = RpRegex.split(tval)[0]
  if not (re.search('int|float|long|bool|complex', ires) and re.search(r'[a-z]+\(',ires)):
    if DBG:
      print('not int("inf") format. Rp removed tval : ', tval)
    tval = re.split(r'\s{0,}\){1,}',tval)[0]
    gbg = GbgPatt2.search(tval)
    if gbg:
      tval = gbg.group(1)  
      if DBG:
        print('garbage exist & removed. tval : ', tval)

  # int('inf') patt.
  else:
    if DBG:
      print('type("inf") value garbaging!')
    gbg = GbgPatt.search(tval)
    if gbg:
      if DBG:
        print('garbage found, extract!')
      tval = gbg.group(1)

  return tval

# Handling add_argument()
def add_argument(arg_line):
  global argDct
  if DBG:
    print('\nin add_argument : **Pre regex: ', arg_line)

  '''    
  argument name
  '''
  # argname = DdRegex.split(arg_line)[1] # Dash or regex for arg name.
  argname = re.search('\'--(.*?)\'', arg_line)
  if not argname:
    argname = re.search('\'-+(.*?)\'', arg_line)

  # dest= keyword handling.
  dest = re.search(r',\s*dest\s*=(.*)', arg_line)
  if dest:
    dval = dest.group(1)
    dval = default_value(dval)
    argname = StrRegex.search(dval)

  # hyphen(-) to underscore(_)
  if argname:
    argname = argname.group(1).replace('-', '_')
  else :
    # naive str argname.
    sres = StrRegex.match(arg_line)
    if sres:
      argname = sres.group(1)
    if not argname:
      return # no argument name 

  '''
  check for syntaxes (type=, default=, required=, action=, help=, choices=)
  '''
  dtype = ''
  dres = re.search(r',\s*type\s*=\s*(.*)', arg_line)
  if dres:
    dtype = dres.group(1)
    dtype = CmRegex.split(dtype)[0]

  dfult = re.search(r',\s*default\s*=\s*(.*)', arg_line)
  rquird = re.search(r',\s*required\s*=\s*(.*)', arg_line)
  action = re.search(r',\s*action\s*=\s*(.*)', arg_line)
  hlp = re.search(r',\s*help\s*=\s*(.*)', arg_line)
  chice = re.search(r',\s*choices\s*=\s*(.*)', arg_line)

  # help message
  hlp_msg = ''
  if hlp:
    thl = hlp.group(1)
    if DBG:
      print('handling help=')
    hlp_msg = default_value(thl)
    if hlp_msg:
      hlp_msg = 'help='+hlp_msg

  # choice message
  choice_msg = ''
  if chice:
    tch = chice.group(1)
    if DBG:
      print('handling choices=')
    choice_msg = default_value(tch)
    if choice_msg:
      choice_msg = 'choices='+choice_msg+' '

  '''
  argument value
  '''
  # tval: argument value.
  tval = ''
  # default exist.
  if dfult:
    tval = dfult.group(1)
    tval = default_value(tval, dtype)
    if DBG:
      print('value determined : ', tval)

  # action or required syntaxes exist.
  elif action or rquird:
    if DBG:
      print('in action/required handling')
    msg_str = ''
    if action:
      tval = action.group(1)
      msg_str = 'action'
    elif rquird:
      tval = rquird.group(1)
      msg_str = 'required'

    tval = default_value(tval)
    tval = ' ** ' + msg_str + ' '+tval+'; '+choice_msg+ hlp_msg

  # no default, action, required.
  else : 
    argDct[argname] = ' ** default not found; '+choice_msg+ hlp_msg

  # value found.
  if tval:
    argDct[argname] = tval

# Handling set_defaults()
def set_defaults(arg_line):
  global argDct
  if DBG:
    print('\nin set_defaults arg_line: ', arg_line)

  # arguments to process.
  tv='' 
  # arguments of set_default()
  SetPatt = re.compile(r'(.+=.+\)?)')
  sres = SetPatt.match(arg_line)
  if sres:
    tv = sres.group(1)
    if DBG:
      print("setPatt res: ", tv)
    tv = re.sub(r'\s+','', tv)
    if DBG:
      print('\nset_default values: ', tv)

  # one argument regex.
  SetArgPatt = re.compile(r',?([^=]+=)[^=,]+,?')
  # handling multiple set_default() arguments. (may have a bug)
  while True:
    tname=''
    tval =''
    tnv=''
    # func closed.
    if re.match(r',*\).*',tv):
      tv=''
      break
    if DBG:
      print('set_default remaining: ', tv)

    nres = SetArgPatt.match(tv)
    if nres:
      tname = nres.group(1)
      if len(tv.split(tname, 1)) > 1:
        tval = tv.split(tname,1)[1]
        tval = default_value(tval)
        tnv=tname+tval
        tname = tname.rsplit('=',1)[0]

      if DBG:
        print('set_default tnam: ', tname)
        print('set_default tval: ', tval)
      if tname:
        argDct[tname] = tval

      # split with processed argument.
      tv = tv.split(tnv)
      if len(tv) > 1:
        tv = tv[1]
      # no more value to process
      else:
        break

    # no arg=value pattern found.
    else:
      break

# Remove empty line & Concatenate line-separated syntax.
def preprocess(fname):
  try :
    with open(fname, 'r', encoding='UTF8') as f:
      txt = f.read()
      t = txt.splitlines(True)
      t = list( filter(None, t) )

      # remove empty line
      t = [x for x in t if not re.match(r'\s{0,}\n',x)]
      # concatenate multiple lined arguments.
      # empl : lines to be deleted from t[].
      empl = []
      for i in range(len(t)-1, 0, -1):
        if not re.search('add_argument|set_defaults', t[i]):
          t[i-1] += t[i]
          t[i-1]=re.sub(r'\n{0,}','',t[i-1])
          t[i-1]=re.sub(r'\s{1,}',' ',t[i-1])
          empl.append(t[i])

      for d in empl:
        t.remove(d)
      for i, line in enumerate(t):
        t[i] = line.replace('\"', '\'').split('parse_args()')[0]
      return t

  except IOError:
    print('IOError : no such file.', fname)
    sys.exit()

def transform(fname):
  # t : list() contains add_argument|set_defaults lines.
  arg_line_list = preprocess(fname)

  for i, arg_line in enumerate(arg_line_list):
    t = PrRegex.search(arg_line)

    if t:
      t = t.group(1) # t: content of add_argument Parentheses.
    else :
      continue # nothing to parse.

    if re.search(r'add_argument\s*\(', arg_line):
      add_argument(t)
    elif re.search(r'set_defaults\s*\(',arg_line):
      set_defaults(t)
    else :
      # Nothing to parse.
      continue

  print('\nclass Args:')
  for i in argDct:
    print(' ',i, '=', argDct[i])
  print()
  print('args=Args()')

def main():
  if len(sys.argv) <2:
    print('Usage : python arg2cls.py [target.py] [target2.py(optional)] ...')
    sys.exit(0)
  sys.argv.pop(0)

  #handling multiple file input.
  for fname in sys.argv:
    transform(fname)

if __name__ == "__main__":
  main()

Answer 5

Answered by Dhruv Desai

I faced a similar problem when invoking argparse; the string '-f' was causing it. Just removing it from sys.argv does the trick.

import sys
if __name__ == '__main__':
    if '-f' in sys.argv:
        sys.argv.remove('-f')
    main()
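
In a notebook the '-f' usually appears because the kernel process is typically launched with -f followed by a connection file path. If the parser itself can be edited, another option is parse_known_args(), which returns unrecognized arguments separately instead of exiting with an error. The sketch below uses a hypothetical parser and a made-up connection file path, passed explicitly so that it behaves the same outside a notebook.

```python
import argparse

# Hypothetical parser; the option name is for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument('--emsize', type=int, default=200)

# Simulated notebook arguments; the connection-file path is made up.
notebook_argv = ['-f', '/tmp/kernel-1234.json', '--emsize', '300']

# Unrecognized arguments are collected in `unknown` rather than raising an error.
args, unknown = parser.parse_known_args(notebook_argv)

print(args.emsize, unknown)
```

Note that this only helps when the script does not define its own -f option, as the suffix tree script discussed above does.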

Answer 6

Answered by hyun woo Cho

Clear sys.argv:

import sys; sys.argv=['']; del sys

https://github.com/spyder-ide/spyder/issues/3883#issuecomment-269131039

Answer 7

Answered by nivniv

If all arguments have a default value, then adding this to the top of the notebook should be enough:

import sys
sys.argv = ['']

(Otherwise, add the necessary arguments after the empty string, which stands in for the program name.)
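
Overwriting sys.argv at the top of the notebook affects every later cell. One way to limit the change to a single call is a small context manager; the name patched_argv and the parser options below are hypothetical and serve only to illustrate the idea.

```python
import argparse
import contextlib
import sys


@contextlib.contextmanager
def patched_argv(*args):
    """Temporarily replace sys.argv; the '' entry stands in for the program name."""
    saved = sys.argv
    sys.argv = [''] + list(args)
    try:
        yield
    finally:
        sys.argv = saved


# Hypothetical parser where every argument has a default.
parser = argparse.ArgumentParser()
parser.add_argument('--model', default='LSTM')
parser.add_argument('--emsize', type=int, default=200)

with patched_argv():
    defaults = parser.parse_args()      # all defaults
with patched_argv('--emsize', '300'):
    overridden = parser.parse_args()    # one value overridden

print(defaults.emsize, overridden.emsize)
```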
