Python: how to get a file extension correctly?

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/37896386/

Time: 2020-08-19 20:05:45  Source: igfitidea

How to get file extension correctly?

python

Asked by Page David

I know this question has been asked many times on this website. But I found that the answers miss an important point: they only consider file extensions with a single period, like *.png or *.mp3. How do I deal with filenames that have two periods, like .tar.gz?

The basic code is:

filename = '/home/lancaster/Downloads/a.ppt'
extention = filename.split('/')[-1]

But obviously, this code does not work with a file like a.tar.gz. How should I deal with it? Thanks.

Accepted answer by John Burger

The role of a file extension is to tell the viewer (and sometimes the computer) which application to use to handle the file.

Taking the worst-case example from your comments (a.ppt.tar.gz): this is a PowerPoint file that has been tar-balled and then gzipped. So you need a gzip-handling program to open it. Using PowerPoint or a tarball-handling program wouldn't work. OK, a clever program that knew how to handle both .tar and .gz files could understand both operations and work with a .tar.gz file, but note that it would do that even if the extension was simply .gz.

The fact that both tar and gzip add their extensions to the original filename, rather than replacing it (as zip does), is a convenience. But the base name of the gzip file is still a.ppt.tar.
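
The layering described in this answer can be sketched with os.path.splitext, which removes only the outermost extension on each call. This is a minimal illustration, not part of the original answer; the helper name peel_extensions is hypothetical.

```python
import os.path

def peel_extensions(filename):
    """Yield (remaining_name, extension) pairs, outermost extension first.

    Hypothetical helper: it repeatedly applies os.path.splitext to the
    base name until no extension is left.
    """
    name = os.path.basename(filename)
    while True:
        name, ext = os.path.splitext(name)
        if not ext:
            break
        yield name, ext

layers = list(peel_extensions('/home/lancaster/Downloads/a.ppt.tar.gz'))
print(layers)
# [('a.ppt.tar', '.gz'), ('a.ppt', '.tar'), ('a', '.ppt')]
```

The first pair corresponds to what a gzip-handling program sees: a file whose base name is a.ppt.tar and whose extension is .gz.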

Answer by Or Duan

Python 3.4

You can now use Path from pathlib. It has many features; one of them is suffix:

>>> from pathlib import Path
>>> Path('my/library/setup.py').suffix
'.py'
>>> Path('my/library.tar.gz').suffix
'.gz'
>>> Path('my/library').suffix
''

If you want to get more than one suffix, use suffixes:

>>> from pathlib import Path
>>> Path('my/library.tar.gar').suffixes
['.tar', '.gar']
>>> Path('my/library.tar.gz').suffixes
['.tar', '.gz']
>>> Path('my/library').suffixes
[]
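
If a single string such as '.tar.gz' is wanted, the entries of suffixes can be joined. The sketch below assumes that every period in the file name should count as part of the extension, which is not always what you want; the helper name split_all_suffixes is hypothetical and not part of pathlib.

```python
from pathlib import Path

def split_all_suffixes(path):
    """Return (stem, combined_suffix), e.g. ('library', '.tar.gz').

    Hypothetical helper built on Path.name and Path.suffixes.
    """
    name = Path(path).name
    combined = ''.join(Path(path).suffixes)
    # Slice by length so that an empty suffix leaves the name untouched.
    return name[:len(name) - len(combined)], combined

print(split_all_suffixes('my/library.tar.gz'))  # ('library', '.tar.gz')
print(split_all_suffixes('my/library'))         # ('library', '')
```

Because suffixes splits on every period, a versioned name such as 'my/library-1.2.tar.gz' would give '.2.tar.gz' as the combined suffix, so this approach fits best when file names are known not to contain other periods.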

Answer by Rahul K P

There is a built-in function for this in the os module: os.path.splitext. See its documentation for more.

In [1]: from os.path import splitext
In [2]: file_name,extension = splitext('/home/lancaster/Downloads/a.ppt')
In [3]: extension
Out[3]: '.ppt'

If you need to find the extension of .tar.gz or .tar.bz2 files, you have to write a function like this:

from os.path import splitext
def splitext_(path):
    for ext in ['.tar.gz', '.tar.bz2']:
        if path.endswith(ext):
            return path[:-len(ext)], path[-len(ext):]
    return splitext(path)

Result

In [4]: file_name,ext = splitext_('/home/lancaster/Downloads/a.tar.gz')
In [5]: ext
Out[5]: '.tar.gz'

Edit

More generally, you can use this function:

from os.path import splitext
def splitext_(path):
    if len(path.split('.')) > 2:
        return path.split('.')[0],'.'.join(path.split('.')[-2:])
    return splitext(path)

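It handles the double extensions shown below. Note that it returns a double extension without the leading period ('tar.gz'), and that it gives wrong results when a directory name or the base name itself contains a period.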

Running it on several files:

In [6]: inputs = ['a.tar.gz', 'b.tar.lzma', 'a.tar.lz', 'a.tar.lzo', 'a.tar.xz','a.png']
In [7]: for file_ in inputs:
   ...:     file_name, extension = splitext_(file_)
   ...:     print(extension)
   ...:
tar.gz
tar.lzma
tar.lz
tar.lzo
tar.xz
.png
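
A variant that keeps the leading period and tolerates periods in directory names could look like the sketch below. It is not the answer author's code: the function name split_compound_ext and the list of compression suffixes are assumptions chosen for illustration, and the list would need extending for other formats.

```python
import os.path

# Assumed set of compression extensions that may follow '.tar'; extend as needed.
COMPRESSION_SUFFIXES = {'.gz', '.bz2', '.xz', '.lz', '.lzma', '.lzo'}

def split_compound_ext(path):
    """Split path into (root, ext), treating '.tar' + compression as one extension.

    Hypothetical helper: falls back to plain os.path.splitext otherwise.
    """
    root, ext = os.path.splitext(path)
    if ext in COMPRESSION_SUFFIXES:
        inner_root, inner_ext = os.path.splitext(root)
        if inner_ext == '.tar':
            return inner_root, inner_ext + ext
    return root, ext

print(split_compound_ext('/home/lancaster/Downloads/a.tar.gz'))
# ('/home/lancaster/Downloads/a', '.tar.gz')
print(split_compound_ext('/home/my.dir/a.ppt.tar.gz'))
# ('/home/my.dir/a.ppt', '.tar.gz')
print(split_compound_ext('a.png'))
# ('a', '.png')
```

Relying on os.path.splitext for both steps means only the last path component is inspected, so a period in a directory name does not affect the result.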

Answer by no11

One possible way is:

  1. Slice at "." => tmp_ext = filename.split('.')[1:]

     Result is a list = ['tar', 'gz']

  2. Join them together => extention = ".".join(tmp_ext)

     Result is your extension as a string = 'tar.gz'

Update: Example:

>>> test = "/test/test/test.tar.gz"
>>> t2 = test.split(".")[1:]
>>> t2
['tar', 'gz']
>>> ".".join(t2)
'tar.gz'
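
Splitting the whole path on "." also picks up periods in directory names. One way to avoid that, sketched here under the assumption that only the last path component matters, is to take the base name first and then split once at its first period. The helper name ext_after_first_dot is hypothetical.

```python
import os.path

def ext_after_first_dot(path):
    """Return everything after the first period of the file name, e.g. 'tar.gz'.

    Hypothetical helper: returns '' when the file name has no period.
    """
    name = os.path.basename(path)
    # partition splits at the first '.' only; the text after it is the extension.
    _, _, rest = name.partition('.')
    return rest

print(ext_after_first_dot('/test/test/test.tar.gz'))   # tar.gz
print(ext_after_first_dot('/te.st/test/test.tar.gz'))  # tar.gz
```

For the second path, splitting the full string on "." and joining from index 1 would instead give 'st/test/test.tar.gz'.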

Answer by matt

>>> import os
>>> import re

>>> filename = os.path.basename('/home/lancaster/Downloads/a.ppt')
>>> re.findall(r'\.([^.]+)', filename)
['ppt']


>>> filename = os.path.basename('/home/lancaster/Downloads/a.ppt.tar.gz')
>>> re.findall(r'\.([^.]+)', filename)
['ppt', 'tar', 'gz']

Answer by LetzerWille

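With re.findall and Python 3.6:

import re

filename = '/home/Downloads/abc.ppt.tar.gz'

ext = r'\.\w{1,6}'

# Raw f-string, so that \b stays a regex word boundary instead of a backspace character
re.findall(rf'{ext}\b | {ext}$', filename, re.X)

['.ppt', '.tar', '.gz']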

Answer by Saket Mittal

Simplest One:

import os.path
print(os.path.splitext("/home/lancaster/Downloads/a.ppt")[1])
# .ppt

Answer by akshay.s.jagtap

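filename = '/home/lancaster/Downloads/a.tar.gz'
# Take the last path component, e.g. 'a.tar.gz'
extention = filename.split('/')[-1]

if '.' in extention:
    # Keep only the text after the last period, e.g. 'gz'
    extention = extention.split('.')[-1]
    if len(extention) > 0:
        extention = '.' + extention
        print(extention)  # .gz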