Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1578169/

Date: 2020-08-25 03:09:30  Source: igfitidea

How can I read XMP data from a JPG with PHP?

Tags: php, metadata, jpeg, xmp

Asked by Liam

PHP has built-in support for reading EXIF and IPTC metadata, but I can't find any way to read XMP. Is there one?

Answered by Stefan Gehrig

XMP data is literally embedded into the image file, so you can extract it from the image file itself with PHP's string functions.

The following demonstrates this procedure (I'm using SimpleXML, but any other XML API, or even simple and clever string parsing, may give you equal results):

$content = file_get_contents($image);
$xmp_data_start = strpos($content, '<x:xmpmeta');
$xmp_data_end   = strpos($content, '</x:xmpmeta>');
$xmp_length     = $xmp_data_end - $xmp_data_start;
$xmp_data       = substr($content, $xmp_data_start, $xmp_length + 12);
$xmp            = simplexml_load_string($xmp_data);

Just two remarks:

  • XMP makes heavy use of XML namespaces, so you'll have to keep an eye on that when parsing the XMP data with some XML tools.
  • Considering the possible size of image files, you may not be able to use file_get_contents(), as this function loads the whole image into memory. Using fopen() to open a file stream resource and checking chunks of data for the key sequences <x:xmpmeta and </x:xmpmeta> will significantly reduce the memory footprint.
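
To illustrate the first remark above, here is a minimal sketch of namespace-aware access with SimpleXML. The XMP packet is a hand-written, hypothetical sample (real packets carry many more properties), and the variable names are my own; the two namespace URIs are the standard RDF and Dublin Core ones.

```php
<?php
// Hypothetical, hand-written XMP packet used only for illustration.
$xmp_data = <<<'XML'
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:creator>
        <rdf:Seq>
          <rdf:li>Jane Doe</rdf:li>
        </rdf:Seq>
      </dc:creator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
XML;

$xmp = simplexml_load_string($xmp_data);

// Register every prefix used in the XPath expression explicitly.
$xmp->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xmp->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// Each match is a SimpleXMLElement; strval() converts it to its text content.
$creators = array_map('strval', $xmp->xpath('//dc:creator/rdf:Seq/rdf:li'));
```

Without the registerXPathNamespace() calls, the prefixes in the XPath expression may not resolve, which is the kind of pitfall the remark warns about.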

Answered by Bryan Geraghty

I'm only replying to this after so much time because this seems to be the best result when searching Google for how to parse XMP data. I've seen this nearly identical snippet used in code a few times and it's a terrible waste of memory. Here is an example of the fopen() method Stefan mentions after his example.

<?php

function getXmpData($filename, $chunkSize)
{
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }

    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }

    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = NULL;
    $hasXmp = FALSE;

    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {

        if ($chunk === "") {
            break;
        }

        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);

        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + 12);
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
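        } elseif (strlen($buffer) >= strlen($startTag)) {
            // No start tag yet: keep only the tail that could hold the beginning
            // of a split tag, so the buffer does not grow with every chunk.
            $buffer = substr($buffer, -(strlen($startTag) - 1));
        }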
    }

    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}

Answered by Fluxine

A simple way on Linux is to call the exiv2 program, available in the package of the same name on Debian.

$ exiv2 -e X extract image.jpg

will produce image.xmp containing embedded XMP which is now yours to parse.

Answered by infiniteegolfer

I know... this is kind of an old thread, but it was helpful to me when I was looking for a way to do this, so I figured this might be helpful to someone else.

I took this basic solution and modified it so it handles the case where the tag is split between chunks. This allows the chunk size to be as large or small as you want.

<?php
function getXmpData($filename, $chunk_size = 1024)
{
 // The parameter is named $chunk_size; checking an undefined $chunkSize would always throw.
 if (!is_int($chunk_size)) {
  throw new RuntimeException('Expected integer value for argument #2 (chunk_size)');
 }

 if ($chunk_size < 12) {
  throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunk_size)');
 }

 if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
  throw new RuntimeException('Could not open file for reading');
 }

 $tag = '<x:xmpmeta';
 $buffer = false;

 // find open tag
 while ($buffer === false && ($chunk = fread($file_pointer, $chunk_size)) !== false) {
  if(strlen($chunk) <= 10) {
   break;
  }
  if(($position = strpos($chunk, $tag)) === false) {
   // if open tag not found, back up just in case the open tag is on the split.
   fseek($file_pointer, -10, SEEK_CUR);
  } else {
   $buffer = substr($chunk, $position);
  }
 }

 if($buffer === false) {
  fclose($file_pointer);
  return false;
 }

 $tag = '</x:xmpmeta>';
 $offset = 0;
 while (($position = strpos($buffer, $tag, $offset)) === false && ($chunk = fread($file_pointer, $chunk_size)) !== FALSE && !empty($chunk)) {
  $offset = max(0, strlen($buffer) - 12); // subtract the tag size just in case it's split between chunks; never go below 0.
  $buffer .= $chunk;
 }

 fclose($file_pointer);

 if($position === false) {
  // this would mean the open tag was found, but the close tag was not.  Maybe file corruption?
  throw new RuntimeException('No close tag found.  Possibly corrupted file.');
 } else {
  $buffer = substr($buffer, 0, $position + 12);
 }

 return $buffer;
}
?>

Answered by Liam

If you have ExifTool available (a very useful tool) and can run external commands, you can use its options to extract XMP data (-xmp:all) and output it in JSON format (-json), which you can then easily convert to a PHP object:

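// escapeshellarg() quotes the path safely, even if it contains spaces or quote characters.
$command = 'exiftool -g -json -struct -xmp:all ' . escapeshellarg($image_path);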
exec($command, $output, $return_var);
$metadata = implode('', $output);
$metadata = json_decode($metadata);
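
As a rough sketch of what to do with the decoded result: ExifTool's JSON output is an array with one entry per file, and with -g the tags are nested under their group name. The JSON string below is a hypothetical stand-in for real output (actual tag names and values depend on the image), so treat the keys as assumptions.

```php
<?php
// Hypothetical sample of what the joined $output lines might contain;
// real tag names and values depend on the image.
$metadata = '[{"SourceFile": "photo.jpg", "XMP": {"Creator": "Jane Doe", "Rating": 4}}]';

// Passing true as the second argument yields nested associative arrays
// instead of stdClass objects.
$decoded = json_decode($metadata, true);

// json_decode() returns null on malformed input, so check before indexing.
if (!is_array($decoded) || !isset($decoded[0])) {
    throw new RuntimeException('Unexpected or empty ExifTool output');
}

$first   = $decoded[0];                                   // entry for the first (only) file
$xmp     = isset($first['XMP']) ? $first['XMP'] : array(); // tags of the XMP group
$creator = isset($xmp['Creator']) ? $xmp['Creator'] : null;
```

Decoding to arrays is only a convenience here; the object form used above works just as well if you prefer property access.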

Answered by Sebastien B.

Bryan's solution was the best one so far, but it had a few issues so I modified it to simplify it, and remove some functionality.

There were three issues I found with his solution:

A) If a chunk boundary falls in the middle of one of the strings we're searching for, it won't find it. Small chunk sizes are more likely to cause this issue.

B) If the chunk contains both the start AND the end, it won't find it. This is an easy one to fix with an extra if statement to recheck the chunk that the start is found in to see if the end is also found.

C) The else statement added at the end to break the while loop when no XMP data is found has a side effect: if the start element isn't found on the first pass, it will not check any more chunks. This is likely easy to fix too, but given the first issue it's not worth it.
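
Issue A is easy to reproduce in isolation. The sketch below is not taken from any of the answers; it uses a made-up string and a deliberately awkward chunk size (both assumptions for illustration) to show that searching each chunk on its own misses a tag split across a boundary, while carrying over the last strlen($tag) - 1 bytes of the previous chunk finds it.

```php
<?php
// Made-up data: 20 filler bytes followed by a tiny fake XMP packet.
$data   = str_repeat('.', 20) . '<x:xmpmeta>payload</x:xmpmeta>';
$tag    = '<x:xmpmeta';
// With 24-byte chunks the first chunk ends with '<x:x', so the tag is split.
$chunks = str_split($data, 24);

// Naive approach: search each chunk in isolation.
$naive = false;
foreach ($chunks as $chunk) {
    if (strpos($chunk, $tag) !== false) {
        $naive = true;
    }
}

// Overlap approach: prepend the tail of the previous window to each chunk.
$found = false;
$tail  = '';
foreach ($chunks as $chunk) {
    $window = $tail . $chunk;
    if (strpos($window, $tag) !== false) {
        $found = true;
        break;
    }
    // Only the last strlen($tag) - 1 bytes can hold the start of a split tag.
    $tail = substr($window, -(strlen($tag) - 1));
}
```

Here $naive stays false while $found becomes true, which is why a chunked reader needs some overlap between reads.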

My solution below isn't as powerful, but it's more robust. It will only check one chunk and extract the data from that. It will only work if the start and end are both in that chunk, so the chunk size needs to be large enough to ensure that it always captures that data. From my experience with files exported from Adobe Photoshop/Lightroom, the XMP data typically starts at around 20kB and ends at around 45kB. My chunk size of 50k seems to work nicely for my images. It could be much smaller if you strip some of that data on export, such as the CRS block that holds a lot of develop settings.

function getXmpData($filename)
{
    $chunk_size = 50000;
    $buffer = NULL;

    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $chunk = fread($file_pointer, $chunk_size);
    if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
        $buffer = substr($chunk, $posStart);
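        $posEnd = strpos($buffer, '</x:xmpmeta>');
        // If the closing tag lies beyond this chunk, report "not found"
        // rather than returning a truncated fragment.
        $buffer = ($posEnd !== FALSE) ? substr($buffer, 0, $posEnd + 12) : NULL;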
    }
    fclose($file_pointer);
    return $buffer;
}

Answered by Lukáš Řádek

Thank you, Sebastien B., for that shortened version :). If you want to avoid the problem of chunk_size being too small for some files, just add recursion.

function getXmpData($filename, $chunk_size = 50000){
  if (($file_pointer = fopen($filename, 'r')) === FALSE) {
    throw new RuntimeException('Could not open file for reading');
  }

  $chunk = fread($file_pointer, $chunk_size);
  fclose($file_pointer);

  if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
      $buffer = substr($chunk, $posStart);
      $posEnd = strpos($buffer, '</x:xmpmeta>');
      if ($posEnd !== FALSE) {
          return substr($buffer, 0, $posEnd + 12);
      }
  }

// recursion here: retry with a doubled chunk size, but only while part of the
// file is still unread; otherwise a file without XMP would recurse forever.
  if ($chunk_size < filesize($filename)) {
    return getXmpData($filename, $chunk_size * 2);
  }

  return NULL;
}

Answered by Mathias Vitalis

I've developed the XMP PHP Toolkit extension: it's a PHP 5 extension based on the Adobe XMP Toolkit, which provides the main classes and methods to read, write, and parse XMP metadata from JPEG, PSD, PDF, video, audio... This extension is under the GPL license. A new release will be available soon for PHP 5.3 (it is currently only compatible with PHP 5.2.x) and should be available on Windows and Mac OS X (currently only FreeBSD and Linux are supported). http://xmpphptoolkit.sourceforge.net/
