hachoir-metadata extracts metadata from multimedia files: music, pictures,
videos, and also archives. It supports the most common file formats:

* Archives: bzip2, gzip, zip, tar
* Audio: MPEG audio ("MP3"), WAV, Sun/NeXT audio, Ogg/Vorbis (OGG), MIDI,
  AIFF, AIFC, Real audio (RA)
* Image: BMP, CUR, EMF, ICO, GIF, JPEG, PCX, PNG, TGA, TIFF, WMF, XCF
* Misc: Torrent
* Program: EXE
* Video: ASF format (WMV video), AVI, Matroska (MKV), Quicktime (MOV),
  Ogg/Theora, Real media (RM)

It tries to give as much information as possible. For some file formats, it
gives more information than libextractor: the RIFF parser, for example, can
extract the creation date, the software used to generate the file, etc.

But hachoir-metadata cannot guess information. The most complex operation is
just to compute the duration of a music file using the frame size and the
file size.

hachoir-metadata has three modes:

* classic mode: extract metadata; you can use --level=LEVEL to limit the
  quantity of information displayed (not the quantity extracted)
* --type: show the file format and the most important information on one line
* --mime: just display the file MIME type

The command 'hachoir-metadata --mime' works like 'file --mime', and
'hachoir-metadata --type' like 'file'. However, the file command currently
supports more file formats than hachoir-metadata.
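The duration estimate mentioned above can be illustrated with a minimal
sketch. This is not hachoir-metadata's actual code: the function name
``estimate_duration`` is hypothetical, and the sketch assumes a constant bit
rate MPEG-1 Layer III stream in which every frame has the same size and holds
1152 samples, with no tags or other non-audio data in the file.

```python
from datetime import timedelta

# Assumption: MPEG-1 Layer III, where each frame decodes to 1152 samples.
SAMPLES_PER_FRAME = 1152


def estimate_duration(file_size, frame_size, sample_rate):
    """Estimate the duration of a constant bit rate audio file.

    file_size   -- size of the audio data in bytes
    frame_size  -- size of one frame in bytes (assumed constant)
    sample_rate -- samples per second, e.g. 44100
    """
    if frame_size <= 0 or sample_rate <= 0:
        raise ValueError("frame_size and sample_rate must be positive")
    # Number of complete frames that fit in the file.
    frame_count = file_size // frame_size
    # Each frame covers SAMPLES_PER_FRAME / sample_rate seconds of audio.
    seconds = frame_count * SAMPLES_PER_FRAME / sample_rate
    return timedelta(seconds=seconds)


if __name__ == "__main__":
    # A 128 kbit/s, 44.1 kHz file has frames of about 418 bytes.
    print(estimate_duration(4180000, 418, 44100))
```

For variable bit rate files the frame size is not constant, so an estimate of
this kind is only an approximation.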
Website: http://hachoir.org/wiki/hachoir-metadata

Example
=======

Example on AVI video (RIFF file format)::

    $ hachoir-metadata pacte_des_gnous.avi
    Common:
    - Duration: 4 min 25 sec
    - Comment: Has audio/video index (248.9 KB)
    - MIME type: video/x-msvideo
    - Endian: Little endian
    Video stream:
    - Image width: 600
    - Image height: 480
    - Bits/pixel: 24
    - Compression: DivX v4 (fourcc:"divx")
    - Frame rate: 30.0
    Audio stream:
    - Channel: stereo
    - Sample rate: 22.1 KHz
    - Compression: MPEG Layer 3

Modes --mime and --type
=======================

Option --mime just displays the file MIME type (works like the UNIX
"file --mime" program)::

    $ hachoir-metadata --mime logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: image/png
    sheep_on_drugs.mp3: audio/mpeg
    wormux_32x32_16c.ico: image/x-ico

Option --type displays a short description of the file type (works like the
UNIX "file" program)::

    $ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
    sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
    wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32

What's new in hachoir-metadata 1.0?
===================================

Version 1.0.1
-------------

* Only use hachoir_core.profiler with the --profiler command line option, so
  the 'profiler' Python module is now optional
* Set shebang to "#!/usr/bin/python"

Version 1.0
-----------

* Real audio: read number of channels, bit rate, sample rate and compute
  compression rate
* JPEG: Read user comment
* Windows ANI: Read frame rate
* Use Language from hachoir_core to store language from ID3 and MKV
* OLE2 and FLV: Extractors are now fault tolerant

What's new in hachoir-metadata 0.10.0?
======================================

hachoir-metadata is now fault tolerant (like hachoir-core and
hachoir-parser)! It is also robust against fuzzing tests.

New supported formats:

* Microsoft Archive (.mar)
* Microsoft Office documents: Word (.doc), Excel (.xls), Powerpoint (.ppt)
* X11 Portable Compiled Font (.pcf)
* New-style Executable (16-bit Windows program)

New features:

* Make a distinction between the raw value and the formatted value, so it's
  possible to reuse data. E.g. the raw value of the number of channels is 2
  (integer) and the formatted value is "stereo" (string)
* Add plugin for Nautilus program (of Gnome project)
* Add plugin for Konqueror program (of KDE project)
* New API:

  * hasattr(meta, "key") => meta.has("key")
  * meta.key[0] => meta.get("key") or meta.getText("key")

* "quality" argument to limit extraction complexity (quality=0 is the fastest
  extraction, quality=1.0 is the best but also the slowest)
* Code is now fault tolerant: don't crash on error but just display an error
  message and continue the extraction
* creation_date and last_modification value type is datetime.date() or
  datetime.datetime() (and not a string)
* duration type is datetime.timedelta(): precision is now microseconds and
  not milliseconds

Changes:

* track_number and track_total raw value type is an integer
* MPEG audio: read music genre and language from ID3v2 and compute
  approximation of bit rate and duration of VBR MP3
* JPEG: support progressive JPEG (all start of frame types)
* JPEG (IPTC): Support creation date (keys 55 and 60)
* Convert more strings to Unicode
* Reject sample rate smaller than 1 kHz
* MKV: don't read metadata tags ("SimpleTags") if quality