- Improve WAV parser by focusing on performance rather than on attempting a best-effort when extracting metadata from files that do not strictly follow the format spec.
- Improve WAV parser by performing a best-effort when extracting metadata from files that do not strictly follow the format spec.
- Add support for Ruby 3.2 and 3.3.
- Improved stability for mp4 parser when dealing with corrupted FTYP boxes.
- Fixed bug with WAV file wrongly parsed as MP3.
- JSON format support.
- Prevent the default loading of thumbnails on TIFF-based formats to improve I/O.
- Add `avc1` and `xavc` as brand codes in the mp4 format parser to allow more file types to be parsed correctly.
- Disable `udta` ISOBMFF box parsing, since their contents are not guaranteed to be consistent with the spec.
- Prevent infinite loops when parsing ISOBMFF boxes with size = 0 (meaning that the box extends to the end of the file).
- Improve resiliency in ISOBMFF parsing to missing mandatory boxes and fields.
- Simplify ISOBMFF frame rate calculations.
- Refactor.
- Added support for PDF 2.0
- Expanded test coverage for PDF parsing
- Revert change where variable frame rates in MOV and MP4 files would result in an array value for `frame_rate`.
- Adapt the ISOBMFF-based decoder for MOV and MP4 parsing.
- Fix MOV/MP4 issues:
  - MP4 files being misidentified as MOV files.
  - Dimensions being miscalculated when files include multiple tracks or transformations.
- Add support for `RW2` files.
- Bug fix for `CR3` files being misidentified as `MOOV`.
- Add support for `CR3` files.
- Add ISO base file format decoding functionality.
- Require Ruby 2.6 as the minimum version.
- Bring Ruby 2.6 back to the test matrix: the matrix includes JRuby, which is still compatible with 2.6.
- Drop `ks` dependency.
- Drop explicit support for Ruby `<2.7`.
- Drop faraday dependencies.
- Loosen version constraints on other dependencies.
- Update measurometer metrics for consistency and clarity.
- Add support for `ARW` files.
- Add support for `AAC` files.
- Add support for `NEF` files.
- Fix `MP3Parser` taking precedence when parsing `WEBP` files.
- Skip Exif chunks that are malformed during `WEBP` parsing.
- Add support for `WEBP` lossy, lossless and extended file formats.
- Add `heif_parser` and support for `HEIF` and `HEIC` formats. Exif parsing is still missing.
- Resolve bug when `stts` atom is `nil`
- Add support for `codecs` in moov_parser for video metadata
- Add support for `frame_rate` in moov_parser
- Dropping support for Ruby 2.2.X, 2.3.X and 2.4.X
- MP3: Fix negative length reads in edge cases by bumping `id3tag` version to `v0.14.2`
- Fix handling of 200 responses with `parse_http` as well as handling of very small responses which do not need range access
- Add option `headers:` to `FormatParser.parse_http`
- Change `FormatParser.parse_http` to follow HTTP redirects
- Add `#content_type` on `Result` return values which makes sense for the detected filetype
- Add support for M3U format files
- Fix FormatParser.parse (with `results: :first`) to be deterministic
- DPX: Fix DPXParser to support images without aspect ratio
- MP3: Fix MP3Parser to return nil for TIFF files
- Add support for Ruby 2.7
- MP3: Fix parser to not skip the first bytes if it's not an ID3 header
- Hotfix Moov parser
- MOV: Fix error "negative length"
- MOV: Fix reading dimensions in multi-track files
- MP3: Fix parse of the Xing header to not raise errors
- MP3: Add support for ID3 v2.4.x
- JPEG: Update gem exifr to 1.3.8 to fix a bug
- Update gem id3tag to 0.14.0 to fix MP3 issues
- Fix MP3 frames reading to jump correctly to the next bytes
- The TIFF parser will now return :arw as format for Sony ARW files instead of :tif so that the caller can decide whether it wants to deal with RAW processing or not
- Updated gem exifr to fix problems related to JPEG files from Olympus microscopes, which often have bad thumbnail data
- Add ActiveStorage analyzer which can analyze ActiveStorage blobs. Enable it by setting `config.active_storage.analyzers.prepend FormatParser::ActiveStorage::BlobAnalyzer`
- Ignore empty ID3 tags and do not allow them to overwrite others
- Update the id3tag dependency so that we can fall back to UTF-8 instead of raising an error when parsing MP3 files
- Fix ZIP parser to not raise an error for invalid ZIP files with an invalid central directory
- Adds option `stringify_keys: true` to #as_json methods (fix #151)
- MPEG: Ensure parsing does not inadvertently return an Integer instead of Result|nil
- MPEG: Scan further into the MPEG file than previously (scan 32 1KB chunks)
- MPEG: Ensure the parser does not raise an exception when there is no data to read for scanning beyond the initial header
- Adds support for MPEG video files
- Make sure EXIF results work correctly with ActiveSupport JSON encoders
- Correctly tag the license on Rubygems as MIT (Hippocratic) for easier audit
- Improve handling of Sony ARW files (make sure the width/height is correctly recognized)
- Update Travis matrix and gitignore
- Mark m4v as one of the filename extensions likely to parse via the MOOV parser
- Adopt Hippocratic license v. 1.2. Note that this might make the license conditions unacceptable for your project. If that is the case, you can use the 0.17.X branch of the library which stays under the original, exact MIT license.
- Remove parser factories. A parser should respond to `likely_match?` and `call`. If a parser has to be instantiated anew for every call the parser should take care of instantiating itself.
- Add support for BMP files with core headers (older version of the BMP format)
- All EXIF: Deal with EXIF orientations that get parsed as an array of [Orientation, nil] due to incorrect padding
- All EXIF: Make sure the 0 orientation does not get silently treated as orientation 8, mislabeling images which are not rotated as being rotated (orientation changed)
- All EXIF: Make sure the 0 orientation (`unknown`) is correctly passed and represented
- JPEG: Make sure multiple EXIF tags in APP1 markers get handled correctly (via overlays)
- Add `filename_hint` keyword argument to `FormatParser.parse`. This can hint the library to apply the parser that will likely match for this filename first, and the other parsers later. This helps avoiding extra work when parsing less-popular file formats, and can be optionally used if the caller knows the filename of the original file. Note that the filename is only that: a hint, it helps apply parsers more efficiently but does not specify the actual format of the file that is going to be detected.
- Relax the "ks" dependency version since we do not need the constraint to be so strict
- Allow setting `:priority` when registering a parser, to make sure certain parsers are applied earlier - depending on detection confidence and file format popularity at WT.
- Care caching: Clear pages more deliberately instead of relegating them to GC
- JPEG: Clear the EXIF buffer explicitly
- PDF: Reduce the PDF parser to the basic binary detection (PDF/not PDF) until we have a better/more robust PDF parser
- MP3: Fix the byte length of MPEG frames calculation to correctly account for ID3V1 and ID3V2 instead of ID3V1 twice
- MP3: Remove the workaround for `id3tag` choking on non-matching genre strings (bumps dependency on `id3tag`)
- Use Measurometer provided by the measurometer gem
- Ogg: Add support for the Ogg format
- Make all reads in the MOOV decoder strict - fail early if reads are improperly sized
- Disable parsing for `udta` atoms in MP4/MOV since we do not have a good way of parsing them yet
- Use the same TIFF parsing flow for CR2 files as it seems we are not very reliable yet. The CR2 parser will need some work.
- Make sure JSON data never contains NaN, fix the test that was supposed to verify that but didn't
- Forcibly UTF-8 sanitize all EXIF data when building JSON
- Add a fixture to make sure all parsers can cope with an empty file when using `parse_http`
- Terminate the ZIP parser early with empty input
- Terminate the MP3 parser early with empty or too small input
- Handle BMP files with pixel array offsets larger than 54
- Avoid ZIP checks in the JPEG parser which are no longer necessary
- Replace the homegrown ID3 parser with id3tag - this introduces `id3tag` as a dependency in addition to `exifr`, but the gains are substantial.
- Ensure JPEG recognition only runs when the JPEG SOI marker is detected at the start of file. Previously the JPEG parser would scan for the marker, sometimes finding it (appropriately) in places like... MP3 album artwork inside ID3 tags. Or Keynote documents. Or whatnot - lots of things have JPEG thumbnails embedded.
- Make sure all strings going to the JSON representations of parse results are encoded as UTF-8 or escaped
- Make sure the `VERSION` constant is available in the loaded gem. Previously the constant would be made available by Bundler when developing the library - since it loads the `.gemspec` which, in turn, requires the version.rb file, but when used as a gem the version.rb file would not end up being loaded.
- Reinstate support for Ruby 2.2.0
- Fix support for JRuby 9.0
- Relay upstream status from `RemoteIO` in the `status_code` attribute (returns an `Integer`)
- Add `Image#display_width_px` and `Image#display_height_px` for EXIF/aspect corrected display dimensions, and provide those values from a few parsers already. Also make full EXIF data available for JPEG/TIFF in `intrinsics[:exif]`
- Adds `limits_config` option to `FormatParser.parse()` for tweaking buffers and read limits externally
- Adds the `format_parser_inspect` binary for parsing a file from the commandline and returning results in JSON
- Adds the `FormatParser.parse_at(path)` convenience method
- Fix a TIFF parsing regression introduced in 0.3.1 that led to all TIFFs being incorrectly parsed
- Fix a JPEG parsing regression introduced in 0.9.1
- Make sure MP3 parser returns `nil` when encountering infinite duration
- Do not read JPEG APP1 markers that contain no EXIF data
- Explicitly replace `Float::INFINITY` values in `AttributesJSON` with `nil` as per JSON convention
- Make sure the cached pages in `Care` are explicitly deleted after each `parse` call (should help GC)
- Raise the pagefaults restriction to 16 to cope with "too many useless markers in JPEGs" scenario once more
- Perf: Make JPEG parser bail out earlier if no marker is found while scanning through 1024 bytes of data
- Add a parser for the BMP image file format
- Add `Measurometer` for applying instrumentation around FormatParser operations. See documentation for usage.
- Configure read limits / pagefault limits centrally so that those limits make sense together
- Double the cache page size once more
- We no longer need exifr/jpeg
- Fix EXIF parsing in JPEG files
- Reject Keynote documents in JPEG parser
- Do not raise EXIFR errors for keynote files
- Correct broken comment for the audio nature
- Raise the cache page size during detection
- Fix ZIP entry filename parsing
- Add FLAC parser
- Add parse_atom_children_and_data_fields support
- Add basic detection of Office files
- Optimize EOCD signature lookup
- Adds a basic PDF parser
- Make sure root: and to_json without arguments work
- ZIP file format support
- Fix the bug with EXIF dimensions being used instead of pixel dimensions
- Pagefault limit
- Add seek modes required by exifr
- Implement a sane to_json as well
- Add default as_json
- Test on 2.5.0
- Remove post install warning
- Moved aiff_parser_spec.rb to spec/parsers
- CR2 file support
- Add require 'set' to format_parser.rb
- Use register_parser for natures/fmts
- Reverse API changes to support :first as default and add opts to parse_http
- Implement and comply with rubocop
- JPEG parser and Care fixes
- Add format and count options to parse_http
- Return first result as default
- Use hashes for MOOV atom default fields
- Implement parser DSL
- Fix read(0) on Care::IOWrapper, introduce top-level tests
- Fix mp3 parsing bug
- Add MOOV parser
- Add FDX parser
- Remove dry-structs
- New interface updates
- Add WAV parser
- Add MP3 parser
- Add FileInformation#intrinsics
- Disallow negative Care offsets
- Introduce a restrictive IO subset wrapper
- Switch rewind for seek in exif parser
- Prep for OSS release
- Add fuzz spec
- Improve orientation parsing
- Optimisation for PNG and invalid input protection on JPEG
- Add AIFF parser
- Add parsers for PNG, JPG, TIFF, PSD
- Add GIF parser
- Add DPX parser
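
The ISOBMFF entry about boxes with size = 0 describes a general pitfall of box walkers: if the declared size is added to the offset unchanged, a zero size never advances the cursor. The following is a minimal, self-contained sketch of one way to guard against that. It is not the code used by this library; `top_level_boxes` is a hypothetical helper written for illustration, and it only uses the Ruby standard library.

```ruby
require 'stringio'

# Hypothetical helper (not part of format_parser): walks the top-level
# ISOBMFF boxes in `io` and returns an array of [type, offset, size].
def top_level_boxes(io, file_size)
  boxes = []
  pos = 0
  while pos + 8 <= file_size
    io.seek(pos)
    header = io.read(8)
    break if header.nil? || header.bytesize < 8

    # 32-bit big-endian size followed by a 4-character box type
    size, type = header.unpack('Na4')
    header_size = 8

    if size == 1
      # size == 1 means a 64-bit size follows the type field
      large = io.read(8)
      break if large.nil? || large.bytesize < 8
      size = large.unpack1('Q>')
      header_size = 16
    elsif size == 0
      # size == 0 means the box extends to the end of the file
      size = file_size - pos
    end

    # A size smaller than the header would never advance the cursor
    break if size < header_size

    boxes << [type, pos, size]
    pos += size
  end
  boxes
end

# Example input: a 16-byte ftyp box followed by an mdat box declared with size 0
data = [16, 'ftyp'].pack('Na4') + 'isom' + [0].pack('N') +
       [0, 'mdat'].pack('Na4') + 'payload'
p top_level_boxes(StringIO.new(data), data.bytesize)
# prints [["ftyp", 0, 16], ["mdat", 16, 15]]
```

The sketch stops at the first box whose declared size is smaller than its own header, which trades completeness for a guarantee that the loop terminates.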