Old implementation of the MaxTract system for re-engineering mathematical PDF documents.

assistech-iitdelhi/MaxTract

 
 

MaxTract

Old implementation of the MaxTract system for re-engineering mathematical PDF documents.

Directory Structure

  • src: Source files.

    • ccl-tiff: Connected component labelling for TIFF files
    • pdfExtract: PDF extraction
    • linearize: Linearization grammar
    • drivers: Output drivers (not yet uploaded)
    • main: MaxTract main file that pulls it all together (not yet uploaded)
  • samples: Some sample files.

    • pdf: Sample PDF documents.
    • tif: Sample TIFF files for testing ccl-tiff. Some are multipage TIFFs.
    • json: Sample JSON output files.

Installation

  • Each subdirectory of src contains the sources for one submodule, a Makefile that compiles it, and a README file with additional details.
  • Install the prerequisites noted below, then descend into each subdirectory and run 'make':
    • ccl-tiff depends on libtiff for TIFF image processing. Install it using 'apt-get install libtiff5-dev'.
    • pdfExtract/linearizer depends on json-wheel for processing JSON files and on pdftk for decompressing PDF files. Install pdftk using 'apt-get install pdftk'. Install json-wheel from the tar file in this directory. If you see the error 'Package `netstring' not found', run 'apt install libocamlnet-ocaml-dev'. If that doesn't work, try 'opam install netstring'. (opam is the package manager for OCaml, i.e. the equivalent of pip for Python.)
    • The linearize module has no additional prerequisites.
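
The build steps above could be scripted roughly as follows. This is only a sketch, not part of the repository: the function name build_all and the MAKE override are assumptions introduced for illustration, and it only visits the direct subdirectories of src, so nested modules would still need a manual 'make'.

```shell
#!/bin/sh
# Sketch: run 'make' in every direct subdirectory of src that has a Makefile.
# Prerequisites from the list above are assumed to be installed already, e.g.:
#   sudo apt-get install libtiff5-dev pdftk
# build_all and the MAKE override are hypothetical names, not part of MaxTract.
# Usage: build_all src

build_all() {
    src_dir="${1:-src}"
    make_cmd="${MAKE:-make}"
    for dir in "$src_dir"/*/; do
        # Skip directories without a Makefile (e.g. modules not yet uploaded).
        [ -f "${dir}Makefile" ] || continue
        echo "Building ${dir}"
        # Build in a subshell so the caller's working directory is unchanged.
        (cd "$dir" && $make_cmd) || return 1
    done
}
```

Calling 'build_all src' from the repository root stops at the first submodule whose build fails, which makes a missing prerequisite easier to spot.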

Languages

  • OCaml 78.8%
  • Makefile 12.9%
  • C 5.9%
  • Python 2.1%
  • Shell 0.3%