From 6e8f237eea66c917ea556a064e156d0d06e226d4 Mon Sep 17 00:00:00 2001
From: griff-rees <60181741+griff-rees@users.noreply.github.com>
Date: Tue, 6 Dec 2022 02:29:58 +0000
Subject: [PATCH] Add a footnote reference following #55

---
 docs/README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/README.md b/docs/README.md
index 39d6d64..a26ae2c 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -4,7 +4,7 @@

 `alto2txt` converts `XML` `ALTO`/`METS` Optical Character Recognition (OCR) scans into plaintext files with minimal metadata.

-`ALTO` and `METS` are industry standards targeting newspaper digitization used by hundreds of modern, large-scale newspaper digitization projects. One text file is output per article, each complemented by one `XML` metadata file.
+`ALTO` and `METS` are industry standards maintained by the US Library of Congress targeting newspaper digitization used by hundreds of modern, large-scale newspaper digitization projects. One text file is output per article, each complemented by one `XML` metadata file[^CHNC].

 [`METS` (Metadata Encoding and Transmission Standard)](http://www.loc.gov/standards/mets/) is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed in `XML`. [`ALTO` (Analyzed Layout and Text Objects)](https://www.loc.gov/standards/alto/) is an [`XML schema`](https://en.wikipedia.org/wiki/XML_schema) for technical metadata describing the layout and content of text resources such as book or newspaper pages. `ALTO` is often used in combination with `METS` but can also be used independently. Details of the `ALTO` schema are avilable at https://github.com/altoxml/schema.

@@ -147,4 +147,6 @@ Quality assurance is performed to check for:
 * Check and ensure that articles that span multiple pages are pulled into a single article file.
 * Smarter handling of articles spanning multiple pages.

+[^CHNC]: For a more detailed description see: https://www.coloradohistoricnewspapers.org/forum/what-is-metsalto/
+
 > Last updated 2022-12-06