Maintaining roman numerals for pagination #115

Merged
merged 2 commits into from
Jul 8, 2024
11 changes: 0 additions & 11 deletions adsingestp/parsers/elsevier.py
@@ -57,24 +57,13 @@ def _parse_issue(self):
self.base_metadata["issue"] = self.record_header.find("prism:number").get_text()

     def _parse_page(self):
-        regex_roman = re.compile(r"[ivxIVX]+")
         # TODO the perl has some code for first/last pages that start with L, e, CO, IFC - add that
         # TODO there's also some regex in the perl checking for - or , - check/add that
         if self.record_header.find("prism:startingPage"):
             fpage = self.record_header.find("prism:startingPage").get_text()
-            if regex_roman.match(fpage):
-                try:
-                    fpage = utils.ROMAN_TO_NUMBER[fpage.lower()]
-                except KeyError:
-                    logger.warning("Can't convert Roman numeral %s to a number", fpage)
             self.base_metadata["page_first"] = fpage
         if self.record_header.find("prism:endingPage"):
             lpage = self.record_header.find("prism:endingPage").get_text()
-            if regex_roman.match(lpage):
-                try:
-                    lpage = utils.ROMAN_TO_NUMBER[lpage.lower()]
-                except KeyError:
-                    logger.warning("Can't convert Roman numeral %s to a number", lpage)
             self.base_metadata["page_last"] = lpage
         if self.record_meta.find("ce:article-number"):
             self.base_metadata["electronic_id"] = self.record_meta.find(
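
To make the effect of the 11 deleted lines concrete, here is a minimal sketch contrasting the removed behaviour with the new one. It is not code from the repository: `ROMAN_TO_NUMBER` below is a hypothetical eight-entry stand-in for `utils.ROMAN_TO_NUMBER` (its real contents and value types are not visible in this diff), and `page_before` / `page_after` are illustrative names.

```python
import re

# Hypothetical stand-in for utils.ROMAN_TO_NUMBER; the real table is not shown in the diff.
ROMAN_TO_NUMBER = {
    "i": 1, "ii": 2, "iii": 3, "iv": 4,
    "v": 5, "vi": 6, "vii": 7, "viii": 8,
}

REGEX_ROMAN = re.compile(r"[ivxIVX]+")


def page_before(page):
    """Mimic the removed branch: convert to a number when the lookup succeeds."""
    if REGEX_ROMAN.match(page):
        try:
            return ROMAN_TO_NUMBER[page.lower()]
        except KeyError:
            # The removed code logged a warning here and kept the original string.
            pass
    return page


def page_after(page):
    """Mimic the code after this change: the page string is stored unchanged."""
    return page


for raw in ("vii", "viii", "I", "123"):
    print(raw, "->", repr(page_before(raw)), "|", repr(page_after(raw)))
```

After the change, values such as `vii` and `I` are stored exactly as they appear in the source XML, including their case, which appears to be what the two new fixtures below exercise.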
124 changes: 124 additions & 0 deletions tests/stubdata/input/els_roman_num_1.xml
@@ -0,0 +1,124 @@
<doc:document
xmlns:doc="http://www.elsevier.com/xml/document/schema"
xmlns:dp="http://www.elsevier.com/xml/common/doc-properties/schema"
xmlns:cps="http://www.elsevier.com/xml/common/consyn-properties/schema"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dct="http://purl.org/dc/terms/"
xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/"
xmlns:oa="http://vtw.elsevier.com/data/ns/properties/OpenAccess-1/"
xmlns:bam="http://vtw.elsevier.com/data/voc/ns/bam-vtw-1/"
xmlns:cp="http://vtw.elsevier.com/data/ns/properties/Copyright-1/"
xmlns:cja="http://www.elsevier.com/xml/cja/schema"
xmlns:ja="http://www.elsevier.com/xml/ja/schema"
xmlns:bk="http://www.elsevier.com/xml/bk/schema"
xmlns:ce="http://www.elsevier.com/xml/common/schema"
xmlns:mml="http://www.w3.org/1998/Math/MathML"
xmlns:cals="http://www.elsevier.com/xml/common/cals/schema"
xmlns:tb="http://www.elsevier.com/xml/common/table/schema"
xmlns:sa="http://www.elsevier.com/xml/common/struct-aff/schema"
xmlns:sb="http://www.elsevier.com/xml/common/struct-bib/schema"
xmlns:xlink="http://www.w3.org/1999/xlink">
<rdf:RDF>
<rdf:Description rdf:about="http://dx.doi.org/10.1016/0038-0717(95)90036-5">
<dct:format>application/xml</dct:format>
<dct:title>Preface</dct:title>
<dct:creator>H.V.A. Bushby</dct:creator>
<dct:creator>R.A. Date</dct:creator>
<dct:creator>P.J. Dart</dct:creator>
<dct:description>Soil Biology and Biochemistry 27 (1995) vii-viii. doi:10.1016/0038-0717(95)90036-5</dct:description>
<prism:aggregationType>journal</prism:aggregationType>
<prism:publicationName>Soil Biology and Biochemistry</prism:publicationName>
<prism:copyright>Copyright @ unknown. Published by Elsevier Ltd</prism:copyright>
<dct:publisher>Elsevier Ltd</dct:publisher>
<prism:issn>0038-0717</prism:issn>
<prism:volume>27</prism:volume>
<prism:number>4</prism:number>
<prism:coverDisplayDate/>
<prism:coverDate>1995</prism:coverDate>
<prism:pageRange>vii-viii</prism:pageRange>
<prism:startingPage>vii</prism:startingPage>
<prism:endingPage>viii</prism:endingPage>
<prism:doi>10.1016/0038-0717(95)90036-5</prism:doi>
<prism:url>http://dx.doi.org/10.1016/0038-0717(95)90036-5</prism:url>
<dct:identifier>doi:10.1016/0038-0717(95)90036-5</dct:identifier>
<dp:availableOnlineInformation>
<bam:availableOnline
xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">2003-12-09T00:00:00.000Z

</bam:availableOnline>
<bam:vorAvailableOnline
xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
</dp:availableOnlineInformation>
</rdf:Description>
</rdf:RDF>
<dp:document-properties>
<dp:raw-text
xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> Soil Biol. Biochem. Vol. 27, No. 4/5, pp. vii-viii, 1995
Pergamon Elsevier Science Ltd. Printed in Great Britain

PREFACE

This issue of Soil Biology &amp; Biochemistry contains many of the contributions presented at the 10th Australian Nitrogen Fixation Conference held in Brisbane, Australia on the 7-10 September 1993. These conferences are presented by the Australian Society for Nitrogen Fixation.
The first conference (1955) arose from a recognition of a critical need for attention to the quality and use of legume inoculants, especially in the context of problems faced by the then fledgling commercial inoculant manufacturing industry. The two subsequent conferences concentrated on a similar theme, whereas the 4th and 5th conferences emphasized the importance of selection of strains of Rhizobium and Bradyrhizobium, factors affecting nodulation and the physiology of fixation. Reflecting the increased diversity (and specialization) of research on all aspects of N2 fixation, the 6th and 7th conferences (1979 and 1984, respectively) included sections on the ecology of root-nodule bacteria in soil and rhizosphere, biochemistry and physiology of the N2-reduction process and techniques of N measurement. Also included in the 7th conference were sections on non-legume symbioses and asymbiotic N2 fixation. Interest in the molecular genetics of nodulation and the N2-fixation process has expanded rapidly since the 6th meeting, in which there was only one paper, to be a predominant proportion of the contributions at the last three conferences. There were fewer papers on the ecology of rhizobia or bradyrhizobia in this meeting compared to the 9th conference (5 vs 14 papers) but an increase in the number of papers reporting on N2 fixation in agricultural production systems (11 vs 16). The emphasis in molecular studies changed from methodology and sequencing details to the significance of results for the end user. This may reflect the impact of simpler and more reliable techniques and may herald a resurgence of interest in ecological studies.
Recognizing the diverse specialities of those involved in research on N2 fixation, the theme of the 10th conference, Genetics, Microbial Ecology and Nitrogen Fixation-Is there a Sustainable Symbiosis?, was an attempt to integrate the sometimes separate disciplines of microbial ecology, molecular biology, genetics and the varied studies on N2 fixation. The theme also reflected an increasingly difficult research environment in which resources are more limited than in earlier times and the words "relevance", "sustainability" and "increased efficiency" consistently appear in reports from scientific administrators, funding bodies and politicians.
Participants in the 10th conference were invited to contribute to discussion of seven topics: metabolic pathways associated with N2 fixation; nodulation and N2 fixation with trees and shrubs; non-symbiotic N2 fixation; microbe-plant genetic interactions in N2 fixation; N2 fixation in agricultural production systems; nodulation and N2 fixation under environmental stress; and advances in inoculant and seed inoculation technology. In addition, three workshops considered the special topics of: the future of N2 fixation research in Australia; t5N methodology; and research and commercial legume growers.
Although primarily a national conference, international participants from 16 countries reflected the wide interest in N2 fixation for both primary production and sustainability.
The 11th Australian Nitrogen Fixation Conference will be in Perth, Western Australia, 23-27 September 1996. For further information contact:

The Secretary

Australian Society for Nitrogen Fixation
Faculty of Agriculture/CLIMA
University of Western Australia
Nedlands, WA 6009
Australia

All papers were accepted for publication on the basis of scientific merit between 1 December 1993 and 31 January 1994.

vii
viii Preface

Acknowledgements-We are grateful to John S. Waid, Executive editor of Soil Biology &amp; Biochemistry for his encouragement and editorial assistance in preparing the papers for publication in the journal.The Australian Society for Nitrogen Fixation gratefully acknowledges the support received for the conference and presenting these proceedings from the following businesses and institutions: Australian International Development Assistance Bureau; Bio-Care Technology Pty Ltd; CSIRO, Division of Tropical Crops and Pastures; Grains Research and Development Corporation; Inoculant Services Australia Pty Ltd; Pacific Seeds Pty Ltd; Queensland Department of Primary Industry, Plant Protection Division; University of Queensland, Department of Agriculture.

H. V. A. BUSHBY, R. A. DATE and P. J. DART

Guest Editors


</dp:raw-text>
<dp:aggregation-type>Journals</dp:aggregation-type>
<dp:version-number>S350.2</dp:version-number>
</dp:document-properties>
<ja:simple-article version="5.0" xml:lang="en" docsubtype="edi">
<ja:item-info>
<ja:jid>SBB</ja:jid>
<ja:aid>90036</ja:aid>
<ce:pii>0038-0717(95)90036-5</ce:pii>
<ce:doi>10.1016/0038-0717(95)90036-5</ce:doi>
<ce:copyright type="unknown" year="1995"/>
</ja:item-info>
<ja:simple-head>
<ce:title>Preface</ce:title>
<ce:author-group>
<ce:author>
<ce:given-name>H.V.A.</ce:given-name>
<ce:surname>Bushby</ce:surname>
<ce:roles>Guest Editor</ce:roles>
</ce:author>
<ce:author>
<ce:given-name>R.A.</ce:given-name>
<ce:surname>Date</ce:surname>
<ce:roles>Guest Editor</ce:roles>
</ce:author>
<ce:author>
<ce:given-name>P.J.</ce:given-name>
<ce:surname>Dart</ce:surname>
<ce:roles>Guest Editor</ce:roles>
</ce:author>
</ce:author-group>
</ja:simple-head>
</ja:simple-article>
</doc:document>
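
The fixture above carries its pagination in `prism:startingPage` and `prism:endingPage`. As a rough illustration of the values the parser would now be expected to keep for it, the sketch below reads those two fields from a trimmed copy of the fixture using the standard-library `xml.etree.ElementTree`. This is a stand-in under stated assumptions: the real parser calls `.find(...).get_text()` on its own parsed tree, and `extract_pages`, `SNIPPET` and the `<root>` wrapper are hypothetical names used only here.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the fixture's xmlns:prism declaration.
PRISM_NS = {"prism": "http://prismstandard.org/namespaces/basic/2.0/"}

# Trimmed, hypothetical excerpt of the fixture: only the pagination fields.
SNIPPET = """<root xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/">
  <prism:pageRange>vii-viii</prism:pageRange>
  <prism:startingPage>vii</prism:startingPage>
  <prism:endingPage>viii</prism:endingPage>
</root>"""


def extract_pages(xml_text):
    """Return first/last page strings unchanged, with no Roman numeral conversion."""
    root = ET.fromstring(xml_text)
    pages = {}
    first = root.find(".//prism:startingPage", PRISM_NS)
    if first is not None and first.text:
        pages["page_first"] = first.text
    last = root.find(".//prism:endingPage", PRISM_NS)
    if last is not None and last.text:
        pages["page_last"] = last.text
    return pages


pages = extract_pages(SNIPPET)
print(pages)
```

The same function applied to the second fixture's pagination fields would yield `I` for both keys, with the upper case preserved.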
24 changes: 24 additions & 0 deletions tests/stubdata/input/els_roman_num_2.xml
@@ -0,0 +1,24 @@
<doc:document xmlns:doc="http://www.elsevier.com/xml/document/schema" xmlns:dp="http://www.elsevier.com/xml/common/doc-properties/schema" xmlns:cps="http://www.elsevier.com/xml/common/consyn-properties/schema" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dct="http://purl.org/dc/terms/" xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/" xmlns:oa="http://vtw.elsevier.com/data/ns/properties/OpenAccess-1/" xmlns:bam="http://vtw.elsevier.com/data/voc/ns/bam-vtw-1/" xmlns:cp="http://vtw.elsevier.com/data/ns/properties/Copyright-1/" xmlns:cja="http://www.elsevier.com/xml/cja/schema" xmlns:ja="http://www.elsevier.com/xml/ja/schema" xmlns:bk="http://www.elsevier.com/xml/bk/schema" xmlns:ce="http://www.elsevier.com/xml/common/schema" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:cals="http://www.elsevier.com/xml/common/cals/schema" xmlns:tb="http://www.elsevier.com/xml/common/table/schema" xmlns:sa="http://www.elsevier.com/xml/common/struct-aff/schema" xmlns:sb="http://www.elsevier.com/xml/common/struct-bib/schema" xmlns:xlink="http://www.w3.org/1999/xlink"><rdf:RDF><rdf:Description rdf:about="http://dx.doi.org/10.1016/S0047-2484(09)00231-0"><dct:format>application/xml</dct:format><dct:title>Book Reviews</dct:title><dct:description>Journal of Human Evolution 58 (2010) I-I. 
doi:10.1016/S0047-2484(09)00231-0</dct:description><prism:aggregationType>journal</prism:aggregationType><prism:publicationName>Journal of Human Evolution</prism:publicationName><prism:copyright>Copyright @ 2009 Published by Elsevier Ltd All rights reserved.</prism:copyright><dct:publisher>Elsevier Ltd</dct:publisher><prism:issn>0047-2484</prism:issn><prism:volume>58</prism:volume><prism:number>2</prism:number><prism:coverDisplayDate/><prism:coverDate>2010</prism:coverDate><prism:pageRange>I-I</prism:pageRange><prism:startingPage>I</prism:startingPage><prism:endingPage>I</prism:endingPage><prism:doi>10.1016/S0047-2484(09)00231-0</prism:doi><prism:url>http://dx.doi.org/10.1016/S0047-2484(09)00231-0</prism:url><dct:identifier>doi:10.1016/S0047-2484(09)00231-0</dct:identifier><dp:availableOnlineInformation><bam:availableOnline xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">2010-01-25T00:00:00.000Z</bam:availableOnline><bam:vorAvailableOnline xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/></dp:availableOnlineInformation></rdf:Description></rdf:RDF><dp:document-properties><dp:raw-text xmlns:cp="http://www.elsevier.com/xml/common/consyn-properties/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Book Reviews
Reviewed in February at http://www.elsevierblogs.com/jhevReviews
Aziz F, Morwood MJ, van den Bergh GD, eds. (2009) Pleistocene Geology, Palaeontology and Archaeology of the Soa Basin, Central Flores,
Indonesia. Publication of the Centre for Geological Survey, Ministry of Energy and Mineral Resources, Republic of Indonesia. 146 pp.
Haile-Selassie Y &amp; G WoldeGabriel (2009) Ardipithecus kadabba. University of California Press. 664 pp. ISBN 978-0520254404.
Wilkins JS (2009) Species: the History of an Idea. U of California Press. 305 pp. ISBN 978-0-520-26085-6.
Book Review Editor: Dr. Rebecca Rogers Ackermann, Ph.D.
Contents lists available at ScienceDirect
Journal of Human Evolution
journal homepage: www.elsevier.com/locate/jhevol
Journal of Human Evolution 58 (2010) I
doi:10.1016/S0047-2484(09)00231-0
</dp:raw-text><dp:aggregation-type>Journals</dp:aggregation-type><dp:version-number>S300.1</dp:version-number></dp:document-properties><ja:simple-article docsubtype="edb" version="5.0" xml:lang="en">
<ja:item-info>
<ja:jid>YJHEV</ja:jid>
<ja:aid>80000859</ja:aid>
<ce:pii>S0047-2484(09)00231-0</ce:pii>
<ce:doi>10.1016/S0047-2484(09)00231-0</ce:doi>
<ce:copyright type="other" year="2009"/>
</ja:item-info>
<ja:simple-head>
<ce:title>Book Reviews</ce:title>
</ja:simple-head>
</ja:simple-article></doc:document>