Discrepancies in citation format when using citar-insert-reference #784

Open
benthamite opened this issue Jun 14, 2023 · 11 comments

@benthamite
Contributor

Apologies if I'm missing something obvious, but I notice that when I insert a formatted reference with citar-insert-reference, there are various discrepancies between the inserted reference and the same reference as it appears when exported with one of the org-mode export commands, such as org-md-export-to-markdown. Most notably, the titles are not capitalized correctly (e.g. the braces surrounding a word are not respected).

As an example, consider the following BibTeX entry:

@online{Hanson2023CanHumansBe,
  abstract =	 {It is one of the most fundamental questions in the
                  social and human sciences: how culturally plastic are
                  people? Many anthropologists have long championed the
                  view that humans are very plastic; with matching
                  upbringing people can be made to behave a very wide
                  range of ways, and to want a very wide range of
                  things. Others say human nature is far more
                  constrained, and collect descriptions of "human
                  universals" (See Brown's 1991},
  author =	 {Hanson, Robin},
  langid =	 {english},
  timestamp =	 {2023-06-14 15:12:51 (GMT)},
  title =	 {Can humans be the {FORTRAN} of creatures?},
  url =
                  {https://www.overcomingbias.com/p/how-plastic-are-peoplehtml},
  urldate =	 {2023-06-14},
}

Inserting this reference by invoking citar-insert-reference results in

[1]R. Hanson, “Can humans be the fortran of creatures?” https://www.overcomingbias.com/p/how-plastic-are-peoplehtml (accessed Jun. 14, 2023).

Whereas exporting a file that cites that work via org-md-export-to-markdown will show it in the "bibliography" section as

R. Hanson, “Can humans be the FORTRAN of creatures?” https://www.overcomingbias.com/p/how-plastic-are-peoplehtml (accessed Jun. 14, 2023).

I used the IEEE CSL citation style in this case, but the issue occurs with every style I tried.

@bdarcus
Contributor

bdarcus commented Jun 14, 2023

Are you using the default formatter, or the citeproc-el one?

I realize it's not documented in the README (PR welcome), but I suspect that's it; either you aren't using the citeproc formatter, or you're using a different style for that?

@benthamite
Contributor Author

Thanks for the quick reply.

In my config, the values of citar-citeproc-csl-styles-dir and citar-citeproc-csl-locales-dir are set to org-cite-csl-styles-dir and org-cite-csl-locales-dir, respectively, and citar-format-reference-function is set to citar-citeproc-format-reference. Finally, citar-citeproc-select-csl-style is set to ieee.csl, which is a file that exists in citar-citeproc-csl-styles-dir. Is there anything else that needs to be done for citar-citeproc.el to work properly?
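For readers following along, the setup described above might look roughly like the fragment below. This is only a sketch: the variable `citar-citeproc-csl-style` is an assumption about where the selected style is stored (the comment above names `citar-citeproc-select-csl-style`), and the `require` forms assume the packages are installed.

```elisp
;; Sketch of the configuration described in this comment.
(require 'oc-csl)          ; defines org-cite-csl-styles-dir / -locales-dir
(require 'citar)
(require 'citar-citeproc)

(setq citar-citeproc-csl-styles-dir org-cite-csl-styles-dir
      citar-citeproc-csl-locales-dir org-cite-csl-locales-dir
      citar-format-reference-function #'citar-citeproc-format-reference
      ;; Assumption: this is the variable that holds the chosen style file.
      citar-citeproc-csl-style "ieee.csl")
```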

In case it helps to understand what might be going on, I instrumented citar-citeproc-format-reference and copied the output of each step in the evaluation to the attached file.

debugger-output.txt

@bdarcus
Contributor

bdarcus commented Jun 14, 2023

OK.

To go back to this:

Most notably, the titles are not capitalized correctly (e.g. the braces surrounding a word are not respected).

Here we're using the citar cache, which we populated using parsebib-parse with the display option, rather than parsing the bib on its own. That strips intra-field markup.

Obviously that enhances responsiveness, at the expense of some correctness.

Not sure if there's an easy way to resolve that, or if we could make it configurable.
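A toy illustration of why stripping the braces matters, assuming the downstream conversion lowercases title words that are not brace-protected (which is what the two outputs in this thread suggest). The helpers `my/strip-braces` and `my/sentence-case` are hypothetical and are not how parsebib or citeproc-el are implemented.

```elisp
(require 'subr-x)

(defun my/strip-braces (s)
  "Return S with all curly braces removed (hypothetical helper)."
  (replace-regexp-in-string "[{}]" "" s))

(defun my/sentence-case (s)
  "Lowercase S except its first character and any {braced} spans.
Braces are dropped from the result.  Hypothetical helper; nested
braces and a braced first word are not handled."
  (let ((pos 0) (out ""))
    ;; Copy each braced span verbatim; lowercase the text before it.
    (while (string-match "{\\([^{}]*\\)}" s pos)
      (setq out (concat out
                        (downcase (substring s pos (match-beginning 0)))
                        (match-string 1 s)))
      (setq pos (match-end 0)))
    ;; Lowercase whatever follows the last braced span.
    (setq out (concat out (downcase (substring s pos))))
    (if (string-empty-p out)
        out
      (concat (upcase (substring out 0 1)) (substring out 1)))))

;; Braces still present: the protected word survives.
(my/sentence-case "Can humans be the {FORTRAN} of creatures?")
;; => "Can humans be the FORTRAN of creatures?"

;; Braces stripped first: the word is no longer protected.
(my/sentence-case
 (my/strip-braces "Can humans be the {FORTRAN} of creatures?"))
;; => "Can humans be the fortran of creatures?"
```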

@bdarcus
Contributor

bdarcus commented Aug 24, 2023

@benthamite can you confirm my hunch in my last reply?

@benthamite
Contributor Author

Apologies, I hadn't seen your previous message. I should be able to look into this within the next couple of days.

@benthamite
Contributor Author

Hi @bdarcus,

For testing purposes, I created bibliography.bib:

@online{Hanson2023CanHumansBe,
  abstract =	 {It is one of the most fundamental questions in the
                  social and human sciences: how culturally plastic are
                  people? Many anthropologists have long championed the
                  view that humans are very plastic; with matching
                  upbringing people can be made to behave a very wide
                  range of ways, and to want a very wide range of
                  things. Others say human nature is far more
                  constrained, and collect descriptions of "human
                  universals" (See Brown's 1991},
  author =	 {Hanson, Robin},
  langid =	 {english},
  timestamp =	 {2023-06-14 15:12:51 (GMT)},
  title =	 {Can humans be the {FORTRAN} of creatures?},
  url =
                  {https://www.overcomingbias.com/p/how-plastic-are-peoplehtml},
  urldate =	 {2023-06-14},
}

and config.el:

(setq org-cite-global-bibliography '("bibliography.bib"))
(setq org-cite-export-processors
      '((t . (csl "ieee.csl"))))
(setq citar-bibliography '("bibliography.bib"))

After evaluating the latter, I evaluate (citar-citeproc--itemgetter '("Hanson2023CanHumansBe")), which returns

(("Hanson2023CanHumansBe" (URL . "https://www.overcomingbias.com/p/how-plastic-are-peoplehtml") (title . "Can humans be the fortran of creatures?") (blt-type . "online") (type . "webpage") (language . "en-US") (abstract . "It is one of the most fundamental questions in the social and human sciences: how culturally plastic are people? Many anthropologists have long championed the view that humans are very plastic; with matching upbringing people can be made to behave a very wide range of ways, and to want a very wide range of things. Others say human nature is far more constrained, and collect descriptions of \"human universals\" (See Brown’s 1991") (author ((family . "Hanson") (given . "Robin"))) (accessed (date-parts (2023 6 14)))))

By contrast, if I create document.org

[cite:@Hanson2023CanHumansBe]

#+print_bibliography:

and run org-md-export-to-markdown, I get

<a href="#citeproc_bib_item_1">[1]</a>  

<style>.csl-left-margin{float: left; padding-right: 0em;}
 .csl-right-inline{margin: 0 0 0 1em;}</style><div class="csl-bib-body">
  <div class="csl-entry"><a id="citeproc_bib_item_1"></a>
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline">R. Hanson, “Can humans be the FORTRAN of creatures?” <a href="https://www.overcomingbias.com/p/how-plastic-are-peoplehtml">https://www.overcomingbias.com/p/how-plastic-are-peoplehtml</a> (accessed Jun. 14, 2023).</div>
  </div>
</div>

As you can see, the word "FORTRAN" is in all caps in the exported Markdown, but not in the output of (citar-citeproc--itemgetter '("Hanson2023CanHumansBe")).
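The comparison can also be made programmatically. The sketch below checks, case-sensitively, whether a word survives in the title of an item shaped like the return value shown earlier in this comment. `my/csl-title-contains-p` and `my/sample-items` are hypothetical names, and the sample data is a trimmed copy of that return value.

```elisp
(defun my/csl-title-contains-p (items key word)
  "Return t if the title of KEY in ITEMS contains WORD, case-sensitively.
ITEMS is a list of (KEY . CSL-ALIST) pairs.  Hypothetical helper."
  (let* ((fields (cdr (assoc key items)))
         (title (alist-get 'title fields))
         ;; Bind to nil so that \"FORTRAN\" does not match \"fortran\".
         (case-fold-search nil))
    (and (stringp title)
         (string-match-p (regexp-quote word) title)
         t)))

;; Trimmed copy of the item data reported in this comment.
(defvar my/sample-items
  '(("Hanson2023CanHumansBe"
     (title . "Can humans be the fortran of creatures?")
     (type . "webpage"))))

(my/csl-title-contains-p my/sample-items "Hanson2023CanHumansBe" "FORTRAN")
;; => nil
(my/csl-title-contains-p my/sample-items "Hanson2023CanHumansBe" "fortran")
;; => t
```

In a live session, the result of (citar-citeproc--itemgetter '("Hanson2023CanHumansBe")) could be passed as ITEMS instead of the sample data; note that the double-dash name marks it as an internal function that may change.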

I'm not entirely sure this is the kind of test you wanted me to run. Please let me know if there's anything else I should do. I'm attaching the relevant files in case it helps you reproduce the issue.
files.zip

@bdarcus
Contributor

bdarcus commented Aug 29, 2023

Thanks.

I'm almost certain my assumption is correct: using our cache for the formatting means the TeX markup gets stripped before citeproc sees it.

Still not sure what we can, or should, do about that.

@leinfink

leinfink commented Jul 9, 2024

I wanted to bump this, as I just stumbled upon the same issue.

@leinfink

leinfink commented Jul 9, 2024

Maybe a quick fix would be a manual (prefix) toggle to choose whether to use the cache?

@leinfink

leinfink commented Jul 9, 2024

Oh, never mind, I think that was just an issue with a CSL style. Am I right in assuming that this actually got fixed?

@bdarcus
Contributor

bdarcus commented Jul 9, 2024

Nothing changed on this end.

And I should clarify: it's not per se the cache, but rather that we use parsebib-parse with the display option to populate it.

We would do the same without a cache; the alternative would be parsing the input ourselves.
