A document answering some questions we've received and thought to note down.
This will be updated as we receive more questions and feedback. Contributions welcome!
Answer:
fieldsOfStudy exists for a paper if we have this information from an external source. It is a list of strings.
s2FieldsOfStudy is a list of Dict[str, str], each with keys category and source:
- if source is external, this is the same as fieldsOfStudy
- if source is s2-fos-model, this is determined by our internally developed classifier.
s2FieldsOfStudy is a replacement for fieldsOfStudy; the latter will be deprecated and removed eventually.
>>> import json
>>> import requests
>>> r = requests.get('https://api.semanticscholar.org/graph/v1/paper/corpusid:4698432', params={'fields': 'fieldsOfStudy,s2FieldsOfStudy'})
>>> print(json.dumps(r.json(), indent=2))
{
  "paperId": "1fec9d41d372267b4474f18cbeadd806c8b67adb",
  "fieldsOfStudy": [
    "Computer Science"
  ],
  "s2FieldsOfStudy": [
    {
      "category": "Computer Science",
      "source": "external"
    },
    {
      "category": "Computer Science",
      "source": "s2-fos-model"
    }
  ]
}
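To separate the externally sourced categories from the model-predicted ones, the s2FieldsOfStudy list can be grouped by its source key. The following is a minimal sketch, assuming a decoded response shaped like the one above. The helper name group_fields_by_source is our own and not part of the API, and treating a missing or null field as an empty list is a defensive assumption.

```python
from collections import defaultdict


def group_fields_by_source(paper):
    """Group s2FieldsOfStudy categories by their source.

    `paper` is a decoded JSON response (a dict). The helper name is
    hypothetical; only the keys "s2FieldsOfStudy", "category" and
    "source" come from the response shown above.
    """
    grouped = defaultdict(list)
    # Assumption: a missing or null field is treated as an empty list.
    for entry in paper.get("s2FieldsOfStudy") or []:
        grouped[entry["source"]].append(entry["category"])
    return dict(grouped)


# Same data as the example response above, written as a Python dict.
paper = {
    "paperId": "1fec9d41d372267b4474f18cbeadd806c8b67adb",
    "fieldsOfStudy": ["Computer Science"],
    "s2FieldsOfStudy": [
        {"category": "Computer Science", "source": "external"},
        {"category": "Computer Science", "source": "s2-fos-model"},
    ],
}
print(group_fields_by_source(paper))
```

Reading only the s2-fos-model entry (or only the external one) then becomes a dictionary lookup, which may ease migrating off fieldsOfStudy once it is removed.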
- When you request the embedding for a given paper, we return a version string too.
- This is an internal identifier, and does not correspond to any particular version of the HuggingFace SPECTER model.
- We are working on SPECTER 2.0 and will serve both embeddings for some period of time.
For example, this link returns:
{
  "paperId": "cb92a7f9d9dbcf9145e32fdfa0e70e2a6b828eb1",
  "embedding": {
    "model": "specter@v0.1.1",
    "vector": [
      -5.399959087371826,
      -4.762187957763672,
      ....
    ]
  }
}
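Because the version string is an internal identifier and more than one embedding version may be served for a while, it can help to treat the model value as an opaque label and avoid mixing vectors that carry different labels. The following is a rough sketch under that assumption. The function name, the mismatch check, and the sample vectors are our own illustration, not API behavior.

```python
import math


def cosine_similarity(emb_a, emb_b):
    """Cosine similarity between two "embedding" objects.

    Each argument is a dict with "model" and "vector" keys, as in the
    response above. The function name and the mismatch check are
    hypothetical, added for illustration only.
    """
    # Treat the model string as an opaque label: compare for equality only.
    if emb_a["model"] != emb_b["model"]:
        raise ValueError("embeddings come from different models")
    a, b = emb_a["vector"], emb_b["vector"]
    if len(a) != len(b):
        raise ValueError("vectors have different lengths")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Note: an all-zero vector would raise ZeroDivisionError here.
    return dot / (norm_a * norm_b)


# Made-up two-dimensional vectors with a placeholder model label.
first = {"model": "example-model", "vector": [1.0, 0.0]}
second = {"model": "example-model", "vector": [0.0, 1.0]}
print(cosine_similarity(first, second))
```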
See https://www.semanticscholar.org/faq#influential-citations
See https://www.semanticscholar.org/faq#citation-intent
My suggestion is to handle 5xx responses gracefully by building retries with exponential backoff into your code. We have hit scaling issues in the past and cannot guarantee perfect availability, although we are trying our best to adhere to an SLA internally.
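One possible shape for such a retry loop is sketched below. It is not an official client: the function name, the default attempt count and delays, and the use of random jitter are all assumptions chosen for illustration. The fetch argument is any zero-argument callable that returns an object with a status_code attribute, such as a requests.Response.

```python
import random
import time


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call `fetch()` until it returns a non-5xx status or attempts run out.

    All names and default values are illustrative assumptions. The last
    response is returned even if it is still a 5xx, so the caller can
    decide what to do with it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code < 500:
            return response
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Exponential backoff: 1x, 2x, 4x, ... the base delay, capped at
        # max_delay, plus random jitter so clients do not retry in lockstep.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
    return response
```

With the requests library this could be called as call_with_backoff(lambda: requests.get(url, params=params)), where url and params are whatever you would normally pass. The sleep parameter exists only so the waiting can be replaced in tests.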
Separately, here are some 4xx error codes you can expect to receive:
- 403 - the API key you've sent is incorrect
- 429 - you're sending too many requests, please slow down
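In client code it may be convenient to turn these two status codes into distinct exceptions, so that a bad API key fails fast while rate limiting can be handled by slowing down. The following is a small sketch. The exception classes and the function name are hypothetical, and the messages simply repeat the descriptions above.

```python
class InvalidApiKeyError(Exception):
    """Raised for 403: the API key sent with the request is incorrect."""


class RateLimitedError(Exception):
    """Raised for 429: too many requests, the caller should slow down."""


def check_client_error(status_code):
    """Raise a descriptive exception for the 4xx codes listed above.

    Any other status code is ignored and left for the caller to handle.
    All names here are illustrative, not part of the API.
    """
    if status_code == 403:
        raise InvalidApiKeyError("403: the API key you've sent is incorrect")
    if status_code == 429:
        raise RateLimitedError("429: too many requests, please slow down")
```

A caller would typically run check_client_error(response.status_code) right after each request and catch RateLimitedError to pause before trying again.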
This is perhaps a disclaimer: missing/ambiguous/incorrect data could be present for any field.
Please consult the FAQ for more information on how you can either make the correction yourself or report it to customer support.