This repository has been archived by the owner on Aug 12, 2021. It is now read-only.
maziyarpanahi released this 19 Mar 15:08
Russian Models and Pipelines
We are happy to announce Spark NLP pre-trained Russian models and pipelines.
Models:
Model | Name | Language |
---|---|---|
LemmatizerModel (Lemmatizer) | lemma | ru |
PerceptronModel (POS UD) | pos_ud_gsd | ru |
NerDLModel | wikiner_6B_100 | ru |
NerDLModel | wikiner_6B_300 | ru |
NerDLModel | wikiner_840B_300 | ru |
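
The models listed above can also be loaded individually and combined in a custom pipeline. The following is a minimal sketch, not taken from the release itself: it assumes an active `SparkSession` named `spark` (as in `spark-shell`) with Spark NLP 2.4.4 or above on the classpath, and the sample sentence and the column names `document`, `token`, `lemma`, and `pos` are illustrative choices. The `NerDLModel` entries are left out because they also need a matching word embeddings stage, which this release note does not name.

```scala
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._

// Assumption: `spark` is an existing SparkSession (e.g. provided by spark-shell).
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Model names and the "ru" language code come from the Models table.
val lemmatizer = LemmatizerModel.pretrained("lemma", "ru")
  .setInputCols("token")
  .setOutputCol("lemma")

val posTagger = PerceptronModel.pretrained("pos_ud_gsd", "ru")
  .setInputCols("document", "token")
  .setOutputCol("pos")

val customPipeline = new Pipeline().setStages(Array(
  documentAssembler, tokenizer, lemmatizer, posTagger
))

// Illustrative input; any DataFrame with a "text" column would do.
val data = spark.createDataFrame(Seq(
  (1, "Это пример предложения на русском языке.")
)).toDF("id", "text")

val result = customPipeline.fit(data).transform(data)

// Show only the annotation strings, aliased so the two columns stay distinct.
result.selectExpr("lemma.result as lemmas", "pos.result as tags")
  .show(truncate = false)
```

Building the pipeline by hand is mainly useful when only some of the annotators are needed; the pre-trained pipelines listed in this release bundle these stages already.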
Pipelines:
Pipeline | Name | Language |
---|---|---|
Explain Document (Small) | explain_document_sm | ru |
Explain Document (Medium) | explain_document_md | ru |
Explain Document (Large) | explain_document_lg | ru |
Entity Recognizer (Small) | entity_recognizer_sm | ru |
Entity Recognizer (Medium) | entity_recognizer_md | ru |
Entity Recognizer (Large) | entity_recognizer_lg | ru |
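Example:

    import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
    import com.johnsnowlabs.nlp.SparkNLP

    // Check the installed Spark NLP version (2.4.4 or above is required).
    SparkNLP.version()

    // Download and load the small Russian "explain document" pipeline.
    val pipeline = PretrainedPipeline("explain_document_sm", lang = "ru")

    // `spark` is the active SparkSession, e.g. the one provided by spark-shell.
    val testData = spark.createDataFrame(Seq(
      (1, "Пик распространения коронавируса и вызываемой им болезни Covid-19 в Китае прошел, заявил в четверг агентству Синьхуа официальный представитель Госкомитета по гигиене и здравоохранению КНР Ми Фэн.")
    )).toDF("id", "text")

    // Run every stage of the pipeline and print the annotated DataFrame.
    val annotation = pipeline.transform(testData)
    annotation.show()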
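
The entity recognizer pipelines can be used in the same way. The sketch below is an illustration rather than part of the release: it assumes a running Spark session with Spark NLP 2.4.4 or above, the sample sentence is made up, and because the exact output column names of the pipeline are not stated here, it prints whatever keys `annotate` returns instead of assuming them.

```scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Pipeline name and language code are taken from the Pipelines table.
val nerPipeline = PretrainedPipeline("entity_recognizer_sm", lang = "ru")

// annotate runs the pipeline on a single string and returns
// a Map from output column name to the annotation results as strings.
val quick = nerPipeline.annotate("Ми Фэн выступил в Пекине в четверг.")

// Print each output column with its values; no column names are assumed.
quick.foreach { case (column, values) =>
  println(s"$column: ${values.mkString(", ")}")
}
```

For larger inputs, `transform` on a DataFrame (as in the example above) is the better fit; `annotate` is convenient for quick checks on single strings.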
Spark NLP:
- PUBLIC
- Last update: 12/03/2020
- Works with: Spark NLP 2.4.4 and above