This repository has been archived by the owner on Aug 12, 2021. It is now read-only.

New Russian models and pipelines pack

@maziyarpanahi released this 19 Mar 15:08 · 368 commits to master since this release · eb5e0bc

Russian Models and Pipelines

We are happy to announce pre-trained Russian models and pipelines for Spark NLP.

Models:

| Model | Name | Language |
|---|---|---|
| LemmatizerModel (Lemmatizer) | lemma | ru |
| PerceptronModel (POS UD) | pos_ud_gsd | ru |
| NerDLModel | wikiner_6B_100 | ru |
| NerDLModel | wikiner_6B_300 | ru |
| NerDLModel | wikiner_840B_300 | ru |
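
The models listed above can also be loaded one at a time and combined in a regular Spark ML pipeline. The following is a minimal sketch, not an official recipe: it assumes a `spark` session is already available (as in `spark-shell`), and the column names (`document`, `sentence`, `token`, `lemma`, `pos`), the value `lemmaPosPipeline`, and the sample sentence are illustrative choices rather than anything prescribed by this release.

```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

// Turn the raw "text" column into Spark NLP document annotations.
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split documents into sentences, then sentences into tokens.
val sentenceDetector = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")

// Pre-trained Russian lemmatizer ("lemma", "ru") from the table above.
val lemmatizer = LemmatizerModel.pretrained("lemma", "ru")
  .setInputCols("token")
  .setOutputCol("lemma")

// Pre-trained Russian POS tagger ("pos_ud_gsd", "ru") from the table above.
val posTagger = PerceptronModel.pretrained("pos_ud_gsd", "ru")
  .setInputCols("sentence", "token")
  .setOutputCol("pos")

// Name chosen for illustration only.
val lemmaPosPipeline = new Pipeline().setStages(Array(
  documentAssembler, sentenceDetector, tokenizer, lemmatizer, posTagger
))

// Assumes `spark` is an existing SparkSession.
val data = spark.createDataFrame(Seq(
  (1, "Пик распространения коронавируса в Китае прошел.")
)).toDF("id", "text")

val result = lemmaPosPipeline.fit(data).transform(data)

// Show only the lemma and POS tag strings.
result.select("lemma.result", "pos.result").show(false)
```

The `NerDLModel` entries can be loaded the same way with `NerDLModel.pretrained(...)`, but they additionally need a matching word-embeddings stage, which is why the pre-built pipelines are usually the simpler starting point.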

Pipelines:

| Pipeline | Name | Language |
|---|---|---|
| Explain Document (Small) | explain_document_sm | ru |
| Explain Document (Medium) | explain_document_md | ru |
| Explain Document (Large) | explain_document_lg | ru |
| Entity Recognizer (Small) | entity_recognizer_sm | ru |
| Entity Recognizer (Medium) | entity_recognizer_md | ru |
| Entity Recognizer (Large) | entity_recognizer_lg | ru |

Example:

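import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import com.johnsnowlabs.nlp.SparkNLP

// Print the installed Spark NLP version (these models require 2.4.4 or above).
println(SparkNLP.version())

// Download (or load from the local cache) the small Russian "explain document" pipeline.
val pipeline = PretrainedPipeline("explain_document_sm", lang="ru")

// `spark` is the active SparkSession, e.g. the one provided by spark-shell.
val testData = spark.createDataFrame(Seq(
(1, "Пик распространения коронавируса и вызываемой им болезни Covid-19 в Китае прошел, заявил в четверг агентству Синьхуа официальный представитель Госкомитета по гигиене и здравоохранению КНР Ми Фэн.")
)).toDF("id", "text")

// Run every stage of the pipeline over the input DataFrame.
val annotation = pipeline.transform(testData)

annotation.show()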
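
For quick experiments on a single string, `PretrainedPipeline` also offers `annotate`, which returns a plain map of output column names to string results instead of a DataFrame. The sketch below uses the small entity recognizer pipeline from the list above. The value names and the sample sentence are illustrative, and since the exact output columns are not listed in these notes, the code prints whatever keys the pipeline returns rather than assuming them.

```scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Load the small Russian entity recognizer pipeline.
val nerPipeline = PretrainedPipeline("entity_recognizer_sm", lang = "ru")

// annotate() runs the pipeline on one string and returns
// a map of output column name -> annotation results.
val nerResult: Map[String, Seq[String]] =
  nerPipeline.annotate("Ми Фэн является официальным представителем Госкомитета по гигиене и здравоохранению КНР.")

// Print every output column with its values, without assuming column names.
nerResult.foreach { case (column, values) =>
  println(s"$column: ${values.mkString(", ")}")
}
```

Calling `annotation.printSchema()` on the DataFrame produced by `transform` is a similar way to discover which columns a given pipeline emits.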

Spark NLP: PUBLIC

Last update: 12/03/2020

Works with: Spark NLP 2.4.4 and above