This repository has been archived by the owner on Oct 8, 2020. It is now read-only.

15 Jan 21:36

GezimSejdiu

SANSA 0.7.1 Latest

Dependency Changes

Apache Spark 2.4.3 → 2.4.4
Apache Flink 1.8.0 → 1.9.1
Apache Jena 3.11.0 → 3.13.1


02 Jul 10:57

GezimSejdiu

SANSA ML 0.6.0

Features

Spark
- Add Coveralls integration
- Further improvement of unit test coverage
- #12 Refactor vandalism detection package
- #20 Align with RDF layer

Bug Fixes

#13 Classes with names that only differ in casing
#15 Hard-coded spatial partitioning value (DBSCAN)
#17 GeoSpark scope removed

Dependency Changes

Apache Spark 2.4.0 → 2.4.3
Apache Flink 1.7.0 → 1.8.0
Apache Jena 3.9.0 → 3.11.0


12 Dec 13:14

GezimSejdiu

SANSA ML 0.5.0

Features

Spark
- Numerical outlier detection
- RDF Graph Kernels (Alpha)
- Update: Knowledge graph embedding support (pre Alpha)
- Decision trees (pre Alpha)
- Update: Clustering
  - Power iteration clustering
  - DBSCAN (experimental)
  - Unified interface for all clustering algorithms

Dependency Changes

Apache Spark 2.4.0
Apache Flink 1.7.0
Apache Jena 3.9.0


26 Jun 14:29

GezimSejdiu

SANSA 0.4.0

Features

Spark
- New: Numerical outlier detection (Beta status)
- New: RDF Graph Kernels (Alpha)
- Update: Knowledge graph embedding support (pre Alpha)
- Update: Decision trees (pre Alpha)
- Update: Clustering (Beta status)
- Update: Semantic similarity measures
- Update: Vandalism detection in Wikidata (Beta status)

Dependency changes

Apache Spark 2.3.1
Apache Flink 1.5.0
Apache Jena 3.7.0


14 Dec 22:23

GezimSejdiu

SANSA-ML 0.3.0

Features

- Updated: Rule mining algorithm for RDF graphs based on AMIE+ further developed (still beta status)
- Updated: Semantic similarity measures, defined as a function of the common and distinctive features of different entities. The following measures are implemented:
  - Jaccard similarity
  - Rodríguez and Egenhofer similarity
  - Tversky Ratio Model
  - Batet similarity
- Updated: Clustering algorithms further extended and evaluated (experimental):
  - Silvia Link Clustering
  - Border Flow (extended for RDF)
  - Power Iteration Clustering (extended for RDF)
  - Modularity Clustering
- New: Anomaly detection (beta status)
- New: Vandalism detection (beta status)
- Knowledge graph embedding approaches integrated into the SANSA core: TransE (beta status), DistMult (beta status)
- In progress: Terminological Decision Trees for the classification of concepts
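
The semantic similarity measures listed above compare entities by their common and distinctive features. As a rough illustration of that idea, here is a minimal Python sketch of two of them, Jaccard similarity and the Tversky ratio model, over plain feature sets. It is not SANSA's implementation or API: the function names, the `alpha`/`beta` weights, and the example feature sets are assumptions made for illustration only.

```python
def jaccard(a, b):
    """Jaccard similarity: shared features divided by all features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def tversky_ratio(a, b, alpha=1.0, beta=1.0):
    """Tversky ratio model: common features weighed against the
    distinctive features of each side (weights alpha and beta)."""
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Hypothetical feature sets for two entities (illustrative only).
berlin = {"City", "Capital", "locatedIn:Germany"}
bonn = {"City", "FormerCapital", "locatedIn:Germany"}

print(jaccard(berlin, bonn))                  # 2 shared out of 4 total features
print(tversky_ratio(berlin, bonn))            # alpha = beta = 1 matches Jaccard
print(tversky_ratio(berlin, bonn, 0.5, 0.5))  # lower weight on distinctive features
```

With `alpha = beta = 1` the Tversky ratio reduces to Jaccard similarity, which is why the first two calls agree; smaller weights reduce the penalty for distinctive features.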

Dependency changes

Scala 2.11.11
Apache Spark 2.2.1
Apache Flink 1.4.0
Apache Jena 3.5.0


13 Jun 12:48

HajiraJabeen

SANSA ML 0.2.0

Features

Spark
- Rule mining algorithm for RDF graphs based on AMIE+ further developed (still beta status)
- Distributed Tensor Factorisation (very experimental, not fully integrated)
- Several semantic similarity measures implemented (experimental)
- Power Iteration Clustering with custom similarity measures
- Border Flow Clustering
Flink
- RDF by Modularity Clustering algorithm, introduced by Newman (DOI: https://doi.org/10.1103/PhysRevE.69.066133) (beta status)

Dependency changes

Apache Spark 2.1.1
Apache Flink 1.3.0
Apache Jena 3.1.1


09 Dec 13:53

GezimSejdiu

SANSA ML 0.1.0

Features

Spark
- An RDF clustering algorithm based on an approach for undirected graphs maximizing a modularity function, which was first introduced by Newman (DOI: https://doi.org/10.1103/PhysRevE.69.066133) (beta status)
- A rule mining algorithm for RDF graphs based on AMIE+ (beta status)
Flink is not supported in this release
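
The clustering algorithm in this release maximizes a modularity function over an undirected graph. To make that quantity concrete, the following Python sketch computes the standard modularity score of a given partition. It is only an illustration under stated assumptions: it is not SANSA code, it evaluates a partition rather than searching for the best one, and the `modularity` function, the edge list, and the community labels are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping each node to its community label.
    Q = sum over communities of (fraction of edges inside the community)
        minus (fraction of edge ends attached to the community) squared.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # edge ends attached to the community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

print(modularity(edges, two_groups))  # positive: the split follows the triangles
print(modularity(edges, one_group))   # 0.0: a single community carries no structure
```

A modularity-maximizing clustering algorithm searches for the partition with the highest such score; the sketch only scores partitions that are handed to it.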
