Conllu data - number of edge labels is different from Yamakata et al. 2020 #3

paolo-gajo · 2024-11-04T23:11:20Z

Hello, I was trying to rebuild the data directly from Yamakata et al.'s original data. While going through the gold training data and the dataloaders, I noticed that the parser training data (e.g. train.conllu) contains 17 distinct edge labels once loaded:

{'root': 18802, 't': 4863, 'o': 2191, 'd': 1564, 'f-eq': 742, 't-comp': 511, 'v-tm': 444, 'a': 434, 'f-part-of': 294, 'f-comp': 235, 'a-eq': 137, 't-eq': 135, 't-part-of': 115, 's': 78, 'v': 58, 'f-set': 8, '-': 2}

This seems inconsistent with the r_NE classes from Yamakata et al. 2020, which comprise 13 classes (14 when 'root' is used to mark the absence of an edge).
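For anyone who wants to reproduce a count like the one above, here is a minimal sketch. It assumes the file follows the standard 10-column, tab-separated CoNLL-U layout with the edge label in the DEPREL column (8th field); the repository's dataloader may read the labels differently. The function name `count_deprels` and the toy `sample` lines are hypothetical and are not taken from the repository or the dataset.

```python
from collections import Counter
from typing import Iterable


def count_deprels(lines: Iterable[str]) -> Counter:
    """Count DEPREL values (8th column, index 7) over CoNLL-U token lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip sentence separators (blank lines) and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: well-formed token lines have exactly 10 columns.
        if len(fields) != 10:
            continue
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if not fields[0].isdigit():
            continue
        counts[fields[7]] += 1
    return counts


# Hypothetical toy input, only to show the expected line shape.
sample = [
    "# text = Chop the onion",
    "1\tChop\t_\t_\t_\t_\t0\troot\t_\t_",
    "2\tthe\t_\t_\t_\t_\t0\troot\t_\t_",
    "3\tonion\t_\t_\t_\t_\t1\tt\t_\t_",
    "",
]
label_counts = count_deprels(sample)
print(dict(label_counts))
```

To run it on the real file, pass an open file handle instead of the toy list, e.g. `with open("train.conllu", encoding="utf-8") as f: count_deprels(f)`, and compare `len(label_counts)` against the expected number of classes.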

Could you confirm whether this is indeed a bug, or whether I am missing something? Thank you!
