Commit

update boards_data and config.yaml
small-starriest committed Nov 10, 2024
1 parent 9a48825 commit de1cc60
Showing 61 changed files with 7,629 additions and 6,706 deletions.
42 changes: 42 additions & 0 deletions boards_data/acd/data_tasks/MAIR/default.jsonl

Large diffs are not rendered by default.

42 changes: 42 additions & 0 deletions boards_data/cd/data_tasks/MAIR/default.jsonl
75 changes: 37 additions & 38 deletions boards_data/da/data_tasks/Classification/default.jsonl
301 changes: 301 additions & 0 deletions boards_data/de/data_tasks/mynew/mynew.jsonl
105 changes: 52 additions & 53 deletions boards_data/en-x/data_tasks/BitextMining/default.jsonl

1 change: 1 addition & 0 deletions boards_data/en-x/data_tasks/mynew/default.jsonl
@@ -0,0 +1 @@
{"index":235,"Rank":1,"Model":"<a target=\"_blank\" style=\"text-decoration: underline\" href=\"https:\/\/huggingface.co\/nvidia\/NV-Embed-v2\">NV-Embed-v2<\/a>","1":2,"!!!":99999}
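The row added above stores the model as an HTML anchor inside a JSON string. As a rough sketch of how such a row could be read back, the snippet below parses it with the standard library only. `parse_row` is a hypothetical helper, not something taken from the repository, and treating every key other than `index`, `Rank`, and `Model` as a score column is an assumption.

```python
import json
import re

# The row added to boards_data/en-x/data_tasks/mynew/default.jsonl, copied verbatim.
ROW = r'{"index":235,"Rank":1,"Model":"<a target=\"_blank\" style=\"text-decoration: underline\" href=\"https:\/\/huggingface.co\/nvidia\/NV-Embed-v2\">NV-Embed-v2<\/a>","1":2,"!!!":99999}'

# Captures the href target and the link text of the first anchor tag.
ANCHOR = re.compile(r'<a\b[^>]*\bhref="([^"]*)"[^>]*>(.*?)</a>', re.S)


def parse_row(line):
    """Hypothetical helper: split one leaderboard row into name, URL, and scores."""
    record = json.loads(line)
    match = ANCHOR.search(record["Model"])
    if match:
        url, name = match.groups()
    else:
        # Fall back to the raw cell when it is not wrapped in an anchor.
        url, name = None, record["Model"]
    # Assumption: every remaining key is a score column.
    scores = {k: v for k, v in record.items() if k not in ("index", "Rank", "Model")}
    return {"rank": record["Rank"], "model": name, "url": url, "scores": scores}


parsed = parse_row(ROW)
print(parsed["model"], parsed["url"], parsed["scores"])
```

The column names `1` and `!!!` are carried through unchanged, since the diff gives no indication of what they mean.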
301 changes: 301 additions & 0 deletions boards_data/en-x/data_tasks/myold/default.jsonl

623 changes: 303 additions & 320 deletions boards_data/en/data_overall/default.jsonl
624 changes: 304 additions & 320 deletions boards_data/en/data_tasks/Classification/default.jsonl
611 changes: 297 additions & 314 deletions boards_data/en/data_tasks/Clustering/default.jsonl
627 changes: 305 additions & 322 deletions boards_data/en/data_tasks/PairClassification/default.jsonl
613 changes: 298 additions & 315 deletions boards_data/en/data_tasks/Reranking/default.jsonl
600 changes: 291 additions & 309 deletions boards_data/en/data_tasks/Retrieval/default.jsonl
627 changes: 306 additions & 321 deletions boards_data/en/data_tasks/STS/default.jsonl
603 changes: 293 additions & 310 deletions boards_data/en/data_tasks/Summarization/default.jsonl
42 changes: 42 additions & 0 deletions boards_data/fnc/data_tasks/MAIR/default.jsonl
112 changes: 55 additions & 57 deletions boards_data/fr/data_overall/default.jsonl
191 changes: 94 additions & 97 deletions boards_data/fr/data_tasks/Classification/default.jsonl
111 changes: 54 additions & 57 deletions boards_data/fr/data_tasks/Clustering/default.jsonl
179 changes: 88 additions & 91 deletions boards_data/fr/data_tasks/PairClassification/default.jsonl
177 changes: 87 additions & 90 deletions boards_data/fr/data_tasks/Reranking/default.jsonl
172 changes: 85 additions & 87 deletions boards_data/fr/data_tasks/Retrieval/default.jsonl
184 changes: 91 additions & 93 deletions boards_data/fr/data_tasks/STS/default.jsonl
169 changes: 83 additions & 86 deletions boards_data/fr/data_tasks/Summarization/default.jsonl
42 changes: 42 additions & 0 deletions boards_data/lg/data_tasks/MAIR/default.jsonl
42 changes: 42 additions & 0 deletions boards_data/mdc/data_tasks/MAIR/default.jsonl
75 changes: 37 additions & 38 deletions boards_data/no/data_tasks/Classification/default.jsonl
335 changes: 163 additions & 172 deletions boards_data/other-cls/data_tasks/Classification/default.jsonl
621 changes: 303 additions & 318 deletions boards_data/other-sts/data_tasks/STS/default.jsonl
41 changes: 41 additions & 0 deletions boards_data/ovr/data_tasks/MAIR/default.jsonl
110 changes: 54 additions & 56 deletions boards_data/pl/data_overall/default.jsonl
134 changes: 66 additions & 68 deletions boards_data/pl/data_tasks/Classification/default.jsonl
108 changes: 53 additions & 55 deletions boards_data/pl/data_tasks/Clustering/default.jsonl
116 changes: 57 additions & 59 deletions boards_data/pl/data_tasks/PairClassification/default.jsonl
106 changes: 52 additions & 54 deletions boards_data/pl/data_tasks/Retrieval/default.jsonl
122 changes: 60 additions & 62 deletions boards_data/pl/data_tasks/STS/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_overall/default.jsonl
85 changes: 42 additions & 43 deletions boards_data/ru/data_tasks/Classification/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_tasks/Clustering/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_tasks/MultilabelClassification/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_tasks/PairClassification/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_tasks/Reranking/default.jsonl
77 changes: 38 additions & 39 deletions boards_data/ru/data_tasks/Retrieval/default.jsonl
85 changes: 42 additions & 43 deletions boards_data/ru/data_tasks/STS/default.jsonl
75 changes: 37 additions & 38 deletions boards_data/se/data_tasks/Classification/default.jsonl
42 changes: 42 additions & 0 deletions boards_data/wb/data_tasks/MAIR/default.jsonl
615 changes: 300 additions & 315 deletions boards_data/zh/data_overall/default.jsonl
612 changes: 299 additions & 313 deletions boards_data/zh/data_tasks/Classification/default.jsonl
603 changes: 295 additions & 308 deletions boards_data/zh/data_tasks/Clustering/default.jsonl
607 changes: 297 additions & 310 deletions boards_data/zh/data_tasks/PairClassification/default.jsonl
613 changes: 300 additions & 313 deletions boards_data/zh/data_tasks/Reranking/default.jsonl
613 changes: 299 additions & 314 deletions boards_data/zh/data_tasks/Retrieval/default.jsonl
612 changes: 299 additions & 313 deletions boards_data/zh/data_tasks/STS/default.jsonl

327 changes: 327 additions & 0 deletions config.yaml
@@ -53,7 +53,334 @@ tasks:
    metric: "p-MRR"
    metric_description: "paired mean reciprocal rank (p-MRR)"
    task_description: "Retrieval w/Instructions is the task of finding relevant documents for a query that has detailed instructions."
  MAIR:
    icon: "🍋"
    metric: "NDCG@10"
    metric_description: "NDCG@10"
    task_description: "MAIR: A Massive Benchmark for Evaluating Instructed Retrieval. Evaluate your retrieval models on 126 diverse tasks [EMNLP 2024]. https://github.com/sunnweiwei/MAIR"
boards:
  ovr:
    title: "Overall"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - Competition-Math
        - ProofWiki_Proof
        - ProofWiki_Reference
        - Stacks_Proof
        - Stacks_Reference
        - Stein_Proof
        - Stein_Reference
        - Trench_Proof
        - Trench_Reference
        - TAD
        - TAS2
        - StackMathQA
        - SciDocs
        - SciFact
        - LitSearch
        - FairRanking_2020
        - APPS
        - CodeEditSearch
        - CodeSearchNet
        - Conala
        - HumanEval-X
        - LeetCode
        - MBPP
        - RepoBench
        - TLDR
        - SWE-Bench-Lite
        - FoodAPI
        - HuggingfaceAPI
        - PytorchAPI
        - SpotifyAPI
        - TMDB
        - TensorAPI
        - ToolBench
        - WeatherAPI
        - AY2
        - ELI5
        - Fever
        - TREx
        - WnCw
        - WnWi
        - WoW
        - zsRE
        - ArguAna
        - CQADupStack
        - Quora
        - TopiOCQA
        - Touche
        - ACORDAR
        - CPCD
        - ChroniclingAmericaQA
        - NTCIR
        - PointRec
        - ProCIS-Dialog
        - ProCIS-Turn
        - QuanTemp
        - WebTableSearch
        - MISeD
        - SParC
        - SParC-SQL
        - Spider
        - Spider-SQL
        - CAsT_2019
        - CAsT_2020
        - CAsT_2021
        - CAsT_2022
        - Core_2017
        - Microblog_2011
        - Microblog_2012
        - Microblog_2013
        - Microblog_2014
        - DD_2015
        - DD_2016
        - DD_2017
        - FairRanking_2021
        - FairRanking_2022
        - NeuCLIR-Tech_2023
        - NeuCLIR_2022
        - NeuCLIR_2023
        - ProductSearch_2023
        - ToT_2023
        - ToT_2024
        - ExcluIR
        - Core17
        - News21
        - Robust04
        - InstructIR
        - NevIR
        - IFEval
        - AILA2019-Case
        - AILA2019-Statutes
        - BSARD
        - BillSum
        - CUAD
        - GerDaLIR
        - LeCaRDv2
        - LegalQuAD
        - REGIR-EU2UK
        - REGIR-UK2EU
        - TREC-Legal_2011
        - NFCorpus
        - Trec-Covid
        - Monant
        - CARE
        - PrecisionMedicine_2017
        - PrecisionMedicine_2018
        - PrecisionMedicine_2019
        - PrecisionMedicine-Article_2019
        - PrecisionMedicine-Article_2020
        - CliniDS_2014
        - CliniDS_2015
        - CliniDS_2016
        - ClinicalTrials_2021
        - ClinicalTrials_2022
        - ClinicalTrials_2023
        - Genomics-AdHoc_2004
        - Genomics-AdHoc_2005
        - Genomics-AdHoc_2006
        - Genomics-AdHoc_2007
        - Apple
        - ConvFinQA
        - FinQA
        - FinanceBench
        - HC3Finance
        - TAT-DQA
        - Trade-the-event
        - FiQA
  wb:
    title: "Web"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - AY2
        - ELI5
        - Fever
        - TREx
        - WnCw
        - WnWi
        - WoW
        - zsRE
        - ArguAna
        - CQADupStack
        - Quora
        - TopiOCQA
        - Touche
        - ACORDAR
        - CPCD
        - ChroniclingAmericaQA
        - NTCIR
        - PointRec
        - ProCIS-Dialog
        - ProCIS-Turn
        - QuanTemp
        - WebTableSearch
        - MISeD
        - SParC
        - SParC-SQL
        - Spider
        - Spider-SQL
        - CAsT_2019
        - CAsT_2020
        - CAsT_2021
        - CAsT_2022
        - Core_2017
        - Microblog_2011
        - Microblog_2012
        - Microblog_2013
        - Microblog_2014
        - DD_2015
        - DD_2016
        - DD_2017
        - FairRanking_2021
        - FairRanking_2022
        - NeuCLIR-Tech_2023
        - NeuCLIR_2022
        - NeuCLIR_2023
        - ProductSearch_2023
        - ToT_2023
        - ToT_2024
        - ExcluIR
        - Core17
        - News21
        - Robust04
        - InstructIR
        - NevIR
        - IFEval
  acd:
    title: "Academic"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - Competition-Math
        - ProofWiki_Proof
        - ProofWiki_Reference
        - Stacks_Proof
        - Stacks_Reference
        - Stein_Proof
        - Stein_Reference
        - Trench_Proof
        - Trench_Reference
        - TAD
        - TAS2
        - StackMathQA
        - SciDocs
        - SciFact
        - LitSearch
        - FairRanking_2020
  lg:
    title: "Legal"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - AILA2019-Case
        - AILA2019-Statutes
        - BSARD
        - BillSum
        - CUAD
        - GerDaLIR
        - LeCaRDv2
        - LegalQuAD
        - REGIR-EU2UK
        - REGIR-UK2EU
        - TREC-Legal_2011
  mdc:
    title: "Medical"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - NFCorpus
        - Trec-Covid
        - Monant
        - CARE
        - PrecisionMedicine_2017
        - PrecisionMedicine_2018
        - PrecisionMedicine_2019
        - PrecisionMedicine-Article_2019
        - PrecisionMedicine-Article_2020
        - CliniDS_2014
        - CliniDS_2015
        - CliniDS_2016
        - ClinicalTrials_2021
        - ClinicalTrials_2022
        - ClinicalTrials_2023
        - Genomics-AdHoc_2004
        - Genomics-AdHoc_2005
        - Genomics-AdHoc_2006
        - Genomics-AdHoc_2007
  fnc:
    title: "Finance"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - Apple
        - ConvFinQA
        - FinQA
        - FinanceBench
        - HC3Finance
        - TAT-DQA
        - Trade-the-event
        - FiQA
  cd:
    title: "Code"
    language_long: "English"
    has_overall: false
    acronym: null
    icon: null
    special_icons: null
    credits: "https://arxiv.org/abs/2410.10127"
    tasks:
      MAIR:
        - APPS
        - CodeEditSearch
        - CodeSearchNet
        - Conala
        - HumanEval-X
        - LeetCode
        - MBPP
        - RepoBench
        - TLDR
        - SWE-Bench-Lite
        - FoodAPI
        - HuggingfaceAPI
        - PytorchAPI
        - SpotifyAPI
        - TMDB
        - TensorAPI
        - ToolBench
        - WeatherAPI
  en:
    title: English
    language_long: "English"
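
Judging from the hunk above, the `ovr` board appears to list the tasks of the six domain boards back to back (`acd`, `cd`, `wb`, `lg`, `mdc`, `fnc`): 16 + 18 + 54 + 11 + 19 + 8 = 126 entries, which matches the 126 tasks named in the `MAIR` description. Because the lists are duplicated by hand, a small check along the following lines could catch drift between them. This is only a sketch: `check_overall_board` is a hypothetical helper that works on an already parsed `boards` mapping, and the sample data is a deliberately abbreviated subset of the real task names with one mismatch planted for illustration.

```python
from collections import Counter


def check_overall_board(boards, overall_key="ovr", task_type="MAIR"):
    """Hypothetical check: compare the overall board with the other boards' tasks."""
    overall = boards[overall_key]["tasks"][task_type]
    # Collect the tasks of every other board that defines this task type.
    domain_tasks = [
        task
        for key, board in boards.items()
        if key != overall_key and task_type in board.get("tasks", {})
        for task in board["tasks"][task_type]
    ]
    overall_counts, domain_counts = Counter(overall), Counter(domain_tasks)
    return {
        "missing_from_overall": sorted((domain_counts - overall_counts).elements()),
        "missing_from_domains": sorted((overall_counts - domain_counts).elements()),
        "duplicates_in_overall": sorted(t for t, n in overall_counts.items() if n > 1),
    }


# Abbreviated stand-in for the parsed `boards` section (not the full config).
# "FinQA" is intentionally left out of "ovr" to show what a mismatch looks like.
boards = {
    "ovr": {"tasks": {"MAIR": ["TAD", "SciFact", "APPS", "MBPP", "FiQA"]}},
    "acd": {"tasks": {"MAIR": ["TAD", "SciFact"]}},
    "cd": {"tasks": {"MAIR": ["APPS", "MBPP"]}},
    "fnc": {"tasks": {"MAIR": ["FiQA", "FinQA"]}},
}

report = check_overall_board(boards)
print(report)
```

Boards that carry no `MAIR` entry, such as `en`, are skipped by the `task_type in board.get("tasks", {})` filter, so the same function could be pointed at the whole `boards` mapping once the YAML has been loaded.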