awesome-reasoning

Adding reasoning to your AI? Take these resources; they may help you on your way.

Datasets

AGI/CAUSALITY/FORMAL GRAMMAR
DeepMind Chomsky Hierarchy	Problems crafted for FSM/PDA/TM	`[1]`
automata	a neurallambda tool to generate data from grammars	`[1]`
im a strange dataset	Tough for LLMs because of self-references.	`[1]`
DiagGSM8k	NL Reasoning Benchmark	`[1]`
CLadder	Causal reasoning	`[1]`
Cause-Effect Pairs	108 datasets of 2-variable dynamics (not NL)	`[1]`
MNLI Entailment	sentence parsing + entailment	`[1]`

AGENT/TOOL
THUDM AgentInstruct	long form dialogs	`[1]`
WANG AgentInstruct	GPT-3 synthesized instructions	`[1]`
KnowLM Tool	prompt + tool call + answer	`[1]`
Glaive Tool Usage	system prompt lists tools + prompt + answer	`[1]`
opentoolformer retrieval	prompt + tool call	`[1]`

CODE
rosetta	same program in many different languages	`[1]`
EvoEval Tool Use	100 prompt + code + tests	`[1]`

MATH/LOGIC
gsm8k	Grade School Math 8k	`[1]`
MetaMath	one-shot math	`[1]`
MetaMathFewShot	few-shot math	`[1]`
MathPile	9B tokens from filtered internet	`[1]`
LogiQA	NL multi choice, requires abstraction	`[1]`
Logic-LM	a model combining automated theorem provers and LLMs	`[1]`
Coq Facts	270k Coq theorem prover programs	`[1]`

NATURAL LANGUAGE
Nous Open Reasoning	community-contributed tasks	`[1]`
UltraInteract_sft	GPT-generated iterated reasoning dialogs	`[1]`
CoGnition	NL compositional generalization	`[1]`
Winogrande	ambiguous sentences, fill in 1 word	`[1]`
Winograd_wsc	ambiguous sentences, choose the right word	`[1]`
Contradiction	2 phrases, do they contradict	`[1]`
Recognizing Textual Entailment	2 phrases, do they entail each other	`[1]`
Textual Entailment Pool	more entailment	`[1]`
Answer Validation	2 phrases, does the answer solve question	`[1]`
Monotonicity Entailment	x is true, does y follow	`[1]`
entailment	passage, question -> T/F	`[1]`
Commonsense QA	multi choice QA	`[1]`
GLUE	several datasets	`[1]`
custom multi-hop	use Wikipedia's graph of articles
MUD videogames	(various; could serve as training data)
skunkworks/reasoning	wide variety of NL tasks	`[1]`
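
The "custom multi-hop" idea above has no linked implementation. One possible sketch, assuming the article graph is available as a plain dict of outgoing links: questions ask how many links separate two articles, and breadth-first search supplies the answer. The `LINKS` graph, the function names, and the prompt wording are all hypothetical placeholders, not taken from any listed resource.

```python
from collections import deque

# Hypothetical toy link graph; a real one would be built from Wikipedia's article links.
LINKS = {
    "Alan Turing": ["Turing machine", "Enigma machine"],
    "Turing machine": ["Automata theory"],
    "Enigma machine": ["Cryptography"],
    "Automata theory": ["Formal language"],
    "Cryptography": [],
    "Formal language": [],
}


def hop_distance(graph, source, target):
    """Breadth-first search: number of links on a shortest path, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def make_multihop_example(graph, source, target):
    """Build one prompt/answer pair (the dict format is an assumption)."""
    dist = hop_distance(graph, source, target)
    return {
        "prompt": f"How many links separate '{source}' from '{target}'?",
        "answer": "unreachable" if dist is None else str(dist),
    }


if __name__ == "__main__":
    print(make_multihop_example(LINKS, "Alan Turing", "Formal language"))
    print(make_multihop_example(LINKS, "Cryptography", "Alan Turing"))
```

Because links are directed, the reverse question can be unreachable even when the forward one is not, which gives a natural source of negative examples.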

TOY PROBLEMS
arc-like	1D visual puzzles, great seq reasoning	`[1]`
re-arc	2D reverse engineered ARC	`[1]`
ARC	competition	`[1]`
(misc)	xLSTM paper lists several in appendix	`[1]`
expand polynomials	algebraic expansion	`[Abstractor]`
linear eq	solve algebraic eqs	`[Abstractor]`
Match-To-Sample	cogsci test for relational reasoning	`[1] MLPs Learn In Context`
Oddball Detection	cogsci test for relational reasoning	`[1] MLPs Learn In Context`
regression	with in-context learning, good reasoning test	`[1] MLPs Learn In Context`
clustering	with in-context learning, good reasoning test	`[1] MLPs Learn In Context`
COGS	compositional generalization	`[1]`
SCAN	systematicity, "$x to the left"	`[1]` `[2]`
CLEVR	2D images of 3D shapes + natural language QA	`[1]` `[2]`
lambda calc + beta reductions	generator code, single+multistep	`[1]`
lichess-puzzles	chess puzzles	`[1]`
pointer net problems	convex hull, TSP, triangulation	`[1]`
Big Bench Hard	23 challenges (only 6k datapoints)	`[1]`
logical entailment dataset	logic strings by DeepMind	`[1]`
logical entailment dataset code	(generate it yourself)	`[1]`
FSM Game	generate strings according to grammar
Adaptive Grammar	grammar rule might change
String/Graph Rewriting		`string_rewriting.py`
LibraryOfLogic	generate NL from multiple games	`[1]`
AB-XY Game
word ladder
parser
longest common subsequence
string reversal
Wisconsin card sorting
anagram
palindrome
permutation composition
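
Several of the unlinked entries above (string reversal, palindrome, longest common subsequence, permutation composition) are small enough to generate on the fly. A minimal sketch, assuming a prompt/answer dict format; the function names, the task labels, and the four-letter alphabet are illustrative choices, not part of any listed resource.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def compose(p, q):
    """Compose permutations stored as tuples: result[i] = p[q[i]] (apply q, then p)."""
    return tuple(p[i] for i in q)


def random_word(rng, length, alphabet="abcd"):
    """Random string over a small alphabet so repeats and matches are common."""
    return "".join(rng.choice(alphabet) for _ in range(length))


def make_example(rng, task, length=6):
    """Return one {"task", "prompt", "answer"} dict for the named toy task."""
    if task == "reverse":
        s = random_word(rng, length)
        return {"task": task, "prompt": s, "answer": s[::-1]}
    if task == "palindrome":
        s = random_word(rng, length)
        if rng.random() < 0.5:
            s = s + s[::-1]  # force a palindrome about half the time
        return {"task": task, "prompt": s, "answer": str(s == s[::-1])}
    if task == "lcs":
        a, b = random_word(rng, length), random_word(rng, length)
        return {"task": task, "prompt": f"{a} {b}", "answer": str(lcs_length(a, b))}
    if task == "perm_compose":
        p = tuple(rng.sample(range(length), length))
        q = tuple(rng.sample(range(length), length))
        return {"task": task, "prompt": f"{p} o {q}", "answer": str(compose(p, q))}
    raise ValueError(f"unknown task: {task}")


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible samples
    for task in ("reverse", "palindrome", "lcs", "perm_compose"):
        print(make_example(rng, task))
```

Passing an explicit `random.Random` instance keeps generation reproducible, which helps when the same split has to be regenerated for evaluation.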

TOKEN AUGMENTED REASONING
Reasoning tokens	Self-Reasoning Tokens, teaching models to think ahead	`[1]`
Quiet-STaR	LLMs Can Teach Themselves to Think Before Speaking	`[1]`
