25 Sep 14:45

klieret

dc18a74

Latest

SWE-agent is SOTA on offensive cybersecurity

SWE-agent EnIGMA (Enhanced Interactive Generative Model Agent) is SOTA on offensive cybersecurity challenges, with a 3.3x improvement over previous agents on the NYU CTF challenge dataset. The EnIGMA project introduces multiple novelties that are available to all use cases of SWE-agent, such as Interactive Agent Tools and a Summarizer to handle long outputs.

Major additions

Capability to run over CTF challenges
Interactive Agent Tools, including gdb
Summarizers to handle long outputs
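
To illustrate why long outputs need handling at all, here is a minimal sketch of the simplest possible strategy: keep the head and tail of a command's output and elide the middle. This is not the EnIGMA summarizer. The function name summarize_output and the max_lines threshold are hypothetical and chosen only for this example.

```python
def summarize_output(output: str, max_lines: int = 20) -> str:
    """Keep the first and last lines of a long output and elide the middle.

    Hypothetical helper for illustration only; not SWE-agent's summarizer.
    """
    if max_lines < 2:
        raise ValueError("max_lines must be at least 2")
    lines = output.splitlines()
    if len(lines) <= max_lines:
        # Short outputs are passed through unchanged.
        return output
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(lines) - max_lines
    marker = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

A head/tail cut like this is cheap, but it can discard the one line that matters, which is presumably why a dedicated summarizer is worth having.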

Smaller additions

Add filemap command in the spirit of repomap by @samuela in #619
Create config to run human eval style challenges by @ofirpress in #658
Add claude 3.5 sonnet to models by @carlosejimenez in #601
Enh: Warn if scrolling >= 3 times by @klieret in #626
feat: support deepseek-coder LLM by @jcraftsman in #638
Enh: Make timeout for agent commands configurable by @klieret in #674
Add support for new gpt-4o-mini model by @ivan4722 in #693
Groq Models Integration by @MohammedNagdy in #721
Make log level configurable; add TRACE level by @klieret in #612

Fixes

Compatibility with SWE-bench 2.0 by @klieret in #671
ensure variables work in special command docstring by @forresty in #628
Important fix: Catch CostLimitExceeded in retry because of format/block by @klieret in #682
Fix: Handle empty traj in should_skip by @klieret in #616
Fix for end-marker communicate: Exit status always 0/invalid by @klieret in #644
Fix: Insufficient quoting of git commit message by @klieret in #646
Fix nonsensical trajectory formatting for PRs by @klieret in #647
Fix: sweunexpected keyword 'python_version' by @klieret in #692
Fix: Use LONG_TIMEOUT for pre_install commands by @klieret in #695
Fix: UnboundLocalError when catching decoding issue by @klieret in #709
Also create empty patch files for completeness by @klieret in #725
Fix: Raise ContextWindowExceeded instead of exit_cost by @klieret in #727
Fix: Deal with non-utf8 encoded bytes in comm by @klieret in #731
Fix: Handle spaces in repo names by @klieret in #734
Fix: Ensure utils is part of package by @klieret in #742
Fix: Submitting ' ' in human mode crashes container by @klieret in #749
Fix: Block su as command by @klieret in #752
Fix: SWE_AGENT_MODEL_MAX_RETRIES needs casting by @klieret in #757
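
Two of the fixes above (quoting of git commit messages, spaces in repo names) belong to the same class of bug: text interpolated into a shell command without quoting. The following is a general sketch of the standard remedy using Python's shlex.quote. It is not the code from #646 or #734, and the helper names are made up for this example.

```python
import shlex


def build_commit_command(message: str) -> str:
    """Return a shell command that commits with an arbitrary message.

    Hypothetical helper; shlex.quote makes the message a single shell word.
    """
    return f"git commit -m {shlex.quote(message)}"


def build_cd_command(repo_dir: str) -> str:
    """Return a shell command that changes into a directory whose name
    may contain spaces (hypothetical helper)."""
    return f"cd {shlex.quote(repo_dir)}"
```

For example, build_cd_command("my repo") yields cd 'my repo', so the shell sees one argument instead of two.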

New Contributors

🎉 @talorabr, @udiboy1209, @haoranxi, @NickNameInvalid, @rollingcoconut joined the team to build EnIGMA 🎉

@carlosejimenez made their first contribution in #601
@samefarrar made their first contribution in #606
@hubstrauss made their first contribution in #625
@samuela made their first contribution in #619
@forresty made their first contribution in #628
@jcraftsman made their first contribution in #638
@ivan4722 made their first contribution in #693
@JoshuaPurtell made their first contribution in #703
@MohammedNagdy made their first contribution in #721
@pdemro made their first contribution in #729

Contributors

forresty, samuela, and 15 other contributors

20 Jun 15:21

klieret

v0.6.1

5f42dc8

v0.6.1

This is (mostly) a patch release, in particular fixing several issues that had been introduced by the speed improvements of v0.6.0.
We also fix a bug where pre-existing linter errors in a file left SWE-agent unable to edit that file (because of our lint-retry loop).

Breaking changes

Change: sparse clone method is now correctly called "shallow" by @klieret in #591

Improved

Enh: Show commands when encountering timeout error by @klieret in #582
Enh: Configuration option to show time in log by @klieret in #583
Enh: Allow to configure LONG_TIMEOUT for SWEEnv by @klieret in #584
Enh: Always write log to traj directory by @klieret in #588

Fixed

fix docker.errors.NotFound by @klieret in #587
Fix: Revert to full clone method when needed by @klieret in #589
Fix: Refresh container_obj before querying status by @klieret in #590
Fixed #571 - show message that model arg is ignored in case of using Azure OpenAI by @jank in #592
Fix: Linting blocks for existing lint errors by @klieret in #593
Fix: Process done marker not found in read with timeout by @klieret in #596

Contributors

jank and klieret

05 Jun 13:16

klieret

v0.6.0

14a5189

v0.6.0

What's Changed

We sped up SWE-agent by 2x (timed with GPT4o). This is mostly due to faster communication with the running processes inside of the Docker container and other container setup & installation related improvements. Here are a few relevant PRs:

Switch to fast communicate and shallow clone by default by @klieret in #530
Change: Only wait 1s for docker to start by @klieret in #541
Feat: experimental shallow cloning by @klieret in #498
Enh: Start from clone of python conda environment for speedup by @klieret in #548
Enh: Use uv for editable install by default by @klieret in #547
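
As a rough illustration of the shallow cloning mentioned above, the sketch below builds a git clone command that fetches only the most recent commit via git's --depth option. This is an assumption-laden example, not SWE-agent's implementation; the function names are hypothetical.

```python
import subprocess
from typing import List, Optional


def shallow_clone_command(
    repo_url: str, dest: str, branch: Optional[str] = None
) -> List[str]:
    """Build a `git clone` invocation that fetches history of depth 1.

    Hypothetical helper for illustration only.
    """
    cmd = ["git", "clone", "--depth", "1"]
    if branch is not None:
        cmd += ["--branch", branch]
    cmd += [repo_url, dest]
    return cmd


def shallow_clone(repo_url: str, dest: str, branch: Optional[str] = None) -> None:
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(repo_url, dest, branch), check=True)
```

A shallow clone transfers far less data than a full one, but it lacks older history, so checking out an arbitrary earlier commit may require further fetching or a full clone.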

Fixed

Web UI: Remove -n option to wait by @klieret in #487
Web UI: Kill the Flask server on exit. by @kwight in #479
Web UI: Avoid proxy errors on MacOS by @klieret in #506
Ensure container_name is reset for non-persistent containers by @klieret in #463
Fix: Do not allow persistent container with cache task imgs by @klieret in #551

Improved

Improve scrolling behavior in web UI by @anishfish2 in #420
Web UI: Render Markdown in agent feed messages. by @kwight in #486
Enh: Remove redundant 'saved traj to X' messages by @klieret in #528
Allow to disable config dump to log by @klieret in #537
Resolve relative paths to demonstrations and commands by @klieret in #444

New Contributors

@panozzaj made their first contribution in #476
@kwight made their first contribution in #482
@anishfish2 made their first contribution in #420
@ofirpress made their first contribution in #489
@milaiwi made their first contribution in #469
@burnettk made their first contribution in #533

Full Changelog: v0.5.0...v0.6.0

Contributors

burnettk, panozzaj, and 5 other contributors

28 May 17:14

klieret

v0.5.0

c8e8ba6

v0.5.0

What's Changed

✨ The big news is our brand new documentation ✨

Secondly, @ollmer added a new flag --cache_task_images that will significantly speed up SWE-agent when running on the same environment/repository multiple times (no more waiting for cloning and installation!)

Breaking changes

We have reformatted our codebase. If you create a PR based on a previous commit, make sure you install our pre-commit hook to avoid merge-conflicts because of formatting. See our docs for more information.
Remove direct imports in __init__.py (you can no longer use from sweagent import Agent) by @klieret in #436

Added

Running the web UI is now supported when running swe-agent completely in docker
Speed up evaluation by caching task environments as docker images by @ollmer in #317
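
One way to picture image caching of this kind: derive a deterministic image tag from everything that determines the environment, then reuse an existing image whenever the tag matches. The sketch below shows only the tag derivation. It is not the scheme used in #317; the function name, the tag prefix, and the choice of inputs are all assumptions made for illustration.

```python
import hashlib


def cached_image_tag(repo: str, base_commit: str, environment_setup: str = "") -> str:
    """Derive a stable image tag from the inputs that define a task environment.

    Hypothetical helper; identical inputs always map to the same tag.
    """
    key = "\n".join([repo, base_commit, environment_setup])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"swe-agent-task-env-{digest}"
```

With a tag like this, a second run on the same repository and commit could look up the image by name and skip cloning and installation.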

Improved

Add gpt-4o model by @raymyers in #344
Web: Allow to specify commit hash by @klieret in #358
Add default environment_setup config by @klieret in #351
Enh: Suppress openai logging; improve formatting of stats by @klieret in #416
Remove signal dependency by @klieret in #428
Do not use select if running on Windows by @klieret in #429
Use custom Config class to support env and keys.cfg (this allows passing keys as environment variables) by @klieret in #430
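
The idea behind a config class that supports both environment variables and keys.cfg can be sketched as a two-level lookup. The class below is a simplified stand-in, not the class added in #430. The KEY=value file format and the rule that environment variables take precedence over the file are assumptions made for this example.

```python
import os
from pathlib import Path
from typing import Dict, Optional


class Config:
    """Look up settings in the environment first, then in a keys file.

    Simplified, hypothetical stand-in for illustration only.
    """

    def __init__(self, keys_file: Optional[Path] = None) -> None:
        self._file_values: Dict[str, str] = {}
        if keys_file is not None and keys_file.is_file():
            self._file_values = self._parse(keys_file.read_text())

    @staticmethod
    def _parse(text: str) -> Dict[str, str]:
        """Parse KEY=value lines, skipping blanks and # comments."""
        values: Dict[str, str] = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
        return values

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        """Return the environment value if set, else the file value, else default."""
        if key in os.environ:
            return os.environ[key]
        return self._file_values.get(key, default)
```

Usage would look like Config(Path("keys.cfg")).get("OPENAI_API_KEY"), which works the same whether the key was exported in the shell or written to the file.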

Fixes

Web: Fix script_path input by @klieret in #334
Fix: Don't print patch msg for exit_cost patch by @klieret in #343
Fix: Do not request job control in bash by @klieret in #345
Fix: --base_commit not used for gh urls by @klieret in #346
Fix: Separate data path/traj dir cause exception by @klieret in #348
Add docker-py lower bound by @klieret in #406
Fix: IndexError when replaying incomplete trajectories by @klieret in #410

New Contributors

@raymyers made their first contribution in #344
@nims11 made their first contribution in #332
@khangich made their first contribution in #274
@ollmer made their first contribution in #317

Full Changelog: v0.4.0...v0.5.0

Contributors

raymyers, nims11, and 3 other contributors

09 May 14:58

klieret

v0.4.0

1e065f8

0.4.0 Web UI

What's Changed

We’re excited to launch the SWE-agent web UI! Specify a bug, press start and watch SWE-agent do the magic ✨

New Contributors

@tam-ng0905 made their first contribution in #321
@nonparibus made their first contribution in #310
@RainRat made their first contribution in #320

Full Changelog: v0.3.0...v0.4.0

Contributors

vilkinsons, RainRat, and tamdogood

02 May 15:47

klieret

v0.3.0

43b8de5

0.3.0

What's Changed

✨ Features

Run SWE-agent in the cloud using GitHub Codespaces
Add GPT4-turbo model by @zgrannan in #252
feat: Amazon Bedrock support (Claude models) by @JGalego in #207

🐛 Fixes

Better error handling for --open_pr by @klieret in #239
Fixed a potential error by @DanjieTang in #242
fix: TARGETARCH not set on some OS/docker setups by @mspronesti in #249
Pass Python version to get_environment_yml by @waterson in #271
Fix Together model validation error by @mikanfactory in #236
Doc: Avoid invalid github token by @klieret in #292

❤️ New Contributors

@DanjieTang made their first contribution in #242
@zgrannan made their first contribution in #252
@nfedyashev made their first contribution in #254
@JGalego made their first contribution in #207
@Borda made their first contribution in #256
@waterson made their first contribution in #271

Full Changelog: v0.2.0...v0.3.0

Contributors

nfedyashev, waterson, and 7 other contributors

15 Apr 19:01

klieret

v0.2.0

58aa046

v0.2.0

What's Changed

Added

Allow to run on local repos (new flag: --repo_path) by @klieret in #193
Patch files are now saved separately to a patch directory by @klieret in #126
Allow to supply custom installation commands when running on gh issues or locally (--environment_setup) by @klieret in #153
Allow to specify OpenAI API base URL in keys.cfg by @bvandorf in #118

Improved

Improve error handling of docker issues by @klieret in #165
Make github token fully optional by @klieret in #189

Fixed

Fix opening PR from fork by @klieret in #229
Fix: Choosing TogetherAI models by @klieret in #130

New Contributors

@bvandorf made their first contribution in #118
@pre-commit-ci made their first contribution in #141
@moresearch made their first contribution in #147
@brandco made their first contribution in #155
@YeonwooSung made their first contribution in #72
@foragerr made their first contribution in #212
@zhipengzuo made their first contribution in #210
@mikanfactory made their first contribution in #218
@mspronesti made their first contribution in #216

Full Changelog: v0.1.2...v0.2.0

Contributors

mikanfactory, brandco, and 8 other contributors

Releases: princeton-nlp/SWE-agent

SWE-agent EnIGMA (0.7.0)
