Releases: princeton-nlp/SWE-agent
SWE-agent EnIGMA (0.7.0)
SWE-agent is SOTA on offensive cybersecurity
SWE-agent EnIGMA (Enhanced Interactive Generative Model Agent) is SOTA on offensive cybersecurity challenges, with a 3.3x improvement over previous agents on the NYU CTF challenge dataset. The EnIGMA project introduces multiple novelties that are available to all use cases of SWE-agent, such as Interactive Agent Tools and a Summarizer to handle long outputs.
Major additions
- Capability to run over CTF challenges
- Interactive Agent Tools, including gdb
- Summarizers to handle long outputs
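To illustrate why long outputs need handling at all, here is a minimal sketch of a much simpler stand-in technique: keeping only the head and tail of a long command output. This is not EnIGMA's Summarizer. The function name `shorten_output` and the default limit are hypothetical and chosen only for this example.

```python
def shorten_output(text: str, max_lines: int = 20) -> str:
    """Keep the first and last lines of a long output and mark the gap.

    Hypothetical helper for illustration; not the EnIGMA Summarizer.
    """
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    head = max_lines // 2          # lines kept from the start
    tail = max_lines - head        # lines kept from the end (always >= 1)
    omitted = len(lines) - max_lines
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

Truncation like this loses whatever sits in the middle of the output, which is presumably one reason a dedicated summarization step is useful for an agent.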
Smaller additions
- Add filemap command in the spirit of repomap by @samuela in #619
- Create config to run human eval style challenges by @ofirpress in #658
- Add claude 3.5 sonnet to models by @carlosejimenez in #601
- Enh: Warn if scrolling >= 3 times by @klieret in #626
- feat: support deepseek-coder LLM by @jcraftsman in #638
- Enh: Make timeout for agent commands configurable by @klieret in #674
- Add support for new gpt-4o-mini model by @ivan4722 in #693
- Groq Models Integration by @MohammedNagdy in #721
- Make log level configurable; add TRACE level by @klieret in #612
Fixes
- Compatibility with SWE-bench 2.0 by @klieret in #671
- ensure variables work in special command docstring by @forresty in #628
- Important fix: Catch CostLimitExceeded in retry because of format/block by @klieret in #682
- Fix: Handle empty traj in should_skip by @klieret in #616
- Fix for end-marker communicate: Exit status always 0/invalid by @klieret in #644
- Fix: Insufficient quoting of git commit message by @klieret in #646
- Fix nonsensical trajectory formatting for PRs by @klieret in #647
- Fix: sweunexpected keyword 'python_version' by @klieret in #692
- Fix: Use LONG_TIMEOUT for pre_install commands by @klieret in #695
- Fix: UnboundLocalError when catching decoding issue by @klieret in #709
- Also create empty patch files for completeness by @klieret in #725
- Fix: Raise ContextWindowExceeded instead of exit_cost by @klieret in #727
- Fix: Deal with non-utf8 encoded bytes in comm by @klieret in #731
- Fix: Handle spaces in repo names by @klieret in #734
- Fix: Ensure utils is part of package by @klieret in #742
- Fix: Submitting ' ' in human mode crashes container by @klieret in #749
- Fix: Block su as command by @klieret in #752
- Fix: SWE_AGENT_MODEL_MAX_RETRIES needs casting by @klieret in #757
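The last fix above reflects a general point: values read from environment variables are always strings, so a numeric setting has to be cast before it is used. A minimal sketch of that pattern follows. The helper name `read_int_env` and the default of 3 are hypothetical and not taken from SWE-agent's code.

```python
import os


def read_int_env(name: str, default: int) -> int:
    """Return an environment variable as an int, or the default if unset.

    Hypothetical helper for illustration; not SWE-agent's implementation.
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError as exc:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from exc


# The default of 3 is an arbitrary value chosen for this example.
max_retries = read_int_env("SWE_AGENT_MODEL_MAX_RETRIES", default=3)
```

Without the cast, a comparison such as `attempt < max_retries` would compare an int to a str and raise a TypeError in Python 3.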
New Contributors
🎉 @talorabr, @udiboy1209, @haoranxi, @NickNameInvalid, @rollingcoconut joined the team to build EnIGMA 🎉
- @carlosejimenez made their first contribution in #601
- @samefarrar made their first contribution in #606
- @hubstrauss made their first contribution in #625
- @samuela made their first contribution in #619
- @forresty made their first contribution in #628
- @jcraftsman made their first contribution in #638
- @ivan4722 made their first contribution in #693
- @JoshuaPurtell made their first contribution in #703
- @MohammedNagdy made their first contribution in #721
- @pdemro made their first contribution in #729
v0.6.1
This is (mostly) a patch release, in particular fixing several issues that had been introduced by the speed improvements of v0.6.0.
We also solve a bug where existing linter errors in a file left SWE-agent unable to edit (because of our lint-retry-loop).
Breaking changes
Improved
- Enh: Show commands when encountering timeout error by @klieret in #582
- Enh: Configuration option to show time in log by @klieret in #583
- Enh: Allow to configure LONG_TIMEOUT for SWEEnv by @klieret in #584
- Enh: Always write log to traj directory by @klieret in #588
Fixed
- fix docker.errors.NotFound by @klieret in #587
- Fix: Revert to full clone method when needed by @klieret in #589
- Fix: Refresh container_obj before querying status by @klieret in #590
- Fixed #571 - show message that model arg is ignored in case of using Azure OpenAI by @jank in #592
- Fix: Linting blocks for existing lint errors by @klieret in #593
- Fix: Process done marker not found in read with timeout by @klieret in #596
v0.6.0
What's Changed
We sped up SWE-agent by 2x (timed with GPT4o). This is mostly due to faster communication with the running processes inside of the Docker container and other container setup & installation related improvements. Here are a few relevant PRs:
- Switch to fast communicate and shallow clone by default by @klieret in #530
- Change: Only wait 1s for docker to start by @klieret in #541
- Feat: experimental shallow cloning by @klieret in #498
- Enh: Start from clone of python conda environment for speedup by @klieret in #548
- Enh: Use uv for editable install by default by @klieret in #547
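As a rough illustration of the shallow-clone idea mentioned in the PRs above, the sketch below builds a `git clone` command that fetches only the most recent commit via git's `--depth 1` option. The function names are hypothetical, and this is not SWE-agent's actual cloning code.

```python
import subprocess
from typing import List


def build_clone_command(repo_url: str, dest: str, shallow: bool = True) -> List[str]:
    """Build a git clone command, optionally limited to the latest commit."""
    cmd = ["git", "clone"]
    if shallow:
        # --depth 1 skips the commit history, which is what saves time.
        cmd += ["--depth", "1"]
    cmd += [repo_url, dest]
    return cmd


def clone(repo_url: str, dest: str, shallow: bool = True) -> None:
    """Run the clone; raises CalledProcessError if git fails."""
    subprocess.run(build_clone_command(repo_url, dest, shallow), check=True)
```

A shallow clone contains only the tip of the cloned branch, so a task that needs an older commit may still require a full clone.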
Fixed
- Web UI: Remove -n option to wait by @klieret in #487
- Web UI: Kill the Flask server on exit. by @kwight in #479
- Web UI: Avoid proxy errors on MacOS by @klieret in #506
- Ensure container_name is reset for non-persistent containers by @klieret in #463
- Fix: Do not allow persistent container with cache task imgs by @klieret in #551
Improved
- Improve scrolling behavior in web UI by @anishfish2 in #420
- Web UI: Render Markdown in agent feed messages. by @kwight in #486
- Enh: Remove redundant 'saved traj to X' messages by @klieret in #528
- Allow to disable config dump to log by @klieret in #537
- Resolve relative paths to demonstrations and commands by @klieret in #444
New Contributors
- @panozzaj made their first contribution in #476
- @kwight made their first contribution in #482
- @anishfish2 made their first contribution in #420
- @ofirpress made their first contribution in #489
- @milaiwi made their first contribution in #469
- @burnettk made their first contribution in #533
Full Changelog: v0.5.0...v0.6.0
v0.5.0
What's Changed
✨ The big news is our brand new documentation ✨
Secondly, @ollmer added a new flag --cache_task_images that will significantly speed up SWE-agent when running on the same environment/repository multiple times (no more waiting for cloning and installation!)
Breaking changes
- We have reformatted our codebase. If you create a PR based on a previous commit, make sure you install our pre-commit hook to avoid merge conflicts because of formatting. See our docs for more information.
- Remove direct imports in __init__.py (you can no longer use from sweagent import Agent) by @klieret in #436
Added
- Running the web UI is now supported when running swe-agent completely in docker
- Speed up evaluation by caching task environments as docker images by @ollmer in #317
Improved
- Add gpt-4o model by @raymyers in #344
- Web: Allow to specify commit hash by @klieret in #358
- Add default environment_setup config by @klieret in #351
- Enh: Suppress openai logging; improve formatting of stats by @klieret in #416
- Remove signal dependency by @klieret in #428
- Do not use select if running on Windows by @klieret in #429
- Use custom Config class to support env and keys.cfg (this allows passing keys as environment variables) by @klieret in #430
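The last item describes looking up keys in both the environment and keys.cfg. A minimal sketch of how such a lookup could work is shown below. The class name `KeyConfig`, the `KEY: 'value'` file format, and the choice to let environment variables take precedence are all assumptions made for illustration, not SWE-agent's actual implementation.

```python
import os
from pathlib import Path
from typing import Dict, Optional


class KeyConfig:
    """Look up a setting in the environment first, then in a keys file.

    Hypothetical sketch; precedence and file format are assumptions.
    """

    def __init__(self, keys_file: Optional[Path] = None):
        self._file_values: Dict[str, str] = {}
        if keys_file is not None and keys_file.is_file():
            for line in keys_file.read_text().splitlines():
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # skip blank lines and comments
                key, sep, value = line.partition(":")
                if sep:
                    # Drop surrounding whitespace and quotes from the value.
                    self._file_values[key.strip()] = value.strip().strip("'\"")

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        """Return the environment value if set, else the file value."""
        if key in os.environ:
            return os.environ[key]
        return self._file_values.get(key, default)
```

With a class like this, `KeyConfig(Path("keys.cfg")).get("EXAMPLE_KEY")` would return the value from the environment when the variable is exported and fall back to the file otherwise. `EXAMPLE_KEY` is a placeholder name.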
Fixes
- Web: Fix script_path input by @klieret in #334
- Fix: Don't print patch msg for exit_cost patch by @klieret in #343
- Fix: Do not request job control in bash by @klieret in #345
- Fix: --base_commit not used for gh urls by @klieret in #346
- Fix: Separate data path/traj dir cause exception by @klieret in #348
- Add docker-py lower bound by @klieret in #406
- Fix: IndexError when replaying incomplete trajectories by @klieret in #410
New Contributors
- @raymyers made their first contribution in #344
- @nims11 made their first contribution in #332
- @khangich made their first contribution in #274
- @ollmer made their first contribution in #317
Full Changelog: v0.4.0...v0.5.0
0.4.0 Web UI
What's Changed
We’re excited to launch the SWE-agent web UI! Specify a bug, press start and watch SWE-agent do the magic ✨
New Contributors
- @tam-ng0905 made their first contribution in #321
- @nonparibus made their first contribution in #310
- @RainRat made their first contribution in #320
Full Changelog: v0.3.0...v0.4.0
0.3.0
What's Changed
✨ Features
- Run SWE-agent in the cloud using GitHub Codespaces
- Add GPT4-turbo model by @zgrannan in #252
- feat: Amazon Bedrock support (Claude models) by @JGalego in #207
🐛 Fixes
- Better error handling for --open_pr by @klieret in #239
- Fixed a potential error by @DanjieTang in #242
- fix: TARGETARCH not set on some OS/docker setups by @mspronesti in #249
- Pass Python version to get_environment_yml by @waterson in #271
- Fix Together model validation error by @mikanfactory in #236
- Doc: Avoid invalid github token by @klieret in #292
❤️ New Contributors
- @DanjieTang made their first contribution in #242
- @zgrannan made their first contribution in #252
- @nfedyashev made their first contribution in #254
- @JGalego made their first contribution in #207
- @Borda made their first contribution in #256
- @waterson made their first contribution in #271
Full Changelog: v0.2.0...v0.3.0
v0.2.0
What's Changed
Added
- Allow to run on local repos (new flag: --repo_path) by @klieret in #193
- Patch files are now saved separately to a patch directory by @klieret in #126
- Allow to supply custom installation commands when running on gh issues or locally (--environment_setup) by @klieret in #153
- Allow to specify openapi base url in keys.cfg by @bvandorf in #118
Improved
- Improve error handling of docker issues by @klieret in #165
- Make github token fully optional by @klieret in #189
Fixed
New Contributors
- @bvandorf made their first contribution in #118
- @pre-commit-ci made their first contribution in #141
- @moresearch made their first contribution in #147
- @brandco made their first contribution in #155
- @YeonwooSung made their first contribution in #72
- @foragerr made their first contribution in #212
- @zhipengzuo made their first contribution in #210
- @mikanfactory made their first contribution in #218
- @mspronesti made their first contribution in #216
Full Changelog: v0.1.2...v0.2.0