
safearena/README.md at main · McGill-NLP/safearena · GitHub
SafeArena is a benchmark for assessing the harmful capabilities of web agents - safearena/README.md at main · McGill-NLP/safearena
Timeout error · Issue #11 · McGill-NLP/safearena · GitHub
Jun 27, 2025 · Hi there, thanks for the great work! When I was running evaluation I found a lot of TimeoutErrors like the following one. Do you have some intuition about why this is happening? If this …
Could the author share the AMI of the container used in SafeArena ...
Jun 15, 2025 · Hi, I noticed that SafeArena uses Docker containers that are different from those used in WebArena. Would it be possible to share the specific AMI or other setup details used for the …