An OpenEnv benchmark testing the ability of AI agents to act as Site Reliability Engineers (SREs) by diagnosing and filtering raw production failure logs.
Updated Apr 8, 2026 - Python
AI-powered system for low-exposure route optimization using AQI, simulation, and intelligent decision-making.
Deterministic evaluation environment for AI code reviewers covering bugs, security (OWASP), and architecture via FastAPI + OpenEnv.
Gymnasium RL environment for AI-powered customer support triage — classify, prioritize, assign, and respond to emails under SLA pressure. Built for the OpenEnv spec.
AI research environment that simulates the end-to-end scientific discovery process, enabling agents to analyze papers, generate hypotheses, design experiments, and validate results collaboratively.
📧 Intelligent Agentic Workflow for Autonomous Enterprise Email Triage. Built with OpenEnv, featuring Chain-of-Thought reasoning and Self-Correcting agent logic for high-stakes corporate routing.
CyberRange is an advanced, self-improving simulated environment designed to train and benchmark autonomous security agents in complex enterprise incident response.
A rigorously formulated, non-stationary Partially Observable Markov Decision Process (POMDP) environment evaluating LLM crisis triage under real-world FEMA/ICS resource constraints. Features mathematical trajectory-aware reward shaping, NHPP disaster spawning, and 422→200 hallucination recovery.
An OpenEnv-compliant reinforcement learning environment designed to train and evaluate AI agents on real-world SQL debugging, performance tuning, and schema design.
High-fidelity Reinforcement Learning environment for smart grids. Features a custom DC Power Flow physics solver and real-world AT&C telemetry to train AI in power distribution and fault isolation.
Multi-zone disaster relief AI env for Meta PyTorch OpenEnv Hackathon. 4-stage pipeline: PyTorch ZoneScorerNet -> Triage -> Planner -> Action Agent. False SOS detection, cascading failures, airlift precision.
RunbookOps-caseop: Deterministic OpenEnv environment for SaaS incident triage, runbook-driven resolution, and agent evaluation.
A reinforcement learning agent that learns to intelligently shape electricity demand, reducing peak loads and optimizing energy consumption in real-time.
Execution-grounded SQL optimization OpenEnv. Agents rewrite slow SQL and get rewarded using real DuckDB timing + result correctness across 5 anti-pattern tasks.
OpenEnv Hackathon SF
An elite reasoning agent trained via GRPO to navigate high-stakes social conflicts. Built on OpenEnv to solve cascading scheduling chaos with human-centric judgment.
A production-grade OpenEnv environment for benchmarking RL agents on real-world data cleaning and schema engineering tasks.
🐛 Real-world GitHub issue triage environment for AI agent training — built on the OpenEnv spec with 3 difficulty-graded tasks, shaped rewards, and FastAPI server deployable to HuggingFace Spaces.
RegTriage is an OpenEnv RL environment that trains agents to perform regulatory compliance auditing on financial services contact center transcripts.