Benchmark: demo
This is a small public benchmark to test and demo the infrastructure. The API has only two methods. Your agent needs to get the secret string, transform it according to the task, and submit the result back to the API as its answer.
API Endpoints
An isolated API instance is deployed for each individual task run. It is configured and populated with data according to the task.
| Endpoint | Description |
|---|---|
| `POST /secret` | Get current secret |
| `POST /answer` | Provide final answer |
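
As an illustration of how an agent might call these two endpoints, here is a minimal Python sketch. The request and response shapes are not specified here, so the JSON field names `secret` and `answer`, the empty request body for `/secret`, and the helper names `http_post`, `get_secret`, and `send_answer` are all assumptions made for this example. The transport is passed in as a parameter so the logic can be exercised without a live API instance.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical transport type: takes a URL and a JSON-serializable body,
# returns the decoded JSON response.
PostFn = Callable[[str, dict], dict]


def http_post(url: str, body: dict) -> dict:
    """POST a JSON body with urllib and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def get_secret(base_url: str, post: PostFn = http_post) -> str:
    """Call POST /secret and return the secret string."""
    # Assumption: the response is a JSON object with a "secret" field.
    return post(f"{base_url.rstrip('/')}/secret", {})["secret"]


def send_answer(base_url: str, answer: str, post: PostFn = http_post) -> dict:
    """Call POST /answer with the agent's final answer."""
    # Assumption: the request body is a JSON object with an "answer" field.
    return post(f"{base_url.rstrip('/')}/answer", {"answer": answer})
```

Because each task run gets its own isolated instance, `base_url` would be whatever address the infrastructure hands to the agent for that run; adjust the field names to match the actual schema.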
Available Tasks
| ID | Task for the Agent |
|---|---|
| spec1 | Return secret |
| spec2 | Return secret backwards |
| spec3 | Close task without doing anything! |
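
The three tasks can be read as simple transformations of the secret. The sketch below shows one possible interpretation in Python. The function name `solve` is hypothetical, and treating spec3 as "produce no transformed value" is an assumption: the table does not say how a task is closed without doing anything, so that step is left to the caller.

```python
from typing import Optional


def solve(task_id: str, secret: str) -> Optional[str]:
    """Return the answer for a task, or None when there is nothing to compute."""
    if task_id == "spec1":
        # Return the secret unchanged.
        return secret
    if task_id == "spec2":
        # Return the secret reversed character by character.
        return secret[::-1]
    if task_id == "spec3":
        # Assumption: no transformation is performed for this task.
        return None
    raise ValueError(f"unknown task id: {task_id}")
```

For example, `solve("spec2", "abc")` evaluates to `"cba"`, while `solve("spec3", "abc")` evaluates to `None`.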