bug: Parallel stepping crashes entire simulation when a single agent's step fails #220

## Bug Description

When using parallel agent stepping (`step_agents_parallel`, `step_agents_multithreaded`, or `do_async`), if any single agent's `astep()` or `step()` raises an exception, that exception propagates straight out of the stepping call, the work of **all** other agents is abandoned, and the entire simulation crashes.

This is particularly problematic for LLM-backed agents where transient failures are expected and common (rate limits, timeouts, malformed JSON responses, network errors, etc.). A single flaky LLM response should not kill a 100-agent simulation.

## Environment

- **mesa-llm**: v0.3.0 (commit `a8161c7`)
- **mesa**: 3.5.0
- **Python**: 3.12

## Root Cause

Three parallel execution paths in [`mesa_llm/parallel_stepping.py`](https://github.com/mesa/mesa-llm/blob/a8161c735478bc7aeb4911335954029de53b3728/mesa_llm/parallel_stepping.py) lack error isolation:

1. **`step_agents_parallel()`** (line 31): `asyncio.gather(*tasks)` is called without `return_exceptions=True`, so the first exception propagates to the caller immediately. `gather` itself does not cancel the remaining tasks, but nothing awaits them any more, and they are cancelled when the event loop shuts down.
2. **`step_agents_multithreaded()`** (lines 52–53): bare `future.result()` in a loop with no `try/except`, so the first exception aborts result collection for the remaining agents.
3. **`_agentset_do_async()`** (line 124): same `asyncio.gather` issue as #1.
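
To illustrate the isolation that is missing in both kinds of path, here is a minimal standalone sketch using only the standard library. It does not use mesa-llm; `flaky`, `healthy`, `gather_isolated`, and `threads_isolated` are hypothetical names chosen for this example and are not part of the library.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def flaky():
    # Stand-in for an agent whose LLM call fails.
    raise RuntimeError("LLM timeout")


async def healthy(tag):
    await asyncio.sleep(0.01)  # stand-in for a slow LLM call
    return tag


async def gather_isolated():
    # With return_exceptions=True, a failure is returned as a value in the
    # result list (in input order) instead of propagating out of gather().
    return await asyncio.gather(
        flaky(), healthy("a"), healthy("b"), return_exceptions=True
    )


def threads_isolated(callables):
    # Wrapping each future.result() in try/except keeps one failing
    # callable from stopping collection of the remaining results.
    outcomes = []
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn) for fn in callables]
        for future in futures:
            try:
                outcomes.append(future.result())
            except Exception as exc:
                outcomes.append(exc)
    return outcomes


def fail():
    raise ValueError("malformed JSON")


async_outcomes = asyncio.run(gather_isolated())
thread_outcomes = threads_isolated([fail, lambda: "ok"])
```

In both cases the failing call ends up as an exception object in the outcome list while the other calls run to completion.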

## Reproduction

```python
from mesa.model import Model
from mesa.agent import Agent
from mesa_llm.parallel_stepping import step_agents_parallel

class DummyModel(Model):
    def __init__(self):
        super().__init__(seed=42)

class FailingAgent(Agent):
    def __init__(self, model):
        super().__init__(model)

    async def astep(self):
        raise RuntimeError("LLM timeout")

class WorkingAgent(Agent):
    def __init__(self, model):
        super().__init__(model)
        self.counter = 0

    async def astep(self):
        self.counter += 1

import asyncio

async def main():
    m = DummyModel()
    failing = FailingAgent(m)
    working = WorkingAgent(m)
    await step_agents_parallel([failing, working])
    print(working.counter)  # Never reached

asyncio.run(main())
```

## Actual Behavior

```
Traceback (most recent call last):
  File "repro.py", line 28, in main
    await step_agents_parallel([failing, working])
  File ".../mesa_llm/parallel_stepping.py", line 31, in step_agents_parallel
    await asyncio.gather(*tasks)
  File "repro.py", line 14, in astep
    raise RuntimeError("LLM timeout")
RuntimeError: LLM timeout
```

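The exception propagates out of `step_agents_parallel()`, so `print(working.counter)` is never reached and the caller gets no result for `WorkingAgent`. Any agent task still pending at that point is no longer awaited and is cancelled when `asyncio.run()` tears down the event loop. The entire simulation crashes.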

## Expected Behavior

- Failed agents should be **isolated** — other agents complete normally
- Failures should be **logged** with agent ID and exception details
- Users should be able to **inspect** which agents failed and why (e.g., via a structured result object)
- An optional **"raise" mode** should be available for debugging, but the default should be resilient
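
One possible shape for this behavior is sketched below. It is only an illustration under stated assumptions: `StepReport`, `step_agents_resilient`, the `on_error` parameter, and `StubAgent` are hypothetical names invented for this example and do not exist in mesa-llm. The stub agents stand in for real Mesa agents and only mimic `unique_id` and `astep()`.

```python
import asyncio
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("parallel_stepping_sketch")


@dataclass
class StepReport:
    """Hypothetical structured result: which agents stepped and which failed."""
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # agent id -> exception


async def step_agents_resilient(agents, on_error="collect"):
    # Collect exceptions as values so every agent gets to finish its step.
    outcomes = await asyncio.gather(
        *(agent.astep() for agent in agents), return_exceptions=True
    )
    report = StepReport()
    for agent, outcome in zip(agents, outcomes):
        agent_id = getattr(agent, "unique_id", id(agent))
        if isinstance(outcome, BaseException):
            logger.error("agent %s failed: %r", agent_id, outcome)
            report.failed[agent_id] = outcome
        else:
            report.succeeded.append(agent_id)
    # Optional debugging mode: re-raise the first failure after all agents ran.
    if on_error == "raise" and report.failed:
        raise next(iter(report.failed.values()))
    return report


class StubAgent:
    """Minimal stand-in for a Mesa agent (assumption for this sketch)."""

    def __init__(self, unique_id, fail=False):
        self.unique_id = unique_id
        self.fail = fail
        self.counter = 0

    async def astep(self):
        if self.fail:
            raise RuntimeError("LLM timeout")
        self.counter += 1


agents = [StubAgent(1, fail=True), StubAgent(2), StubAgent(3)]
report = asyncio.run(step_agents_resilient(agents))
```

With the default `on_error="collect"`, the caller can inspect `report.failed` after each step and decide whether to retry, skip, or abort; `on_error="raise"` keeps a fail-fast path for debugging.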
