
Polymarket vs Kalshi: What Building an Autonomous Prediction Trader Taught Us About Liquidity

Forty-nine days of zero executions on a $1,100 bankroll exposed that our execution freeze was caused by an environment variable name mismatch, not by market structure.


EganForge Team

May 24, 2026

Coinbase reported that prediction markets on its platform generated a $100M annualized revenue run rate in less than two months in Q1 2026. Polymarket has recorded $2.7B in aggregate lifetime volume, including $128M in the crypto category and $32.9M specifically on markets tied to Coinbase itself. Kalshi launched with explicit US regulatory approval, which gives it a different risk and compliance profile. Those are the headline numbers any builder in this space sees first.

EchoSwarm is our autonomous prediction trader. It fields five specialist agents: `ai_probability_specialist`, `smart_money_tracker`, `logical_arb_scanner`, `correlation_arb_scanner`, and `lp_rewards_lane`. The swarm runs in twenty Docker containers, and every trade requires multi-agent consensus. Over its lifetime, the system has delivered 1,992 signals. The `ai_probability_specialist` agent alone produces 641 signals per day. Total revenue captured from those signals is $337.02 USDC. The bankroll currently sits at $1,100 USDC. The last executed trade was on 2026-04-05.

That is 49 days of zero trade execution as of today.

The root cause was not liquidity, spread width, or regulatory access. It was a string mismatch between documentation and live configuration. The documented variables were `MIN_EDGE` and `MIN_CONFIDENCE`. The variables actually present in the `.env` file were `MISPRICE_THRESHOLD=10.0` and `CONSENSUS_EDGE_THRESHOLD=25.0`. Every audit that grepped for the documented names found nothing and reported that threshold logic was absent or disabled. The code continued to run, the specialists continued to emit proposals, and the consensus step continued to evaluate against the real but differently named variables. Because the names never matched, the effective edge and confidence gates were never applied the way the design assumed.
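The drift described above is easy to surface mechanically. The following is a minimal sketch, not our production audit: it assumes a simple `KEY=VALUE` `.env` format, and the helper names `parse_env_names` and `diff_names` are hypothetical. It compares the documented names against the names actually defined and reports both directions of mismatch, instead of grepping only for the documented names.

```python
def parse_env_names(env_text: str) -> set[str]:
    """Return the variable names defined in .env-style text (KEY=VALUE lines)."""
    names = set()
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        names.add(line.split("=", 1)[0].strip())
    return names


def diff_names(documented: set[str], live: set[str]) -> dict[str, list[str]]:
    """Report names that exist on only one side."""
    return {
        "documented_but_missing": sorted(documented - live),
        "live_but_undocumented": sorted(live - documented),
    }


# Names taken from the incident described in this post.
DOCUMENTED = {"MIN_EDGE", "MIN_CONFIDENCE"}
ENV_TEXT = """\
# thresholds
MISPRICE_THRESHOLD=10.0
CONSENSUS_EDGE_THRESHOLD=25.0
"""

report = diff_names(DOCUMENTED, parse_env_names(ENV_TEXT))
for kind, names in report.items():
    print(kind, names)
```

An audit that reports both lists would have shown two documented names with no definition and two live thresholds with no documentation, which is a very different finding from "threshold logic is absent".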

Paper mode was another open loop. The explicit test window ended on 2026-05-18, but no hard decision gate was wired to the end of that window. The flag remained in the environment, the swarm continued to propose, and the proposals continued to be discarded by the paper-mode guard. Without an explicit deadline plus a required human or automated verdict, paper mode became an indefinite shadow mode.
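A decision gate of the kind that was missing can be very small. This is a sketch under assumptions: `PaperWindow`, `PaperWindowExpired`, and `check_paper_window` are hypothetical names, the decision values are illustrative, and the end date is treated as midnight UTC because the original window did not specify a timezone.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PaperWindow:
    ends_at: datetime
    # Hypothetical verdict values, e.g. "go_live", "extend", "shut_down".
    decision: Optional[str] = None


class PaperWindowExpired(RuntimeError):
    """Raised when the window has ended and nobody recorded a verdict."""


def check_paper_window(window: PaperWindow, now: datetime) -> None:
    """Fail the cycle start if the window is over and no decision exists."""
    if now >= window.ends_at and window.decision is None:
        raise PaperWindowExpired(
            f"paper window ended {window.ends_at.isoformat()} with no decision record"
        )


# Assumed UTC; the dates are the ones from this post.
window = PaperWindow(ends_at=datetime(2026, 5, 18, tzinfo=timezone.utc))
try:
    check_paper_window(window, now=datetime(2026, 5, 24, tzinfo=timezone.utc))
except PaperWindowExpired as exc:
    print("cycle start blocked:", exc)
```

Calling a check like this at the top of every cycle turns a forgotten flag into a loud failure rather than a silent discard of proposals.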

During the same period the TauricResearch TradingAgents repository was released. Multi-agent LLM trading systems moved from internal experiment to open source artifact. The external bar for "this is possible" dropped while our own execution path remained frozen.

From the perspective of someone building an autonomous system that must size, enter, and exit without human intervention, the Polymarket versus Kalshi comparison is not primarily about total volume or regulatory status. It is about the shape of liquidity on the specific contracts the agents are likely to select and the reliability of the unwind path.

Polymarket offers deep books on many contracts and meaningful volume on crypto-adjacent events. The depth is real, but the ability to exit a 2-3% position in under a minute without moving the mid by several points varies sharply by contract and by time of day. Kalshi's regulated status reduces certain tail risks for a US entity, but the order book depth on the narrower event contracts that an autonomous specialist would actually want to size into is thinner. For code that calculates expected slippage on exit before it ever submits the entry, the relevant data is the current ladder and recent trade history for that exact contract, not the 24-hour aggregate volume banner.
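To make "calculate expected slippage on exit" concrete, here is a minimal sketch that walks a bid ladder for a sell of a given size. It is venue-agnostic and makes assumptions: the ladder is a list of `(price, quantity)` tuples sorted best bid first, fees are ignored, and the function name `expected_exit_slippage` and the sample book are hypothetical rather than real Polymarket or Kalshi data.

```python
from typing import List, Tuple


def expected_exit_slippage(
    bids: List[Tuple[float, float]], size: float
) -> Tuple[float, float]:
    """Walk the bid ladder to sell `size` contracts.

    Returns (average fill price, slippage versus the best bid).
    Raises ValueError if the visible depth cannot absorb the order.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not bids:
        raise ValueError("empty bid ladder")

    best_bid = bids[0][0]
    remaining = size
    proceeds = 0.0
    for price, qty in bids:
        take = min(remaining, qty)
        proceeds += take * price
        remaining -= take
        if remaining <= 0:
            break

    if remaining > 0:
        raise ValueError("insufficient visible depth to exit")

    avg_price = proceeds / size
    return avg_price, best_bid - avg_price


# Hypothetical ladder: (price, quantity), best bid first.
ladder = [(0.62, 100), (0.60, 200), (0.55, 500)]
avg, slip = expected_exit_slippage(ladder, size=250)
print(f"average exit price {avg:.3f}, slippage vs best bid {slip:.3f}")
```

Selling 250 contracts here consumes the whole first level and part of the second, so the average fill lands below the best bid. An agent can compare that figure against its modeled edge and skip the entry when the exit alone would consume it.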

We now have a short list of changes that would have prevented the 49-day stall:

  1. On every deploy, assert that the environment variable names referenced in the running code match the names documented in the deployment notes and in the risk spec. A one-line diff between the two sources would have caught the drift.
  2. Any paper-mode window must carry an explicit end timestamp and a required decision record before the flag can be extended. The absence of that record should fail the next cycle start.
  3. The exit-position flow must be exercised and measured before any entry-position logic is allowed to activate. In our case the entry proposals were being generated and the exit path was never invoked because no positions existed.
  4. Begin with a single market category and a single specialist rather than five parallel agents. The coordination overhead and the surface area for configuration drift scale with the number of independent specialists.
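For the first item, the names "referenced in the running code" can be extracted from the source itself rather than maintained by hand. The sketch below assumes the code is Python and reads configuration through `os.getenv`, `os.environ.get`, or `os.environ[...]` with string-literal keys; it needs Python 3.9 or later for `ast.unparse`. The function name `env_names_in_source` and the `PAPER_MODE` variable in the sample are hypothetical.

```python
import ast


def env_names_in_source(source: str) -> set[str]:
    """Collect literal names read via os.getenv, os.environ.get, os.environ[...]."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and node.args:
            # os.getenv("X") or os.environ.get("X")
            target = ast.unparse(node.func)
            first = node.args[0]
            if (
                target in {"os.getenv", "os.environ.get"}
                and isinstance(first, ast.Constant)
                and isinstance(first.value, str)
            ):
                names.add(first.value)
        elif isinstance(node, ast.Subscript) and ast.unparse(node.value) == "os.environ":
            # os.environ["X"]
            key = node.slice
            if isinstance(key, ast.Constant) and isinstance(key.value, str):
                names.add(key.value)
    return names


# Hypothetical module text; PAPER_MODE is an illustrative name only.
SAMPLE = '''
import os
edge = float(os.environ["MISPRICE_THRESHOLD"])
consensus = float(os.getenv("CONSENSUS_EDGE_THRESHOLD", "25.0"))
paper = os.environ.get("PAPER_MODE")
'''

DOCUMENTED = {"MIN_EDGE", "MIN_CONFIDENCE"}
referenced = env_names_in_source(SAMPLE)
undocumented = sorted(referenced - DOCUMENTED)
if undocumented:
    print("deploy check failed, undocumented names:", undocumented)
```

A deploy script could exit non-zero whenever `undocumented` is non-empty. Names built dynamically at runtime would escape a static scan like this, so it complements a review of the deployment notes and does not replace one.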

The open source release of TradingAgents changes the cost calculation for future work. The base capability is now cheaper to stand up. The expensive part remains the same: making sure the autonomous system actually executes when the model says the edge is present, and that the configuration the code reads is the configuration the documentation claims.

