nightclaude

Can a large language model run a disciplined trading strategy? Not chasing meme stocks, but actually managing risk the way a quant desk would? nightclaude is my attempt to find out, in public, with a real portfolio on the line.

Every night, Claude reads the market and outputs a single number: a target leverage signal, capped at 3×. That signal allocates a $100,000 portfolio across UPRO (3× S&P), SSO (2× S&P), and plain SPY, then rebalances through Alpaca's commission-free brokerage API. The strategy is vol-targeted: it sizes exposure by volatility, volatility-of-volatility, and momentum, and pulls back hard in drawdowns.
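To make the leverage-to-allocation step concrete, here is a minimal sketch of one way a single leverage number could be split across the three funds. It is an illustration, not the project's actual allocation code: the function name allocate, the CASH bucket, and the rule of blending the two funds that bracket the target are all assumptions. It also treats UPRO and SSO as exact 3× and 2× exposure, which holds only approximately because those funds target daily multiples.

```python
def allocate(leverage: float, equity: float = 100_000.0) -> dict[str, float]:
    """Map a target leverage to dollar allocations (hypothetical blending rule)."""
    # Enforce the 3x cap from the text; negative targets are treated as flat.
    lev = max(0.0, min(3.0, leverage))
    weights = {"SPY": 0.0, "SSO": 0.0, "UPRO": 0.0, "CASH": 0.0}
    if lev <= 1.0:
        # Below 1x: hold SPY and leave the remainder in cash.
        weights["SPY"] = lev
        weights["CASH"] = 1.0 - lev
    elif lev <= 2.0:
        # Between 1x and 2x: blend SPY (1x) with SSO (2x).
        weights["SSO"] = lev - 1.0
        weights["SPY"] = 2.0 - lev
    else:
        # Between 2x and 3x: blend SSO (2x) with UPRO (3x).
        weights["UPRO"] = lev - 2.0
        weights["SSO"] = 3.0 - lev
    return {symbol: round(w * equity, 2) for symbol, w in weights.items()}


def effective_leverage(dollars: dict[str, float]) -> float:
    """Recover the blended leverage implied by a set of dollar allocations."""
    multiples = {"SPY": 1.0, "SSO": 2.0, "UPRO": 3.0, "CASH": 0.0}
    total = sum(dollars.values())
    return sum(multiples[s] * d for s, d in dollars.items()) / total
```

Under this rule a signal of 1.5 puts half the portfolio in SPY and half in SSO, and any signal above 3 collapses to 100% UPRO.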

What keeps it honest

It runs live. A public scorecard tracks the equity curve against the SPY benchmark, wins and losses alike. No backtest cherry-picking.
It's fully automated. A nightly pipeline pulls data, prompts the model, executes the trades, and emails every fill plus a weekly summary.
It's an experiment, not advice. The point is to study whether an LLM can run a rules-shaped strategy reliably, not to manage anyone's savings.

Why I built it

I wanted to pressure-test the agentic loop on a problem where the feedback is brutal and unambiguous: the market tells you every single day whether you were right. Building it meant turning a fuzzy model into a disciplined decision-maker: bounding its outputs, instrumenting every step, and grading it against a benchmark instead of vibes. That's the same muscle real AI products demand: take a powerful-but-unpredictable model and wrap it in enough structure, evaluation, and guardrails to trust it in production.
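As a rough illustration of what bounding the model's outputs can mean in practice, the sketch below parses a reply into a leverage number and clamps it to the 3× cap. The names parse_leverage and MAX_LEVERAGE, the assumption that the model replies with a bare number, and the choice to fall back to zero exposure on a bad reply are all hypothetical, not taken from the real pipeline.

```python
import math

MAX_LEVERAGE = 3.0  # cap stated in the strategy description


def parse_leverage(reply: str, fallback: float = 0.0) -> float:
    """Turn a raw model reply into a leverage value within [0, MAX_LEVERAGE]."""
    try:
        # Assumes the prompt asks for a bare number and nothing else.
        value = float(reply.strip())
    except ValueError:
        # Anything that is not a plain number is rejected outright.
        return fallback
    if not math.isfinite(value):
        # float() accepts "nan" and "inf"; neither is a usable signal.
        return fallback
    return max(0.0, min(MAX_LEVERAGE, value))
```

Rejecting malformed replies rather than trying to extract a number from free text is a deliberate choice in this sketch: a chatty or confused reply results in no exposure instead of a guess.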

See the live scorecard at nightclaude.com.