The main design choice we made is a strict cloud/local split.
The cloud handles research, backtests, rollout artifacts, and redacted telemetry. A local agent (running in the client environment) stores broker keys, sends orders, and enforces hard risk caps. The cloud can only send lifecycle commands (start/stop), not order instructions. Any trade-impacting change requires local approval.
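To make the boundary concrete, here is a minimal sketch of what the local agent's gate could look like. It is only an illustration under assumptions: the names (`LocalAgent`, `handle_cloud_command`, `stage_change`, `approve_locally`, `max_order_notional`) and the message shape are hypothetical, not taken from the actual system.

```python
from dataclasses import dataclass, field
from enum import Enum


class LifecycleCommand(Enum):
    """The only commands the cloud is allowed to send (assumed set)."""
    START = "start"
    STOP = "stop"


@dataclass
class LocalAgent:
    # Hard risk cap, configured locally; the cloud has no way to change it directly.
    max_order_notional: float
    running: bool = False
    live_params: dict = field(default_factory=dict)
    staged_changes: dict = field(default_factory=dict)

    def handle_cloud_command(self, message: dict) -> bool:
        """Accept a cloud message only if it is a bare lifecycle command."""
        # Reject anything carrying extra fields (e.g. smuggled order details).
        if set(message) != {"command"}:
            return False
        try:
            command = LifecycleCommand(message["command"])
        except ValueError:
            return False  # unknown command, e.g. "place_order"
        self.running = command is LifecycleCommand.START
        return True

    def stage_change(self, change_id: str, params: dict) -> None:
        """A trade-impacting change is only staged; it does not take effect yet."""
        self.staged_changes[change_id] = params

    def approve_locally(self, change_id: str) -> None:
        """Called by a local operator; only now does the change go live."""
        self.live_params.update(self.staged_changes.pop(change_id))

    def check_order(self, quantity: float, price: float) -> bool:
        """Local pre-trade check: agent must be running and the order under the cap."""
        return self.running and abs(quantity * price) <= self.max_order_notional
```

In this sketch the agent allow-lists lifecycle commands rather than block-listing order instructions, so a new cloud message type is rejected by default until the local side explicitly supports it.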
We chose this because we wanted a clean boundary: the cloud helps with research and operations, but execution stays under the client's control.
I’d really appreciate feedback from people who’ve built or operated trading systems: What would you challenge first in this design (failure modes, trust assumptions, operational risks)?
Not selling anything here, just looking for honest technical feedback before we take it further.