HN – Show HN: Smile-Serve – Inference Server for ML, ONNX, and LLM

SMILE Serve is a production-ready inference server built on [Quarkus](https://quarkus.io/) that brings together three complementary inference capabilities on the JVM:

  - **Classic ML**: `/api/v1/models` for serialized SMILE models (`.sml`)
  - **ONNX Runtime**: `/api/v1/onnx` for any model in the ONNX open format (`.onnx`)
  - **LLM Chat**: `/api/v1/chat` for Llama 3 chat completions

A React-based web UI is bundled and served from the same process.
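
As a rough illustration of how a client might address the three endpoints, here is a minimal Python sketch. Only the paths come from the list above. The base URL (`http://localhost:8080`, Quarkus's default HTTP port), the capability keys, and the helper name `endpoint_url` are assumptions made for this example. Request and response formats are not described here, so the sketch stops at building URLs.

```python
# Endpoint paths as listed above. The dictionary keys ("ml", "onnx", "chat")
# are hypothetical labels chosen for this sketch, not part of SMILE Serve.
ENDPOINTS = {
    "ml": "/api/v1/models",   # serialized SMILE models (.sml)
    "onnx": "/api/v1/onnx",   # ONNX models (.onnx)
    "chat": "/api/v1/chat",   # Llama 3 chat completions
}


def endpoint_url(capability: str, base_url: str = "http://localhost:8080") -> str:
    """Return the full URL for one capability.

    The default base_url is an assumption (Quarkus listens on port 8080
    unless configured otherwise); pass your own deployment's address.
    """
    if capability not in ENDPOINTS:
        raise ValueError(
            f"unknown capability {capability!r}; expected one of {sorted(ENDPOINTS)}"
        )
    # Strip a trailing slash so the joined URL never contains "//api".
    return base_url.rstrip("/") + ENDPOINTS[capability]


if __name__ == "__main__":
    for name in ENDPOINTS:
        print(name, "->", endpoint_url(name))
```

Keeping the paths in one table means a client only has to change `base_url` when the server moves. The actual payloads for each endpoint should be taken from the project's own documentation.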
