Show HN: Smile-Serve – Inference Server for ML, ONNX, and LLM

  • Posted 2 hours ago by haifeng
  • 2 points
https://github.com/haifengl/smile/tree/master/serve

SMILE Serve is a production-ready inference server built on [Quarkus](https://quarkus.io/) that brings together three complementary inference capabilities on the JVM:

  - **Classic ML**: `/api/v1/models` for serialized SMILE models (`.sml`)
  - **ONNX Runtime**: `/api/v1/onnx` for any model in the ONNX open format (`.onnx`)
  - **LLM Chat**: `/api/v1/chat` for Llama 3 chat completions

A React-based web UI is bundled and served from the same process.
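
For anyone who wants to poke at the endpoints from the JVM, here is a minimal sketch using the JDK's built-in `java.net.http` classes. Only the three paths come from the list above. The base URL (`http://localhost:8080`, the Quarkus default port), the use of `POST` with a JSON body, and the `messages` payload shape are all assumptions for illustration and may not match the actual API, so check the repo before relying on them. The helper `jsonPost` and the class name are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class SmileServeRequests {
    // Endpoint paths as listed in the post.
    public static final String MODELS_PATH = "/api/v1/models";
    public static final String ONNX_PATH = "/api/v1/onnx";
    public static final String CHAT_PATH = "/api/v1/chat";

    /**
     * Builds (but does not send) a JSON POST request.
     * ASSUMPTION: the server accepts POST with Content-Type application/json;
     * the post itself does not state the method or the body schema.
     */
    public static HttpRequest jsonPost(String baseUrl, String path, String jsonBody) {
        // Drop a trailing slash so base + path never contains "//".
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(base + path))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // ASSUMPTION: an OpenAI-style "messages" array; the real schema may differ.
        String payload = "{\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}";
        HttpRequest request = jsonPost("http://localhost:8080", CHAT_PATH, payload);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The sketch stops at building the request so it runs without a server. To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` once the server is up.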

0 comments