But maintaining it became a true albatross. Polars is an amazing project, but it moves fast and much of its development is focused on the Python library. Keeping Explorer in step with Polars was a maintenance nightmare: we eventually hit points where we had to give up features, and updating to the latest version became extremely difficult.
We also tried distributing Explorer and only got so far. A reasonable alternative to Spark was always what I wanted, and I could (tantalisingly, frustratingly) see the pieces there in dataframes and the BEAM, but couldn't make it happen.
We also always knew that the right direction was to be 'lazy by default', accumulating ops and only executing when the dataframe needs to be realised. But this was very difficult with Polars's Series API and eager/lazy split.
Enter DuckDB. A few weeks ago, I made a DuckDB backend for Explorer. In doing so, I saw that DuckDB would let us realise both the lazy-by-default and the distributed vision. So I went for it.
And here we are. Dux as in ducks as in multiple ducks. Plus an 'x' in the name because, you know, it's Elixir.
It's faster than Explorer on a single node. It has a simple, dataframe-only API. It distributes arbitrarily across Erlang clusters on the BEAM. Startup is faster than Spark, and for many use cases it's simpler and faster. DuckDB functions are all transparently available, as are custom SQL macros. We have a full graph API, as in GraphX/NetworkX. You can install and use any DuckDB extension, including when distributed. And on the maintenance side, it doesn't use a NIF (it depends on the ADBC library[2] and a DuckDB driver) -- the API is primarily about compiling to SQL.
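To make "lazy by default" and "compiling to SQL" a bit more concrete, here's a toy sketch in Python. It is not Dux's API or implementation: the `LazyFrame` class, its methods, and the `events` table are hypothetical names made up for illustration. Each call only wraps the pending query in another subquery, and nothing runs until `compute` hands the final SQL to an engine.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical toy dataframe: it holds a SQL string, never any data."""

    sql: str

    @staticmethod
    def from_table(name: str) -> "LazyFrame":
        # Start from a plain table scan; still nothing is executed.
        return LazyFrame(f"SELECT * FROM {name}")

    def filter(self, predicate: str) -> "LazyFrame":
        # Accumulate the op by wrapping the pending query as a subquery.
        return LazyFrame(f"SELECT * FROM ({self.sql}) AS t WHERE {predicate}")

    def select(self, *columns: str) -> "LazyFrame":
        cols = ", ".join(columns)
        return LazyFrame(f"SELECT {cols} FROM ({self.sql}) AS t")

    def compute(self, execute: Callable[[str], Any]) -> Any:
        # Only here is the accumulated SQL handed to an engine.
        # `execute` is an assumed callback, e.g. a database connection's query function.
        return execute(self.sql)


query = (
    LazyFrame.from_table("events")
    .filter("amount > 10")
    .select("user_id", "amount")
)
print(query.sql)
```

Because the whole pipeline is a single SQL string until it is realised, the engine's optimiser sees all of it at once, and the same string could in principle be shipped to another node to run there.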
DuckDB is incredible for OLAP on larger-than-memory data. Distribution enables fast exploration of in-memory data and real-time applications. The BEAM gives us battle-hardened distribution almost for free.
Give it a shot! I'd love feedback and of course PRs are welcome. Oh, I also made a webpage for it[3].
[1] https://github.com/elixir-explorer/explorer
[2] https://github.com/livebook-dev/adbc
[3] https://dux.now