COMING SOON

Primo Software

Primo is an advanced AI software platform specifically designed to optimize the inference of generative AI models, ensuring seamless deployment and performance at scale. It consists of components that support model optimization, model fine-tuning, RAG-based inference, and orchestration across complex model pipelines. The initial version of Primo focuses on applications that require low-latency performance across several industry verticals, such as healthcare and life sciences, robotics, supply chain management, and public safety. The Primo platform runs on top of Esperanto’s ET-SoC-1 based platforms produced in partnership with Penguin Solutions.

Esperanto delivers the first RISC-V support for Ollama

Esperanto focuses on supporting open-source models, such as small language models and vision language models. To demonstrate our ability to run a variety of open-source generative models, we are introducing a backend AI infrastructure consisting of ET-SoC-1 based servers from Penguin Solutions, the Primo software stack running on those servers, a family of open-source models, and example applications running on these systems. We have integrated this ET-SoC-1 and Primo based generative AI infrastructure with Ollama so that anyone can access our generative AI system from the web to run their own applications or evaluate one of our existing demonstration applications. Esperanto will make all of the models supported on the Primo system available on Hugging Face.

The integration of the Esperanto Inference Server into Ollama marks a significant milestone in the democratization of Large Language Models (LLMs). This integration introduces the first RISC-V architecture-based hardware support to this popular open-source LLM framework, opening new horizons for innovation and accessibility in AI.

We invite developers, researchers, and enthusiasts to explore these new possibilities and contribute to the growing ecosystem of open and accessible AI technologies.

Transforming Open-Source SLMs into RISC-V Optimized Code

The Primo AI/ML Model Development SDK enables developers to efficiently create and deploy AI/ML models on Esperanto’s RISC-V based accelerator, leveraging a suite of open-source tools for optimization at various stages of the development pipeline.

The Primo AI/ML Model Development SDK consists of four main categories of tools:

Fine-tuning: including LoRA and Flash Attention

Quantization: including AWQ and other quantization-related tools

Exporting: including Jupyter and Torch Dynamo

ML Compilation: including ONNX Runtime and other ML compiler technologies
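To give a concrete sense of what the quantization stage of such a pipeline does, here is a minimal sketch of plain symmetric int8 weight quantization in pure Python. This is not Esperanto's tooling and not the AWQ algorithm itself (AWQ additionally scales weights using activation statistics); it only illustrates the general round-trip that quantization tools perform.

```python
# Minimal sketch of symmetric int8 weight quantization: map float weights
# to 8-bit integers plus a per-tensor scale, then recover approximations.
# Illustrative only; not the AWQ algorithm and not Esperanto's implementation.

def quantize_int8(weights):
    """Return (int8 values, scale) for a list of float weights."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Clamp to the symmetric int8 range [-127, 127] after rounding.
    return [max(-127, min(127, round(w / scale))) for w in weights], scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and a scale."""
    return [v * scale for v in q]

if __name__ == "__main__":
    w = [0.02, -1.27, 0.63, 0.0]
    q, scale = quantize_int8(w)
    print(q, scale)
    print(dequantize_int8(q, scale))
```

Fine-tuning, exporting, and compilation stages then operate on (or produce code for) weights stored in this compact integer form, which is what makes low-latency inference on the accelerator practical.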

Third-party applications deployed on cloud services, such as an AWS microservice environment, can interface through Ollama’s web API with any number of Ollama Server instances, each attached to an Esperanto RISC-V backend.
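Because the interface is standard Ollama, a client application needs nothing Esperanto-specific. The sketch below calls Ollama's documented `/api/generate` endpoint using only the Python standard library; the host URL and model name are placeholders, assuming an Ollama server reachable at its usual default port.

```python
import json
import urllib.request

def build_generate_request(host, model, prompt):
    """Build an HTTP POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # request a single JSON response instead of a stream
    }
    return urllib.request.Request(
        url=f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Placeholder host and model name: point these at an actual Ollama
    # server, e.g. one backed by an Esperanto RISC-V accelerator.
    req = build_generate_request(
        "http://localhost:11434", "llama3.2", "Why is the sky blue?"
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

Any framework that already speaks to Ollama, such as the applications listed below, can be pointed at such an endpoint without code changes.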

Check out Esperanto’s supported applications

Open Web UI

In-browser chat application
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs. This sample application shows how we integrated Open WebUI with the Esperanto Inference Server via the Ollama framework.

Each chat prompt is serviced by a single ET-SoC accelerator card.

How to use:

  1. Log in to chat.esperanto.ai
  2. Select the model from the dropdown menu.
  3. Start chatting with the selected model running on the Esperanto cards.

Page Assist

Sidebar-based Chrome extension
A Chrome extension that assists you as you browse the web. Page Assist can be launched as a sidebar or a web UI.

RAG Flow

Web-based RAG flow
An open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

TWINNY

Visual Studio Code Copilot-style extension
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code – like GitHub Copilot but completely free and 100% private.

MORE DETAILS COMING SOON

Explore how our stack supports workloads
in a variety of industries

Healthcare

Pharmaceutical
