Nexa SDK
About this tool
Name
Nexa SDKCategory
toolsNexa SDK is a versatile on-device AI inference framework that allows you to run any model on any device, across CPUs, GPUs, and NPUs, with support for backends like CUDA, Metal, Vulkan, and Qualcomm NPU. It handles multiple input modalities, including text, image, and audio, and features an OpenAI-compatible API server with JSON schema-based function calling and streaming. Supporting model formats like GGUF, MLX, and Nexa AI’s proprietary .nexa format, Nexa SDK delivers efficient, quantized inference across diverse platforms, making it ideal for developers building AI applications with high performance and flexibility.
How to use
Integrate Nexa SDK into your application on your target device
Load your AI model in a supported format (GGUF, MLX, or .nexa)
Configure the backend (CPU, GPU, or NPU) for optimal performance
Send input data—text, image, or audio—to the model via the OpenAI-compatible API
Retrieve results in real time and leverage streaming or function calls as needed
tools
Genve AI
tools
Hypotenuse AI
tools
MetaVoice Studio
tools
Open Voice OS
tools
Shuffll
tools
Topaz Video AI
tools
Muse.ai
tools