NVIDIA DGX Spark and DGX Station: Deskside AI Supercomputers Powered by Grace Blackwell

Deskside AI Supercomputers Come to the Desktop

NVIDIA is bringing serious data center power right to the desk with its new DGX Spark and DGX Station systems. These compact AI supercomputers are designed for developers, researchers, creators and advanced users who want to run massive open and frontier AI models locally without relying entirely on the cloud.

Both systems are powered by the NVIDIA Grace Blackwell architecture and deliver petaflop class AI performance with large unified memory. That combination lets them handle models that used to demand a full data center server rack.

DGX Spark focuses on giving a wide range of developers a powerful local AI platform, while DGX Station targets frontier AI labs and enterprises that need to experiment with extremely large models up to one trillion parameters.

This shift to deskside AI matters for anyone interested in high performance computing and next generation workloads. It makes it far easier to prototype, fine tune and deploy advanced AI while keeping data and source code fully local.

DGX Spark and DGX Station Hardware and Model Capabilities

DGX Spark is built to run modern open source models that previously needed data center class hardware. With the latest NVIDIA CUDA X libraries and AI software preinstalled, it acts as a plug and play system for AI work.

Key capabilities of DGX Spark include:

  • Support for open source and enterprise models that reach 100 billion parameters on a local deskside system
  • Petaflop level AI performance powered by the new Blackwell architecture
  • Preconfigured NVIDIA AI software stack and CUDA X libraries for fast setup
  • Compatibility with popular open models and frameworks, including the NVIDIA Nemotron 3 model family

The Blackwell architecture introduces a new NVFP4 data format, which allows AI models to be compressed by up to about 70 percent while maintaining accuracy. That means higher effective performance and the ability to fit larger models into local memory without sacrificing quality.
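The roughly 70 percent figure is easy to sanity check with back of the envelope math. As a hedged sketch, assume NVFP4 stores 4 bit weight values in small blocks that share an 8 bit scale factor (16 element blocks here; the exact block size and scale format are assumptions, not confirmed by this article):

```python
# Back-of-the-envelope memory math for 4-bit block-scaled quantization.
# Assumption: 4-bit values in blocks of 16 weights sharing one 8-bit scale;
# real NVFP4 details may differ.

def bits_per_weight(value_bits: int, block_size: int, scale_bits: int) -> float:
    """Effective storage cost per weight, including the shared scale overhead."""
    return value_bits + scale_bits / block_size

FP16_BITS = 16.0
nvfp4_bits = bits_per_weight(4, 16, 8)      # 4.5 bits per weight
reduction = 1 - nvfp4_bits / FP16_BITS      # fraction of memory saved vs FP16

print(f"NVFP4: {nvfp4_bits} bits/weight, {reduction:.0%} smaller than FP16")
```

Under those assumptions the weights shrink by about 72 percent relative to FP16, in line with the "up to about 70 percent" claim.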

NVIDIA has also been working closely with the open source ecosystem. Its collaboration with llama.cpp, for example, delivers around a 35 percent average performance uplift for state of the art language models running on DGX Spark. Llama.cpp now also speeds up model loading times, which improves the quality of life for developers constantly iterating on large language models.

DGX Station is the higher end sibling for users who need extreme scale from a deskside system. It uses the GB300 Grace Blackwell Ultra superchip with 775 gigabytes of coherent memory, which allows it to run models of up to one trillion parameters at FP4 precision.
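A quick calculation shows why 775 gigabytes is enough for a trillion parameter model at 4 bit precision. This sketch assumes roughly 4.5 bits of storage per weight (4 bit values plus shared scale overhead) and deliberately ignores activations, KV cache and framework overhead, which add to the real footprint:

```python
# Rough fit check: one trillion parameters in 775 GB of coherent memory.
# Assumes ~4.5 bits/weight for 4-bit block-scaled storage (an assumption,
# not an NVIDIA figure) and counts weights only.

PARAMS = 1.0e12            # one trillion parameters
BITS_PER_WEIGHT = 4.5      # assumed 4-bit value + shared scale overhead
MEMORY_GB = 775.0          # DGX Station coherent memory

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9   # bits -> bytes -> GB
print(f"Weights alone: {weights_gb:.1f} GB of {MEMORY_GB:.0f} GB")
```

The weights alone come to about 562 gigabytes, leaving headroom for runtime state, which is why one trillion parameters is a practical ceiling rather than a comfortable midpoint.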

That capability opens the door to experimenting with advanced frontier models such as:

  • Kimi K2 Thinking
  • DeepSeek V3.2
  • Mistral Large 3
  • Meta Llama 4 Maverick
  • Qwen3 families
  • OpenAI gpt-oss-120b class models

Developers behind frameworks like vLLM and SGLang highlight a big practical advantage of DGX Station. Instead of needing access to full rack scale GB300 deployments, they can now test GB300 specific features and huge model configurations directly on a single deskside box. That reduces costs, shortens iteration cycles and removes a lot of friction from framework development.

At CES, NVIDIA is using DGX Station to demonstrate high level workloads such as:

  • Language model pretraining at roughly 250,000 tokens per second
  • Large data visualization of millions of clustered data points using NVIDIA cuML acceleration
  • Text to Knowledge Graph pipelines powered by Llama 3.3 Nemotron Super 49B
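The 250,000 tokens per second pretraining figure gives a feel for local training timescales. As a sketch, with a hypothetical 10 billion token corpus (an illustrative number, not one from the demo):

```python
# How long would pretraining take at the demoed throughput?
# The dataset size below is a hypothetical example, not an NVIDIA figure.

TOKENS_PER_SECOND = 250_000        # demoed pretraining throughput
dataset_tokens = 10e9              # hypothetical 10 billion token corpus

seconds = dataset_tokens / TOKENS_PER_SECOND
hours = seconds / 3600
print(f"{dataset_tokens:.0e} tokens at {TOKENS_PER_SECOND:,} tok/s ≈ {hours:.1f} hours")
```

That works out to roughly 11 hours for one pass over 10 billion tokens, which puts small scale pretraining experiments within overnight reach on a single deskside machine.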

Creator Workflows, Local AI Agents and Real World Use Cases

NVIDIA is not limiting these systems to pure research. DGX Spark and DGX Station are designed to support the full AI lifecycle across many industries including healthcare, robotics, retail and creative content.

For creators, the focus is on modern image, diffusion and video generation models. Popular models such as Black Forest Labs FLUX.2 and FLUX.1, as well as Alibaba Qwen Image, now support NVFP4, which cuts memory usage and boosts performance on compatible NVIDIA GPUs. The Lightricks LTX 2 video model is also optimized with NVFP8 quantized checkpoints, delivering quality comparable to top cloud hosted models while running locally.

Live CES demos show DGX Spark offloading demanding video generation workloads from creator laptops. In one comparison, DGX Spark delivers roughly an eight times speedup over a high end MacBook Pro with an M4 Max chip. The idea is to let creators keep using their regular machines for editing and design while the heavy generation tasks run on the DGX box in the background.

For modders and game artists, NVIDIA is tying DGX Spark into the open source RTX Remix platform. The goal is to let mod teams offload all asset creation to DGX Spark while they continue to tweak levels and gameplay on their own PCs. With generative AI helping build textures and assets, teams can instantly see their changes in game without pauses or long waits.

Developer productivity is another focus. NVIDIA is showing a local CUDA coding assistant powered by NVIDIA Nsight running directly on DGX Spark. This assistant helps write and optimize GPU code while keeping the entire source tree on premises. For teams with strict IP and security requirements, that combination of local data and powerful AI assistance can be a strong draw.

Industry partners are already using DGX Spark as a local AI engine for edge and agentic workflows. Hugging Face is pairing it with the Reachy Mini robot to build embodied AI agents that can see, listen and respond with expressive motion. IBM is using DGX Spark with its OpenRAG stack to deliver a complete retrieval augmented generation box with extraction, embedding, retrieval and inference all running locally.
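The retrieval augmented generation loop that such a box runs locally, extraction, embedding, retrieval and inference, can be sketched in a few lines. This is a generic illustration, not IBM's OpenRAG API; the bag of words "embedding" and the stubbed generate function stand in for real models:

```python
# Minimal sketch of a fully local RAG loop: embed documents, retrieve the
# best match for a query, then hand it to a (stubbed) local model.
# Generic illustration only -- not IBM's OpenRAG API.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stands in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

def generate(query: str, context: str) -> str:
    """Stub for a local LLM call (e.g. a model served on the deskside box)."""
    return f"Answer to {query!r} using context: {context!r}"

docs = [
    "DGX Spark runs models up to 100 billion parameters locally",
    "DGX Station supports models up to one trillion parameters",
]
query = "How many parameters can DGX Station handle?"
context = retrieve(query, docs)
print(generate(query, context))
```

The appeal of running this loop on a single local machine is that documents, embeddings and model outputs never leave the premises, which is exactly the governance story these partner stacks are built around.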

JetBrains is positioning DGX Spark for customers that want high performance AI with strict control over governance and intellectual property, whether they deploy in cloud, on premises or hybrid environments.

There are also more futuristic examples. TRINITY, an intelligent self balancing three wheeled single passenger urban vehicle, uses DGX Spark as its AI brain for real time vision language model inference. The idea is to treat the vehicle as an AI agent with conversational and goal tracking workflows designed for connected cities.

To help developers get started quickly, NVIDIA is expanding its library of DGX Spark playbooks. New and updated playbooks cover areas such as small scale Nemotron 3 Nano models, robotics training, vision language models, fine tuning using two DGX Spark systems, genomics and financial analysis. DGX Station will receive its own set of playbooks as it reaches general availability.

Both DGX Spark and partner GB10 systems are already available from major OEMs and retailers, and DGX Station is expected to arrive from partners like ASUS, Dell Technologies, HP, MSI and others starting in spring 2026. With these systems, NVIDIA is clearly betting on a future where powerful AI is not just in the cloud or data center but also sitting right next to the keyboard.

Original article and image: https://blogs.nvidia.com/blog/dgx-spark-and-station-open-source-frontier-models/
