WebGPU Transformers
Native Client-side Inference
Model: Not loaded
System
Welcome! Load a GGUF model to start chatting natively in your browser using WebGPU.
Note: This is an experimental toy implementation. Loading models larger than 1 GB into Float32 memory arrays will likely crash the browser tab. Try TinyLlama or another extremely small model!
