WebGPU Transformers
Native Client-side Inference
Model: Not loaded
System
Welcome! Load a GGUF model to start chatting natively in your browser using WebGPU.
Note: This is an experimental toy implementation. Loading models larger than 1 GB into Float32 memory arrays will likely crash the browser tab. Try TinyLlama or another extremely small model!
