By Levi Li, DIGITIMES Asia, Taipei
Publication Date: 2026-02-25 01:45:00
Toronto-based AI chip startup Taalas says it can hardwire a large language model directly into silicon to accelerate inference beyond what conventional GPUs can deliver. The company was founded in 2023, and its first product, the HC1 inference chip, generates nearly 17,000 tokens per second for a single user running Meta’s Llama 3.1 8B. According to the company's own benchmarks, that is 48 times the performance of Nvidia’s B200 under the same configuration.