Stephen Panaro (@flat)'s Twitter Profile
Stephen Panaro

@flat

making coffee and other things. @BrewTimerApp

ID: 1426135542

Link: https://stephenpanaro.com
Joined: 13-05-2013 18:54:57

853 Tweets

518 Followers

26 Following


Wondering if the tiny codebook (16 elements) opens any opportunities for GPU kernels (or if the scaling vectors negate it).
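
For context, here is a rough sketch of what a 16-element codebook implies: each weight is a 4-bit index, so two indices fit in one byte and the whole lookup table is tiny. The packing order (low nibble first) and the use of one per-row and one per-column scaling vector are assumptions made for illustration, not details from the tweet, and `unpack_nibbles` and `dequantize` are hypothetical names.

```python
def unpack_nibbles(packed):
    """Split each byte into two 4-bit codebook indices (low nibble first)."""
    indices = []
    for byte in packed:
        indices.append(byte & 0x0F)
        indices.append(byte >> 4)
    return indices


def dequantize(packed, codebook, row_scale, col_scale):
    """Rebuild a rows x cols matrix from packed 4-bit indices.

    Assumed scheme: w[r][c] = codebook[index] * row_scale[r] * col_scale[c].
    """
    assert len(codebook) == 16, "a 4-bit index addresses exactly 16 entries"
    rows, cols = len(row_scale), len(col_scale)
    indices = unpack_nibbles(packed)
    assert len(indices) >= rows * cols, "not enough packed indices"
    return [
        [codebook[indices[r * cols + c]] * row_scale[r] * col_scale[c]
         for c in range(cols)]
        for r in range(rows)
    ]
```

In this sketch the lookup itself is a single table read per weight, while the two scale multiplications are the extra per-element work that the tweet suspects might cancel out the benefit.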


WWDC wishes (all long shots):
- low-level ANE access (a la kernels)
- actual quantized activations (for KV cache)
- CoreML fast Hadamard transform
- share weights between CoreML and MLX (or MLX ANE backend)
- ANE HW metrics: GB/s, FLOPs
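
As a reference point for the fast Hadamard transform wish, the following is a minimal pure-Python sketch of the standard in-place butterfly (unnormalized, natural ordering). It is not a CoreML API; the function name `fwht` is hypothetical, and a real implementation would run on vectorized hardware rather than Python lists.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)  # work on a copy so the input is left untouched
    h = 1
    while h < n:
        # Combine pairs that are h apart: (a, b) -> (a + b, a - b).
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

The transform uses n log2(n) additions and subtractions instead of the n^2 multiply-adds of a dense matrix product, and applying it twice returns the input scaled by n.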


Incoming new coremltools looks like it has some nice bits:
- 8-bit input/output tensors (previously all 8-bit compute was kept internal)
- >1 input can be enumerated shapes (👀ANE)