Neural networks cannot be trained directly on discrete values. The common workaround is to approximate the quantized values with continuous ones, which degrades accuracy at extremely low precision, especially for small models.
This technology addresses the training of low-precision neural networks for efficient inference, which is essential for running neural networks on resource-limited hardware. The method injects noise into the network's values during training, simulating the quantization noise that will be observed at inference time.
- Makes the network robust to quantization noise, allowing it to retain high accuracy at inference time
- Particularly beneficial for small neural networks, which are the most likely to run on limited hardware
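The idea above can be sketched in a few lines. The snippet below is a minimal illustration, not the exact method: it assumes a symmetric uniform quantizer with a hypothetical step size `delta`, and replaces the non-differentiable rounding used at inference with additive uniform noise of the same magnitude during training, so gradients can flow through unchanged.

```python
import numpy as np

def quantize(x, num_bits=4, x_max=1.0):
    """Inference-time uniform quantization: round values to a discrete grid."""
    delta = 2 * x_max / (2 ** num_bits - 1)  # quantization step (assumed symmetric range)
    return np.clip(np.round(x / delta) * delta, -x_max, x_max)

def noisy_forward(x, num_bits=4, x_max=1.0, rng=None):
    """Training-time surrogate: add uniform noise in [-delta/2, delta/2],
    mimicking the quantization error without the non-differentiable round()."""
    rng = np.random.default_rng() if rng is None else rng
    delta = 2 * x_max / (2 ** num_bits - 1)
    noise = rng.uniform(-delta / 2, delta / 2, size=np.shape(x))
    return x + noise
```

Because the noise is simply added, the surrogate is differentiable with respect to `x`, so standard backpropagation applies; at inference the real `quantize` is used, and the network has already learned to tolerate errors of the same scale.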
Applications and Opportunities
- NN inference on FPGAs, mobile devices, embedded platforms, and ASICs