Algorithm Optimization
Many computer scientists are researching ways to reduce the amount of computation needed to train neural networks. Two of these methods are pruning and quantization. Pruning removes parameters that do not contribute meaningfully to the model's overall performance, making the model smaller. Quantization makes the remaining parameters leaner by lowering the amount of memory each parameter occupies within the computer. Each change has only a minor effect on its own, but applied across the billions of parameters that most modern AI systems contain, the savings become massive; for example, quantization can reduce memory requirements by up to 51 percent. Both techniques change the internal process of training AI to lower the amount of energy needed, which is where the name algorithm optimization comes from (Ars Technica).
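A minimal sketch of the two ideas, using a toy array of weights rather than a real neural network (the threshold, scale factor, and 8-bit target here are illustrative choices, not values from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)  # toy "layer" of parameters

# Pruning: zero out parameters whose magnitude falls below a threshold,
# so they no longer contribute and can be stored or skipped sparsely.
threshold = 0.1
pruned = np.where(np.abs(weights) < threshold, 0.0, weights).astype(np.float32)

# Quantization: map the surviving float32 values onto 8-bit integers,
# cutting per-parameter storage from 4 bytes to 1 byte.
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)

print("zeroed parameters:", int(np.sum(pruned == 0)))
print("float32 bytes:", weights.nbytes, "-> int8 bytes:", quantized.nbytes)
```

In this sketch the int8 array occupies a quarter of the float32 array's memory; real systems see different savings depending on the quantization format and how the pruned zeros are stored.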