Co-founder of CoCoPIE

Since the first silicon integrated circuit was created in 1958, manufacturers have discovered thousands of ways to increase production yields. From new chemical processes to innovative chip designs, manufacturers have ramped up production with ever more efficient methods. However, a time of reckoning appears to be upon us, as demand is outstripping production capacity.

Today, there seems to be a silicon chip in nearly everything, driven in part by Internet of Things (IoT) devices. The pandemic-era “chip crunch” and surging demand for AI-enabled applications, such as natural language agents, selfie filters and recommendation engines, have only widened the gap between supply and demand.

The fact is, AI itself has always been short on chips. The sort of systems that utilize the full power of AI are not easy to come by, so companies have been clever about how they use the tech. This might mean offloading AI workloads to data centers, but that only works if the application can tolerate the latency of sending data back and forth.

Given the manufacturing crisis, how can AI expand if constrained by either latency or a lack of onboard chipsets? How can software help alleviate the overall chip shortage? There are three ways: speed to market, affordability and extending product life cycles.

Speed to Market

Consider the complexity of integrating AI into any product. There isn’t a one-size-fits-all model, as each application is specific to its industry, and it takes a long time to plan and deploy any AI solution. However, getting to market quickly is critical for most tech businesses, and that’s where software can help if implemented efficiently. Instead of custom chips, for example, perhaps something already designed would work? Eking out performance from older chipsets with higher yields (simpler chips rather than extravagant ones) means a product can hit the market faster, without waiting for all-new chipsets to be designed and manufactured.


Affordability

A “simpler” chip is cheaper, but factor in the design cost, which includes the hours of engineering work, and a full-scale custom AI chip could be years in the making and cost millions. Some larger silicon-based companies might be able to absorb the cost of bleeding-edge tech, but customers may balk, and smaller players can’t compete. By overlaying software on top of chips already in the market, business owners gain the freedom to implement AI solutions where needed at a fraction of the cost.

Lengthening the Product Life Cycle

Finally, enhancing older-generation chips keeps off-the-shelf devices relevant for longer. Think about the billions of devices already in use. If even a fraction of those could be used to run new AI applications, the need for millions of new chips would dissipate. Extending the product life cycle isn’t just good business; it’s good for the environment and allows consumers to enjoy new tech without buying new devices.

However, making all of this work requires some effort on the part of software developers. We are hardly strangers to squeezing performance out of older machines. The U.S. space shuttle was kept in operation years past its prime in part because of software upgrades, and NASA engineers have figured out clever ways to deploy better code on remote devices, which is critical when you can’t just send a team to update a circuit board. Today, developers are learning how to use older chips, or chips at the edge of the network, to do more with less.

A method called Compression-Compilation Co-Design for Performance, Intelligence, and Efficiency, or CoCoPIE, has reliably produced performance gains that allow commodity hardware to outperform AI-specific chips like Google’s TPUs and NVIDIA’s Xavier series. These software-optimized consumer chips also showed consistent gains in energy efficiency, and a tailored “adaptation” approach has outperformed other frameworks like TensorFlow Lite and TVM.
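To make the “compression” half of that idea concrete, here is a minimal, illustrative sketch of two common model-compression techniques, magnitude pruning and 8-bit quantization, applied to a weight matrix. This is a generic NumPy illustration of the general approach, not CoCoPIE’s actual pipeline; the function names and the 80% sparsity target are assumptions chosen for the example.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude entries so that roughly
    `sparsity` fraction of the weights become zero (unstructured pruning)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights):
    """Symmetric linear quantization of float weights to int8.
    Returns the quantized tensor and the scale used to dequantize."""
    max_abs = float(np.abs(weights).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# A stand-in for a layer's weights in some trained model.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned = prune_by_magnitude(w, sparsity=0.8)   # 80% of entries zeroed
q, scale = quantize_int8(w_pruned)               # stored as 1 byte per weight
w_restored = q.astype(np.float32) * scale        # dequantized for inference

print("sparsity:", float(np.mean(w_pruned == 0)))
print("max dequantization error:", float(np.abs(w_restored - w_pruned).max()))
```

The payoff is that a pruned, quantized layer needs a fraction of the memory bandwidth and arithmetic of the original, which is exactly what lets older or smaller chips keep up; the compiler side of a co-design approach then generates code that actually exploits the zeros and the narrow integer type on the target hardware.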

It’s likely that manufacturers won’t catch up with demand for another year or more. Until then, engineers and developers are already finding smart ways to fit more capable code into less powerful chipsets and squeeze every drop of performance out of what already exists. This alleviates the chip shortage and delivers real-world experience with AI applications that developers have been tinkering with for years. The best part is that it requires nothing extra from the user; it just works.