Algorithms provide more performance gain than hardware in many cases

There are numerous examples where algorithmic improvements have delivered greater performance gains than hardware improvements.

* A linear programming problem that would have taken 82 years to solve in 1988 could be solved in about one minute in 2003. Hardware accounted for a factor of roughly 1,000 of that speedup, while algorithmic advances accounted for a factor of roughly 43,000.
* Between 1991 and 2013, the algorithmic speedup for mixed-integer solvers was about 580,000 times, while the hardware speedup of peak supercomputers was only about 320,000 times.
* Similar results are reported for other classes of constrained optimization problems and for prime-number factorization.

Given the rate and the magnitude of algorithm evolution, many of those alternative AI chip designs may become obsolete even before their commercial releases. AI algorithms of tomorrow might demand different compute architectures, memory resources, data-transfer capabilities, etc.

DeepScale, a spin-out of UC Berkeley research, squeezes AI for advanced driver-assistance systems (ADAS) and autonomous driving onto automotive-grade chips (as opposed to GPUs). In just a couple of years, their neural network models have demonstrated a 30-times speedup over leading object-detection models through algorithmic improvements alone, while reducing the energy and memory footprint enough to run on existing hardware.

Another example of such algorithmic leapfrogging came from researchers at the Allen Institute for Artificial Intelligence. Using a novel mathematical approach employing binarization of neural networks, they showed that they could drastically increase speed while reducing power and memory requirements. This enables even the most advanced deep-learning models to be deployed on a chip as small as a $5 Raspberry Pi. The researchers recently spun out the algorithms and processing tools as XNOR.ai to deploy AI on edge devices and drive further algorithmic advances for AI.
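The binarization idea can be made concrete. In a binarized (XNOR-style) network, weights and activations are constrained to ±1, so a dot product reduces to a bitwise XNOR followed by a population count instead of floating-point multiply-adds. A minimal sketch in Python (the function names are illustrative, not from any particular library):

```python
def binarize(values):
    """Pack the signs of a real-valued vector into a bit mask: 1 for >= 0."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits

def binary_dot(x_bits, w_bits, n):
    """Dot product of two ±1 vectors packed as n-bit masks.

    XNOR marks the positions where the signs agree; popcount counts them.
    agreements - disagreements = 2 * agreements - n.
    """
    xnor = ~(x_bits ^ w_bits) & ((1 << n) - 1)
    agree = bin(xnor).count("1")
    return 2 * agree - n

x = [0.5, -1.2, 3.0, -0.1]
w = [1.0, -0.7, -2.0, 0.3]
# Signs agree at positions 0 and 1 only: 2 * 2 - 4 = 0
print(binary_dot(binarize(x), binarize(w), 4))  # 0
```

Packing 64 weights into a single machine word this way is what yields the large speed, memory, and energy savings on commodity hardware.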

39 thoughts on “Algorithms provide more performance gain than hardware in many cases”

  1. First there was bubble sort (1956), then there was quick sort (1959), then there was heap sort (1964). Heap-sort is similar to quicksort, but with better worst-case performance.

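The worst-case distinction in the comment above is easy to demonstrate: quicksort with a naive first-element pivot degrades to O(n²) comparisons on already-sorted input, while a heap-based sort stays O(n log n) in every case. A small sketch (the instrumented quicksort is written for illustration, not taken from any library):

```python
import heapq

def quicksort_comparisons(arr):
    """Count comparisons made by quicksort with a first-element pivot."""
    count = 0
    def qs(a):
        nonlocal count
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        count += len(rest)  # one comparison per remaining element
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return qs(left) + [pivot] + qs(right)
    qs(arr)
    return count

def heapsort_via_heapq(arr):
    """Heapsort using the standard-library heap: O(n log n) in all cases."""
    heap = list(arr)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

# On sorted input the naive pivot is always the minimum, so quicksort
# makes n*(n-1)/2 comparisons: quadratic, not n log n.
print(quicksort_comparisons(list(range(200))))  # 19900
print(heapsort_via_heapq([3, 1, 2]))            # [1, 2, 3]
```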
  3. Improvements in CPU speed have stagnated. Future improvements in solving difficult problems will have to be by improving the algorithms.

  4. Better algorithms can provide orders of magnitude improvements upon the second best. Think of working with exponential vs. polynomial execution times.

    What hardware is very good at is improving the cost of constant-overhead operations, like doing arithmetic, or moving data from one place to another or to/from an I/O device.

    The best algorithms couldn’t get us the many near-miraculous things hardware can do just by playing with, or giving you more of, those constant-cost operations per unit of time. But a good new algorithm can open up a whole set of applications that weren’t possible before it.

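The exponential-vs-polynomial gap described above can be shown in a few lines: the same Fibonacci number computed by naive recursion takes an exponential number of calls, while memoizing intermediate results makes it linear. A minimal sketch:

```python
from functools import lru_cache

calls = 0

def fib_naive(n):
    """Naive recursion: exponentially many calls (roughly 1.6^n)."""
    global calls
    calls += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoized: each value is computed exactly once (linear)."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

assert fib_naive(25) == fib_memo(25) == 75025
print(calls)  # 242785 naive calls vs. 26 memoized computations
```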
  5. Well, yeah, that’s not a surprise. Doesn’t GLaDOS manage to survive in Portal 2, even when she is reduced to a single motherboard and a potato battery?

  9. Hardware is just a physical representation of an algorithm. If you get algorithm gains you can make hardware to run it even faster. If you are going to assume fixed hardware tech, like an instruction-based CPU, then you’d expect algorithmic gains to be more effective.

  10. If this is true, then we should see a dramatic speedup of software performance running on any given hardware platform starting in the near future. In other words, we should see a reversal of the “bloatware” progression that has been the case for the past 25 years. I’ll believe it when I see it.

  13. So-called ‘bloatware’ is a result of using a substandard, spyware program as your operating system. Linux of today can run on computers of a decade ago and more without issue. Windows can’t run on computers of today without issue.

  15. Several alternatives are improving the computer architecture and running the algorithm in parallel on multiple processors. Note, processors aren’t getting faster, they’re getting smaller… though there is a limit to that too.

  16. The current state of the art in deep learning is in the low millions of variables. I have mighty big doubts these can be run on small embedded processors. Also, as the state of the art advances and the number of variables continues to grow, there is no way an off-the-shelf processor can keep up. Note, we’ve also been talking about the evaluation and not the determination of the networks. The evaluation is trivial compared to the determination, which requires huge server farms’ worth of GPUs.

  19. We’ve known that for decades. First there was heap-sort, then bubble sort, then quicksort – each step being much faster than the last. Now we have the advantage of both faster hardware and better algorithms.

