Virtual cores and virtual threaded chips could boost chip performance by 4 times and restore performance per watt scaling

Soft Machines, a startup with $125 million in funding that is working with Samsung and AMD, has developed the new VISC™ (Virtual Instruction Set Computing) architecture (19-page presentation).

Soft Machines demonstrated a 28nm dual-core version of its virtual core approach at the Linley Processor Conference. The 300-400MHz prototype chip ran 32-bit ARM software at performance levels that suggest the technology could provide a leap over current approaches. It looked impressive and could be the basis of a first product.

The startup aimed for 10x improvements but expects to deliver a still-respectable 4x gain on average. The good news is that its technology could be applied to a broad range of chips, from IoT and mobile SoCs to server processors.

The next big target is a quad-core version running at about 1.5 GHz.

• Extracting instruction-level parallelism (ILP) adds significant complexity
• Out-of-order (OoO) complexity increases quadratically with machine width
• VISC complexity increases linearly with the number of virtual cores
• VISC performance per watt scales linearly
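To make the quadratic-versus-linear contrast in these bullets concrete, here is a minimal Python sketch. The cost functions, the unit constants, and the per-core width of 2 are illustrative assumptions, not figures from Soft Machines; the sketch only shows how a quadratic term pulls away from a linear one as total machine width grows.

```python
def ooo_complexity(width: int) -> int:
    """Assumed model: a monolithic out-of-order core costs width squared."""
    return width ** 2


def visc_complexity(virtual_cores: int, width_per_core: int = 2) -> int:
    """Assumed model: every virtual core has the same fixed cost,
    so the total grows linearly with the number of virtual cores."""
    return virtual_cores * width_per_core ** 2


def compare(total_widths, width_per_core: int = 2):
    """Return (width, OoO cost, VISC cost) for each total machine width."""
    rows = []
    for width in total_widths:
        cores = width // width_per_core  # virtual cores needed to reach this width
        rows.append((width, ooo_complexity(width), visc_complexity(cores, width_per_core)))
    return rows


if __name__ == "__main__":
    for width, ooo, visc in compare([2, 4, 8, 16]):
        print(f"width={width:2d}  OoO={ooo:4d}  VISC={visc:4d}")
```

Under these assumed models, doubling the total width quadruples the out-of-order cost but only doubles the cost of the virtual-core arrangement.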

The Soft Machines technology involves a so-called global front end (GFE), a new processor element that breaks single-threaded software into smaller operations such as fetches and loads. It feeds those operations into a virtual pipeline, dynamically constructed from the physical resources of an underlying multicore processor based on the workload’s needs.
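The idea of carving one instruction stream into pieces that can run on several cores can be illustrated with a toy Python sketch. This is not Soft Machines' algorithm, which is not described here in detail; it is a simple dependency-based partitioner written under stated assumptions. The names `split_into_virtual_threads` and `map_to_cores` are hypothetical, each instruction is modeled as a `(destination, sources)` pair, and an instruction simply follows the chain that produced its first known source.

```python
from collections import defaultdict


def split_into_virtual_threads(instructions):
    """Group instructions into dependency chains (toy 'virtual threads').

    Each instruction is (dest, sources). An instruction joins the chain that
    produced its first already-seen source; otherwise it starts a new chain.
    """
    producer_thread = {}          # register name -> id of the chain that wrote it
    threads = defaultdict(list)   # chain id -> destinations in program order
    next_id = 0
    for dest, sources in instructions:
        thread_id = next(
            (producer_thread[s] for s in sources if s in producer_thread), None
        )
        if thread_id is None:
            thread_id = next_id
            next_id += 1
        threads[thread_id].append(dest)
        producer_thread[dest] = thread_id
    return dict(threads)


def map_to_cores(threads, physical_cores):
    """Place chains on physical cores round-robin by chain id (an assumption)."""
    placement = defaultdict(list)
    for thread_id in sorted(threads):
        placement[thread_id % physical_cores].append(thread_id)
    return dict(placement)


if __name__ == "__main__":
    program = [("a", []), ("b", []), ("c", ["a"]), ("d", ["b"]), ("e", ["c", "d"])]
    chains = split_into_virtual_threads(program)
    print(chains)
    print(map_to_cores(chains, physical_cores=2))
```

A real design would also have to handle values that cross between chains, such as `e` reading `d` in the example, which this sketch ignores.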

The GFE unit adds three new stages to the front end of a traditional processor pipeline, which the company considers a wise place to spend the latency budget.

According to the company, the operating system and higher-level software do not need to know how the code is being dissected into virtual threads. An intermediate software layer below the operating system and hypervisor turns software into the company’s own so-called VISC instructions. Thus, Soft Machines could apply its technology to any processor: ARM, MIPS, Power, or x86.

Commercial units could be available starting in 2015.

The VISC architecture is based on the concept of “virtual cores” and “virtual hardware threads.” This new approach enables dynamic allocation and sharing of resources across cores. Microprocessors based on CISC and RISC architectures make use of “physical cores” and “software threads,” an approach that has been technologically and economically hamstrung by transistor utilization, frequency and power-scaling limitations. The VISC architecture achieves 3-4 times more instructions per cycle (IPC), resulting in 2-4 times higher performance per watt on single- and multi-threaded applications. Moreover, VISC uses a lightweight “virtual software layer” that makes the VISC architecture applicable to existing as well as new software ecosystems.
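The link between an IPC gain and a performance-per-watt gain can be sketched with simple arithmetic. The Python snippet below assumes performance is modeled as IPC times clock frequency, so the performance-per-watt gain is the IPC gain times the frequency ratio, divided by the power ratio. The function name and the example inputs are hypothetical and are not measurements from Soft Machines.

```python
def perf_per_watt_gain(ipc_gain: float, frequency_ratio: float, power_ratio: float) -> float:
    """Relative performance per watt, with performance modeled as IPC x frequency.

    All three arguments are ratios of the new design to the baseline.
    """
    if power_ratio <= 0:
        raise ValueError("power_ratio must be positive")
    return ipc_gain * frequency_ratio / power_ratio


# Hypothetical inputs: 3x IPC at the same clock, drawing 1.5x the power.
example_gain = perf_per_watt_gain(ipc_gain=3.0, frequency_ratio=1.0, power_ratio=1.5)
print(f"performance per watt gain: {example_gain:.1f}x")
```

Under this simple model, an IPC gain translates fully into performance per watt only if power stays flat; any extra power drawn by the added hardware reduces the gain proportionally.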

“We founded Soft Machines with the mission of reviving microprocessor performance-per-watt scaling. We have done just that with the VISC architecture, marking the start of a new era of CPU designs,” said Soft Machines co-founder, president and CTO Mohammad Abdallah. “CPU scaling was declared dead when the power wall forced CISC- and RISC-based designs into multi-core implementations that require unrealistically complex multi-threading of sequential applications. The VISC architecture solves this problem ‘under the hood’ by running virtual hardware threads on virtual cores that far exceed the efficiency of software multi-threading.”

The VISC architecture scales by changing the number of virtual cores and virtual threads. This approach provides a single architecture capable of addressing the needs of applications spanning from the Internet of Things (IoT), to mobile, and to data center markets.

“Soft Machines’ VISC architecture takes a big step forward in solving the most critical problem in CPU design today: single-thread performance,” commented Linley Gwennap, principal analyst of The Linley Group. “By shifting the burden to hardware, VISC aims to deliver the benefits of multi-threading to all applications.”

Here is a 5-page report on how the company is addressing the IPC (instructions per cycle) bottleneck.

SOURCES: Soft Machines, EETimes