

Parallelization is here to challenge us

Processors have evolved to include several cores in order to overcome performance limits, and multi-core processors have become a necessity in many fields. Future processors are likely to have even more cores than today's, so industry needs a software development strategy for the multi-core processors in its products.
OSCARTech® parallelization technology provides you with unprecedented features. OSCARTech® Compiler analyzes the multi-grain parallelism in your software and assigns program fragments to cores optimally. Furthermore, OSCARTech® Compiler reduces the power consumption of your software by inserting DVFS (Dynamic Voltage and Frequency Scaling), Clock Gating and/or Power Gating directives in the parallelized program.
This compiler technology is unique in that it compiles sequential C code into parallel C code for multi-core processors, with a set of OSCARTech® directives inserted AUTOMATICALLY. The technology has outstanding credentials both in the academic community, such as the Multicore Special Technical Committee of the IEEE Computer Society, and in hands-on experience with industrial products ranging from smartphones and automobiles to large computing servers.






Sphere of OSCARTech® Parallel Programming



(1) Multi-Grain Parallel Processing
Not only does the OSCARTech® compiler extract parallelism in loops, but it also extracts latent parallelism hierarchically, across multiple grains, throughout the whole program.
(2) Power Reduction Optimization
The OSCARTech® Compiler automatically analyzes the result of scheduling tasks onto cores and inserts DVFS (Dynamic Voltage and Frequency Scaling), Clock Gating and/or Power Gating control directives in the parallelized program.
(3) Memory Data Localization Optimization
The OSCARTech® Compiler implements data localization in caches and local memories by taking access timing and memory size into account.
(4) Optimized Synchronization among Cores
A unique technique that automatically maintains synchronization within each cluster of cores in the program is used to minimize synchronization time.
(5) Re-targetable Parallelization Platform
The OSCARTech® compiler emits OSCARTech® API directives in the parallelized program, which keeps the program independent of the target CPU and/or accelerator on the SoC. The parallelized code thus becomes a portable software asset.
(6) Static and Dynamic Profiling
The OSCARTech® compiler offers not only static profiling but also dynamic profiling at runtime, and feeds the results back to the compiler for precise optimization of both performance and power consumption.
(7) Visualization of Parallelism
Data and control dependencies in the program are analyzed and the result is shown as a graph. This graph also makes it easier to see how the sequential source program corresponds to the program parallelized by the OSCARTech® compiler. The CPU utilization of each core to which the OSCARTech® compiler assigns a segment of the parallelized program is shown as a timeline.
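
To make item (3) more concrete, here is a minimal sketch in C of the general idea behind data localization: restructuring two loops that share an array so that each piece of data is reused while it is still in a cache or local memory. This is a generic, hand-written illustration and not the output of the OSCARTech® compiler; the function names and the CHUNK size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk size: the number of elements assumed to fit in a
   core's cache or local memory. A real value depends on the hardware. */
#define CHUNK 256

/* Two loops that each sweep the whole array. For a large array, the start
   of the array may have left the cache before the second loop reads it. */
void scale_then_offset(double *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] *= 2.0;
    for (size_t i = 0; i < n; i++)
        a[i] += 1.0;
}

/* Same computation, restructured so that both loops finish one chunk
   before moving on, keeping the chunk in fast memory between them. */
void scale_then_offset_localized(double *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t base = 0; base < n; base += CHUNK) {
        size_t end = (n - base < CHUNK) ? n : base + CHUNK;
        for (size_t i = base; i < end; i++)
            a[i] *= 2.0;
        for (size_t i = base; i < end; i++)
            a[i] += 1.0;
    }
}
```

Both functions compute the same values; only the order in which memory is touched differs. Choosing the chunk size from the memory size of the target is the kind of decision the text attributes to the compiler.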

What on earth is Multi-grain Parallelization?

The OSCARTech® compiler first analyzes the control flow, data dependencies, program structure and data structures of the program, and partitions the whole program into what we call "macro tasks". The compiler defines three kinds of macro task: Basic Block, Function and Loop. The sequential program is thus decomposed into a number of Basic Blocks, Functions, and Loops of different sizes. Hence the term multi-grain.
The subsequent step is to parallelize the macro tasks, taking dependencies and other constraints into account.
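
The idea of macro tasks and their dependencies can be sketched in a few lines of C. The code below is an illustrative model only, not the compiler's internal representation: the struct layout, the names, and the cost values are all assumptions. It computes the earliest time each macro task could start if enough cores were available, which shows which tasks may overlap.

```c
#include <assert.h>

#define MAX_PREDS 8

/* The three kinds of macro task named in the text. */
enum mt_kind { MT_BASIC_BLOCK, MT_FUNCTION, MT_LOOP };

/* Hypothetical description of one macro task. */
struct macro_task {
    enum mt_kind kind;
    int cost;                 /* estimated execution time, arbitrary units */
    int n_preds;              /* number of tasks this one depends on */
    int preds[MAX_PREDS];     /* indices of those tasks */
};

/* Computes the earliest start time of every task, assuming unlimited cores.
   Tasks must be listed so that each predecessor has a smaller index.
   Returns the length of the critical path (the parallel execution time). */
int earliest_start(const struct macro_task *t, int n, int *start)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        start[i] = 0;
        for (int p = 0; p < t[i].n_preds; p++) {
            int j = t[i].preds[p];
            assert(j >= 0 && j < i);          /* enforce the ordering rule */
            int finish = start[j] + t[j].cost;
            if (finish > start[i])
                start[i] = finish;            /* wait for the slowest input */
        }
        if (start[i] + t[i].cost > total)
            total = start[i] + t[i].cost;
    }
    return total;
}
```

As a usage example, take four tasks: a basic block (cost 1), then a loop (cost 5) and a function (cost 4) that both depend only on the basic block, then a final basic block (cost 1) that depends on both. The loop and the function get the same start time, so they can run on different cores, and the critical path is 7 units against 11 units when run sequentially.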

What most sets the OSCARTech® compiler apart from other commercial compilers, such as those from Intel, ARM, or IBM, is that only our compiler is able to detect and extract parallelism between macro tasks, e.g., between one function and another or between one loop and another; all possible interdependencies are scrutinized. This is how the OSCARTech® compiler gains an edge over its competitors.



The more cores, the less power.

Although the OSCARTech® compiler distributes macro tasks to the cores based on the execution time of each macro task, not all cores are active all the time, so certain cores sit idle at certain times. The OSCARTech® compiler capitalizes on this idle time by using OSCARTech® API directives to invoke the Clock Gating, DVFS, or Power Gating mechanisms available on the chip, stopping the clock or varying the frequency and voltage of targeted cores on the fly. Remember that a stopped clock reduces charge/discharge power, and that dynamic power is proportional to the square of the voltage times the frequency, so lowering the voltage and frequency contributes to a dramatic power reduction for the multi-core CPU chip. The more cores there are on the chip, the more opportunities there are to apply these mechanisms, and the less power is consumed. Hence, The More Cores, the Less Power.
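
The relationship between voltage, frequency and power described above can be worked through with a small C sketch. It uses a simplified model that counts dynamic power only (voltage squared times frequency, with capacitance normalized to 1) and ignores leakage and switching overhead. The function names and the numbers in the usage note are assumptions chosen for illustration, not measurements.

```c
#include <assert.h>

/* Relative dynamic power: proportional to V^2 * f.
   Voltage and frequency are relative to their nominal values (1.0). */
double dynamic_power(double v, double f)
{
    return v * v * f;
}

/* Energy needed to execute a task of `cycles` cycles at voltage v and
   frequency f: power multiplied by the time the task takes. */
double task_energy(double cycles, double v, double f)
{
    assert(f > 0.0);
    double time = cycles / f;
    return dynamic_power(v, f) * time;
}

/* Lowest relative frequency at which a task of `cycles` cycles still
   finishes within a time slot of length `slot`. */
double slack_frequency(double cycles, double slot)
{
    assert(slot > 0.0);
    return cycles / slot;
}
```

Suppose a core has a task of 100 cycles but a slot of 200 time units before its result is needed. Running at full speed costs task_energy(100, 1.0, 1.0), which is 100 units, and leaves the core idle for half the slot. slack_frequency(100, 200) gives 0.5, and if the chip allowed a relative voltage of 0.7 at that frequency (a hypothetical figure), task_energy(100, 0.7, 0.5) comes to about 49 units for the same work finished just in time.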