User Tools

Site Tools


polca_manual

In this page we will explain how to face different C code structures and patterns and how to annotate them using the POLCA programming model following a learn by example approach. For reference a detailed and updated version of the POLCA syntax and semantics can be found here:

POLCA Language Definition

Manual

Blocks (Scopes)

The POLCA directives give information to the POLCA tools about a particular lines of C code. The directives are applied to a specific POLCA block, which corresponds to the next C line of code, or scope between curly brackets, that is {}.

Probably the simplest POLCA directive is polca def, which is used to give a unique name to the POLCA block to which it is applied.

Single Line Blocks

The next two code snippets show how to apply POLCA directives to a single C line of code. In this first example, the line of code is wrapped between curly brackets and the directive is places just in front of it.

#pragma polca def ADD
{
  x = x + y;
}

Alternatively, for a single line block, the curly brackets can be removed and the directives are applied to the next code line. This code snippet is equivalent to the previous one:

#pragma polca def ADD
  x = x + y;

Multi-line Blocks

In case of multiple line blocks, these have to be always braced into curly brackets, otherwise POLCA tools cannot determine where the block ends.

The next code snippet show a multi line POLCA block.

#pragma polca def INIT
{
  x = 0;
  y = N;
}

Embedded Blocks

It is also possible to have a POLCA block embedded into another. The directives that are applied to the outer block also provide information to the inner blocks. The inner blocks can be referenced, by their def name by the outer ones.

This code shows a POLCA block, OUTER, with two embedded blocks, INNER1, a multi line block and INNER2, a single line block:

#pragma polca def OUTER
{
  x = 0;
  y = N;
 
#pragma polca def INNER1
  {
    z = x + y;
    z = z*2;
  }
 
#pragma polca def INNER2
  x = z*y;
}

Block Input and Output variables

The directives input, output and inout are used to give explicit information about the input and output variables for annotated POLCA block. The next code is a single line block that takes the variable x as input (its value is read, but not over written) and y as output (the value of the variable is written, but its initial status is not read):

#pragma polca input x
#pragma polca output y
  y = x;

In case the initial value of a variable's initial value is read and then overwritten (y is increased by x's value) the variable should be declared as input and output,

#pragma polca input x y
#pragma polca output y
  y = y + x;

or, as in the following code, equivalent to the previous one, as inout:

#pragma polca input x
#pragma polca inout y
  y = y + x;

Exactly the same directives can be used for multiple line blocks, just needing to place the directives at the beginning of it (and not where the variables are used inside the block):

#pragma polca input a b c d
#pragma polca output z
{
  z = a;
  z = z + b;
  z = z + c;
  z = z + d;
}

Note that in the previous code, the variable z is only output because its initial value is overwritten before it is read again. That is, as we do not use its value previous value to entering the POLCA block, then it is not input.

Al the data stored in C arrays can also be indicated as input, output or inout for POLCA blocks, by giving the array variable name. This will set all elements in the array with the declared in/out information, even if they are not later used in the POLCA block.

#define N 5
 
#pragma polca varinfo sizeof(int) N
int myArray[N] = {1, 2, 3, 4, 5};
int z = 0;
 
#pragma polca input myArray
#pragma polca inout z
{
  for(int i = 0; i<N; i++)
  {
    z = z + myArray[i];
  }
}

In this case the varinfo directive is needed for POLCA tools to be aware of size of the data elements and number of elements of the array when the array is declared, but this will be explained in detail later.

In case that we are aware that only a portion of the array is going to be accessed in a particular POLCA block we can specify the range of elements. For comparison, the next code is equivalent to the previous one, all the elements of myArray are accessed, but it is explicitly declared.

#define N 5
 
#pragma polca varinfo sizeof(int) N
int myArray[N] = {1, 2, 3, 4, 5};
int z = 0;
 
#pragma polca input myArray[0, N-1]
#pragma polca inout z
{
  for(int i = 0; i<N; i++)
  {
    z = z + myArray[i];
  }
}

In the next code only the three first elements of the array are accessed, and it is indicated by setting the arrange of indexes from 0 to 2 (3 elements), with the notation myArray[0, 2]

#define N 5
 
#pragma polca varinfo sizeof(int) N
int myArray[N] = {1, 2, 3, 4, 5};
int z = 0;
 
#pragma polca input myArray[0, 2]
#pragma polca inout z
{
  for(int i = 0; i<3; i++)
  {
    z = z + myArray[i];
  }
}

The POLCA array indexes directives can also be parametrized. For example in the next example all elements of myArray are read except the first and the last one:

#define N 5
 
#pragma polca varinfo sizeof(int) N
int myArray[N] = {1, 2, 3, 4, 5};
int z = 0;
 
#pragma polca input myArray[1, N-2]
#pragma polca inout z
{
  for(int i = 1; i<N-1; i++)
  {
    z = z + myArray[i];
  }
}

In a similar way access to individual elements can also be explicitly indicated, and they can be parametrized to variables of the C code, as the next code shows:

#define N 5
 
#pragma polca varinfo sizeof(int) N
int myArray[N] = {1, 2, 3, 4, 5};
int z = 2;
 
for(int i = 0; i<N; i++)
{
#pragma polca input myArray[i]
#pragma polca inout z
  z = z * myArray[i];
}

Dealing with Global Variables

Global variables that are read and/or written in a POLCA block should also be annotated as input and/or output to allow the POLCA tools to be aware of the possible data dependencies.

Mathematic Behaviour

Having a full understanding of the mathematical behaviour of the application allows POLCA to introduce more aggressive transformations, while keeping correctness, as well as allowing adaptation to certain platforms such as FPGAs.

The directive smath has a set of basic predefined operators that can be used to express the arithmetics of a POLCA block. These are addition +, subtraction -, multiplication *, division /, exponentiation ^ and module %. The brackets () can also be used to set the order of operations.

In the next example the mathematical behaviour of the POLCA block is described, after the reserved work smath the arithmetics of the block are described followed by the name of the variable where the results is stored:

#pragma polca smath ((x-y)*(x-y)) z
#pragma polca input x y
#pragma polca output z
{
  z = (x-y)*(x-y);
}

Often the arithmetics can be expressed in multiple ways, not having to resemble with accuracy the operations that are taking place in the C code. For the previous example this other equivalent directive could have been used:

#pragma polca smath ((x-y)^2) z

POLCA also provides a set of more complex mathematical functions to be used for the smath directive to ease with the annotation of the mathematical behaviour. These predefined mathematical functions are:

  • sqrt: is the square root of a mathematical expression, that could be a constant, a variable or a more complex arithmetic expression to be evaluated
  • avg: the average value of a list of elements avg(val1, val2, val3, …, valn) or of a vector avg(myArray)
  • min: the minimum value of a list of elements min(val1, val2, val3, …, valn) or of a vector min(myArray)
  • max: the maximum value of a list of elements

Function annotation

Functions can be considered and treated as POLCA blocks but they have some peculiarities:

  • they do not need a def directive as the name can be taken from the name function itself
  • the return value is an output but does not need to be explicitly annotated as in the function definition it is not yet known where the final result is going to be stored (if it is stored)

The next example shows an annotated integer multiply-add function with the directives before the definition of the function:

#pragma polca input a b c
#pragma polca smath(a+b*c)
int multAdd (int a, int b, int c)
{
  int res = a + b * c;
  return res;
}

Another valid option is to situate the directives between the definition of the function and the opening brackets of the C statements:

int multAdd (int a, int b, int c)
#pragma polca input a b c
#pragma polca smath(a+b*c)
{
  int res = a + b * c;
  return res;
}

Hierarchical annotations

Most programming models that use in / out information for functions (or tasks) are able to create flat structures and to a certain level reason about them. But they treat these tasks as black boxes, lacking the details of the internal behaviour of the tasks, that implies that the introduced structure has only one structural level. Naturally, the dependency graph of such models can become large and complex, but the introduced information is interpreted as “task X input depends on task Y results” at runtime, without any hierarchical ordering.

In contrast, POLCA's annotation model addresses this issue, by allowing to annotate the internal behaviour of these tasks, they are no longer black boxes, and making an operator and data inter dependency graph at different level. For example, the next code takes three variables (a, b, c), adds a and b and then the result is multiplied by c, storing the final result in the variable z. The addition block, y = a + b; is given the name ADD and the multiplication block, z = y * c; is given the name MULT. The POLCA block ADDMULT includes the two previous blocks within it and the mathematical description of it uses them as operator:

#pragma polca def ADDMULT
#pragma polca input a b c
#pragma polca output z
#pragma polca smath(MULT (ADD a b) c) z
{
#pragma polca def ADD
#pragma polca input a b
#pragma polca output y
 y = a + b;
 
#pragma polca def MULT
#pragma polca input y c
#pragma polca output z
 z = y * c;
}

Not that if variable y is not read later it does not need to be annotated as output.

Higher Order Functions

One of POLCA's most important features and differentiating factor compared to other directive based programming models is the use of Higher-Order functions (HOFs). Their difference with normal functions is that HOFs take as parameter (argument) not only values but also other functions (or even other HOFs).

While the structure of an individual HOF is simple (these are described in detail in POLCA Language Definition ), it offers a crucial advantage over commonly used structures: it is independent from the underlying operators. Therefore, they can be used in a modular way, leading to emerging complexity of the structural information being expressible. In the next section we will show how to annotate C repetition structures with HOF directives.

Loops

Loop structures are common in high-level programming languages and are used to execute one or more statements up to a desired number of times. In C language there are two loop structures: for lops and while loops.

To annotate C loop structures with POLCA directives usually is necessary to use HOFs. HOFs represent, as well as loops, are repetition structures. In the next subsections we will explain the most common pattern of loop behaviour and explain which HOF directive matches them.

Often in C code the same loop is used to perform different tasks, making the annotations complex. To ease the process, in those cases, it is recommended to modify the C code to have split the “big loop” into different smaller loops, each one with a body with a clear unique task. This way the annotating process will be easier and it will the POLCA tools more successful in applying transformations.

for loop

The for-loop is often distinguished by an explicit loop counter or loop variable. This allows the body of the for-loop (the code that is being repeatedly executed) to know about the sequencing of each iteration. For-loops are also typically used when the number of iterations is known before entering the loop.

iteration

Iteration patterns are characterize for in each iteration of the loop overwrite in the output the variables that were taken as input, this implies that each iteration is data dependent on the previous ones. The directive for this pattern when the number of iterations is know is itn.

int i;
float a = 1.234f;
 
#pragma polca itn OPER a N_ITER a
for(i=0; i<N_ITER; ++i) {
#pragma polca def OPER
#pragma polca inout a
  a = someFunc(a);
}

When the number of iterations is not know and the repetition of the execution pattern depends on some boolean expression the itb directive is used. The next examples calculates uses a while loop to calculate the factorial. Instead of giving the total number of iterations the continuation of the execution of the loop depends on the expression (counter > 1).

int counter = 5;
int factorial = 1;
 
#pragma polca itb OPER zip(factorial, counter) (counter > 1) zip(factorial, counter)
while (counter > 1)
{
#pragma polca def OPER
#pragma polca inout factorial counter
   factorial *= counter--;
}

as this HOF only takes a single parameter as input and output but we need two (factorial and counter), zip is used to join them into a single parameter.

apply to all

This pattern applies the same computation to all the elements of an input array and has no dependency between each iteration, which implies that each iteration could be executed without any particular order. The directive for this pattern is map.

int i;
int a[5] = {1, 2, 3, 4, 5};
int b[5];
 
#pragma polca map OPER a b
for(i=0; i<5; ++i)
#pragma polca def OPER
#pragma polca input a[i]
#pragma polca output b[i]
{
  b[i] = someFunc(a[i]);
}

zip multiple arrays

A zip of two arrays joins pairs of elements of both arrays, applying a function that takes two input parameters to the corresponding elements (situated in the same index position), giving as result a new array. The directive for this pattern is zipWith:

int i;
int a[5] = {1, 2, 3, 4, 5};
int b[5] = {2, 3, 5, 7, 11};
int c[5];
 
#pragma polca zipWith OPER a b c
for(i=0; i<5; ++i)
#pragma polca def OPER
#pragma polca input a[i] b[i]
#pragma polca output c[i]
{
  c[i] = a[i]*b[i];
}

Both arrays should be the same length (or use ranges to fit them to equal sizes) and the array storing the result should be big enough not to overflow. This structure can be generalize to zip more than two array, e.g. use zipWith3 for zipping three arrays, note that the function should be also changed to accept three parameters.

reduction

Reduction operations recombine the results of recursively processing its constituent parts, building up a return value. The directive for this pattern is foldl.

int i;
int a[N_ELEM] = {1, 2, 3, 4, 5};
int b = 0;
 
#pragma polca foldl OPER b a b
for(i=0; i<5; ++i)
#pragma polca def OPER
#pragma polca input b a[i]
#pragma polca output b
{
  b += a[i]
}

If we want to store the result after each iteration (the output is a vector instead of a single value), then the directive for this pattern is scanl:

int i;
float a[5] = {1, 2, 3, 4, 5};
float b[5+1];
float b[0] = 0;
 
#pragma polca scanl OPER b[0] a b
for(i=0; i<5; ++i)
#pragma polca def OPER
#pragma polca input b[i] a[i]
#pragma polca output b[i+1]
{
  b[i+1] = a[i] + b[i]
}

stencil

Stencil codes are a class of kernels which update array elements according to some fixed pattern. In numerical analysis a stencil is a geometric arrangement of a nodal group relative to the point of interest used to offer a numerical solution of partial differential equations or apply filters to images, among other uses. For this reason they are often found in scientific and engineering simulation code, like for computational fluid dynamics.

This kind of code can be annotated using the stencilXD POLCA directives, where 'X' has to be replaced by 1, 2 or 3, depending on the number of dimensions the code is operating on.

int i;
float a[11] = {0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0};
float b[11];
 
#pragma polca stencil1D OPER 1 a b
for(i=1; i<11-1; ++i)
#pragma polca def OPER
#pragma polca input a[i-1] a[i] a[i+1]
#pragma polca output b[i]
{
  b[i] = someFunc(a[i-1], a[i], a[i+1]);
}

Control structures

Control structures are programming blocks that analyze variables and choose the execution direction on which to go based on the given parameters. The term flow control details the direction the program takes, they direction the program flows. In summary control structures determine how a computer will respond when given certain conditions and parameters.

The POLCA branching mechanism is based on Haskell style guards, in which the parameter a condition statement condX depending on the parameter a which is an input parameter to function f is evaluated and when True the corresponding statementX is evaluated.

f :: A -> B
f a
  | cond1(a)  = expression1
  | cond2(a)  = expression2
  | ...
  | otherwise = expressionN

The guards are evaluated in the order they appear and that otherwise is just an alias to True, and thus the last guard is a catch-all, playing the role of the final else of the if expression. In POLCA the input and output parameters for the statement of each guard has to be identical.

if-then-else

This is a simple form of the branching statements that takes an expression and a set of statements. If the expression is true (the expression will be assumed to be true if its evaluated values is non-zero) then the statements are executed, otherwise they are skipped and, if available, the else branch is executed.

The next examples shows an if-then-else structure, with multiple else-if. Before the starting of the if the guard directive has to be used, giving as parameters a list of pairs of boolean expressions that make valid each branch, followed by the statement to be executed:

#pragma polca guard(cond1(a) : s1; cond2(a) : s2; cond3(a) : s3; otherwise : s4)
if(cond1(a)) {
#pragma polca def s1
  statement1;
}
else if (cond2(a)) {
#pragma polca def s2
  statement2;
}
else if (cond3(a)) {
#pragma polca def s3
  statement3;
}
else (cond4(a)) {
#pragma polca def s3
  statement4;
}

Where the different condX functions return a boolean and their input is annotated:

#pragma polca input a
bool condX(int a)
{
   ...;
}

switch

The switch statement is much like a nested if-then-else statement. If a condition is met in switch case then execution continues on into the next case clause also if it is not explicitly specified that the execution should exit the switch statement. This is achieved by using break keyword.

As with if-then-else, this control structure can be annotated with the guard directive situated at the beginning of the switch. In this case the boolean expression is the comparison of the input value of the switch with the possible values. If many values enable the execution of the same branch, then these can be grouped by the || (or) operator:

#pragma polca guard(a == N : s1; a == M : s2; a == O || a == P || a == Q : s3; otherwise : s4)
switch(a) {
  N:
#pragma polca def s1
    statement1;
    break;
  M:
#pragma polca def s2
    statement2;
    break;
  O:
  P:
  Q:
#pragma polca def s3
    statement3;
    break;
  default:
#pragma polca def s4
    statement4;
    break;
}

The syntax details are explained in POLCA Language Definition.

Variable Information and Memory

Memory annotations allow the POLCA tools to understand the memory management of the procedural code enabling correct transformations. In simple words, it is necessary to indicate how the data is organize in memory, that is, how many elements are in an array and what is the size of each element. For dynamic memory when the memory where the data is stored are also necessary.

Static Memory

Variable Information (varinfo) directives are used to give to POLCA tools information about arrays. Inside the function func() of the next example code an array of N integers is allocated. The varinfo directive first receives the size of the elements, in this case the number of bytes of an int for the local architecture followed by the number of elements. Similarly the second array varinfo directive gets the size of each element, in the number of bytes of the struct myStruct, followed by the number of elements, in this case N+2:

struct myStruct {
  ...
};
 
void func() {
#pragma polca varinfo sizeof(int) N
  int myArray[N]; // N is a defined constant
 
#pragma polca varinfo sizeof(struct myStruct) (N+2)
  struct myStruct structArray[N+2];
 
  ... // Function code
}

In the next example the varinfo directive is not used correctly as the size of the elements and number of elements in the array do not represent the real logical organization of the data, even if the total allocated size is correct (size of element * number of elements):

void func2() {
#pragma polca varinfo (sizeof(int)*N) 1
  int myArray[N]; // N is a defined constant
  ... // Function code
}

Without the correct information the POLCA tools will not be able to correctly transform the code and the data.

Dynamic Memory

The dynamic memory allocation and static memory directives are very similar but the memory allocation ads an extra parameter that is the name of the pointer to the newly allocated memory. The next code examples show how the memAlloc directive can be used to allocate a single element or an array of multiple elements:

/* Single Element Allocation */
#pragma polca memAlloc (sizeof(type)) 1 pointerVar
type *pointerVar = (type *) malloc(sizeof(type));
 
...
 
/* Multiple Elements Allocation */
#pragma polca memAlloc (sizeof(type)) ARRAY_ELEM arrayVar
type *arrayVar = (type *) malloc(sizeof(type) * ARRAY_ELEM);
 
...
 
/* Multiple nullified Elements Allocation */
#pragma polca memCAlloc (sizeof(type)) ARRAY_ELEM arrayVar
type *arrayVar = (type *) calloc(sizeof(type), ARRAY_ELEM);

The third allocation is of an array that is cleared to null (zero) elements, using the memCAlloc directive. In case there is the need to resize the allocated memory with the realloc system function, the memReAlloc directive has to be used to update the information that POLCA has about that array as shown in the next example:

/* Reallocation of arrayVarOrg to a new vector size */
#pragma polca memReAlloc arrayVarOrg (sizeof(type))
ARRAY_ELEM arrayVar
type *arrayVar = (type *) realloc(arrayVarOrg, sizeof(type));

At the end of the usage of the memory, once it is freed the memFree directive has to be used to indicate that the pointer is no longer valid.

/* Memory de-allocationi */
#pragma polca memFree arrayVar
free((void *)arrayVar);

Capturing this information and making the POLCA tools aware of it allows to enable transformations that would not be possible without it, and in particular a partitioning were the memory address space is not common to all processing units of the system. This is because the data has to be partitioned and distributed accordingly across the processing units (a naive approach would be to copy the whole memory array to all processing units, but this would not allow to handle problems where the data-set size is much larger than the memory available for a single processing unit) and injecting the necessary communication and synchronization methods.

Memory Operations

Given that two arrays of the same number of elements of the same type are allocated as shown in the next code:

  float *myArray;
  float *myArray_tmp;
 
#pragma polca memalloc (sizeof(float)) n_elem myArray
  *myArray = malloc(sizeof(float) * n_elem);
 
#pragma polca memalloc (sizeof(float)) n_elem myArray_tmp
  *myArray_tmp = malloc(sizeof(float) * n_elem);

The data copy operations between them can be annotated for the POLCA tools to enable code transformations that take advantage of the specific memory subsystem. The next code shows how to use the memcopy directive to annotate a loop that is copying the content of the array myArray_tmp to myArray:

#pragma polca memcopy myArray_tmp (sizeof(float)) n_elem myArray
  for (int i = 0; i < n_elem; i++)
  {
    myArray[i] = myArray_tmp[i];
  }
  memcpy((void*)myArray, (void*)myArray_tmp, n_elem * sizeof(float));

The next code is performing the same operation, thus the annotations are the same, but instead it is using the system call memcpy:

#pragma polca memcopy myArray_tmp (sizeof(float)) n_elem myArray
  memcpy((void*)myArray, (void*)myArray_tmp, n_elem * sizeof(float));

A third alternative, what would use the same annotation, and is considered as a data copy, would be to swap the value of the pointers:

#pragma polca memcopy myArray_tmp (sizeof(float)) n_elem myArray
{
  float *tmp = myArray;
  myArray = myArray_tmp;
  myArray_tmp = tmp;
}

POLCA Kernels

In many of these cases, hand tuned dedicated platform-specific implementations of the problems exist that are typically provided in specific libraries. POLCA allows to specify such platform-specific functions and potentially overload them for different platforms. In other words, POLCA can make use of provided platform-dedicated kernels, which expose the same interface but are developed and optimized towards a specific platform, or processing type. In such cases, the respective kernel overrides all transformations towards the specified destination platform. This allows the developer to exploit existing optimisations where appropriate without affecting the overall usability of POLCA.

These POLCA blocks receive the name of POLCA Kernels and are annotated using the kernel directive. POLCA Kernels have the uniquie property that they behave as black boxes for the toolchain. The system knows its data interface but will not analyze the code inside it, as it will be considered an analysis end point. These code blocks are treated as a functional unit which the POLCA toolchain does not need to transform. Different versions of the same kernel can be given, for different destination platforms under the condition that they keep the same data interface and that they are functionally equivalent. A non-kernel POLCA block can also be provided to enable adaptation of the problem to architectures where no specifc POLCA kernel has been provided.

#pragma polca def myKernel
#pragma polca kernel C, AVX
#pragma polca inout val
int myKernel(float *val) {
 // C with AVX code
}
 
#pragma polca def myKernel
#pragma polca kernel MaxJ
#pragma polca inout val
{
 // MaxJ code
}
 
#pragma polca def myKernel
#pragma polca kernel CLASH
#pragma polca inout val
{
 // Clash code
}
 
#pragma polca def myKernel
#pragma polca inout val
int myKernel(float *val) {
 // Generic C code
}

Best Practices

Organize the C code

A clean and well organize C code, that does not contain specific optimizations that obfuscates the code is preferred to use with POLCA. The two main reasons is that first, it will be easier to use the POLCA annotations and second it enables a bigger set of transformations that the POLCA tools can perform.

Do not use curly brackets for single line blocks

This makes the code more compact and easier to read

POLCA block names in capital letters

Although not necessary, it is recommended for keeping a POLCA naming style.

No array ranges when using whole array

For keeping the annotations clear, compact and easier to read it is recommended to not use the array ranges when the whole array is used. On the other hand if only a subset of the array is used it is very recommended to indicate the range as this will improve the set of transformations that the POLCA tools can perform.

Use inout

When annotating the data input and output behavior of a POLCA block the use of the keyword inout is preferred over input and output when a variable has both behaviors.

polca_manual.txt · Last modified: 2016/10/31 12:15 by daniel