Advanced features

Lambda variables

Overview

Anonymous functions, also known as lambda expressions or simply lambdas, have become quite popular in modern programming languages. They help reduce clutter and express the programmer's intent more succinctly. They also help make programs more reusable: by allowing existing algorithms to be parameterized, they increase their usefulness and reduce the number of abstractions we have to deal with.
Unfortunately, they are not nearly as useful as they could be. The problem is that they are just that: anonymous functions which take some arguments. They can be assigned to variables and passed as arguments, but the knowledge about the "functional" nature of the value must still be carried over to all the places where the lambda is used. One direct and rather unfortunate consequence is that they are not easily composable. Sure, you can define a lambda which takes another lambda as an argument, but when the nesting level exceeds two or three it becomes problematic very quickly.
Indeed, suppose we have lambda1 which takes lambda2 as an argument, which, in turn, takes lambda3 as an argument. Now imagine we have changed the implementation of lambda3 so that it requires an additional argument. Both lambda2 and lambda1 now also have to be changed in order to take and pass that additional argument. This renders the whole approach to program composition based on lambdas virtually useless, as any local change must be propagated across the whole system in direct violation of the encapsulation principle. In a big system, different parts of which are developed by different people or even different organizations, we might not even have access to the source code of lambda1 and lambda2, or it might be protected by intellectual property rights, etc. The number of logistical problems and the cost required to coordinate such changes may very quickly become unacceptable.
Traditional lambdas also always impose the cost of argument passing and the function call itself. Even if we happen to know up front the value lambda3 would compute, we can't simply pass that value instead of the lambda3 reference and avoid the cost of recomputation: it would break the code, as the code expects a function, not its value.
On the other hand, we can easily extend the idea of lambda expressions to eliminate all these problems in the first place. Consider the following snippet:

$a = 5;
$b = $a + 1;

This is a very concrete and imperative piece of code. We have a clear understanding of what it means and what results it will produce. Now let's remove the first line:

$b = $a + 1;

The intent is still very clear: we want to take the value of variable $a, increment it, and store the result as variable $b. We assume that variable $a exists and has a value, but what if it does not? Then we are left with the intent only. We cannot actually compute the value of $b because in order to do so we need the value of $a. What we can do, however, is capture the intent of this computation and store it in variable $b instead of the actual computed value. In other words, we convert the expression into a lambda so we can perform the actual computation sometime later, when we actually have the value of $a.
We have just introduced the lambda variable $b. That is, lambda variables hold the intent of computations and can be computed later, when the conditions are right. Those conditions are determined by the presence or absence of the other variables participating in the computation.
It is easy to see that lambda variables are automatically and infinitely composable. Indeed, in the expression

$c = $a + $b;

we don't even care whether $a or $b are lambda variables or actual values. If they are actual values then $c will also be a value; if one or both of $a and $b are lambdas then $c will also be a lambda. So we simply continue to capture our intent to make these computations even if we can't do the actual computations at the moment.
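For example, here is a minimal sketch of deferred computation (the variable names are made up; it follows the same pattern as the naming-convention example later in this chapter):

$b = $a + 1;   # $a has no value yet, so $b becomes a lambda
$a = 4;        # now $a receives a value
$c = $b + 10;  # $b can be computed at this point: $c = 15
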
When compared with traditional lambdas in other languages, lambda variables are just expressions. They do not require any formal parameter declarations nor any special syntax. The meaning of an expression (whether it is a value or a lambda) is solely determined by the expression's environment, i.e. by whether the expression's constituents have values or not.
Recall that controlling the environment is one of BML's design cornerstones, so by manipulating variable values and visibility we can control when actual computations take place vs. when they are just captured as lambda variables for later. However, it would be even better if we could make this explicit and have fine-grained control over what is a lambda and what is a value.

Explicit lambdas and values (? and ! modifiers)

Any variable name used within an expression, or a parenthesized expression as a whole, can be modified with a '?' (question mark) or '!' (exclamation mark) suffix.
The question mark modifier requests that its variable or expression (i.e. the variable or expression immediately preceding the mark) be considered, or allowed to be, a lambda expression. This is also known as the lambda request modifier.
The exclamation mark requires its variable or expression to be a value (a.k.a. the value request modifier). That is, in the expression

$a = $b? + $c!;

variable $b allows $a to be assigned a lambda expression when $b has no value in the current frame, while $c is required to resolve to a value. This will generate a lambda variable $a which captures the intent of adding some value designated by variable $b and a constant obtained as the current value of variable $c. So if variable $b is defined in the frame, variable $a will be assigned the computed value; if not, it will be assigned a lambda expression to compute the value later (if $b is already a lambda, the engine will do all possible partial computations). However, if $c is absent, or is a lambda which cannot be evaluated into a value at this point, the engine will throw an exception and the process will fail.
When lambda modifiers are used with parentheses, the modifier simply applies to all the variables within the parentheses. That is,

$a = ($b + $c)?;

is the same as

$a = $b? + $c?;

Note however, that value modifier '!' is "stronger" than lambda modifier '?' so in case of

$a = ($b? + $c)!;

the lambda modifier on $b? will be ignored and the overall value request will be enforced instead.
Important: There must not be any whitespace between the variable name (or closing parenthesis) and the modifier.

$a = $b ? + $c; # syntax error: space is not allowed here


To summarize, when dealing with variables in expressions, the engine uses the following rules (see the sketch after this list):

  1. When a variable is used in an expression without any modifiers, the engine will try to resolve it in the current environment. When the variable is resolved to a value, that value will be used to compute the expression or some part of it. Unresolved variables, or variables with an explicit lambda request, will cause the [remaining] expression to be converted into a lambda. It is guaranteed that all variables which can be resolved, or are requested to be resolved, at this point are actually resolved and their values substituted into the expression. In other words, the engine consults the environment to determine whether the expression is a lambda or a value.
  2. When a variable with an explicit lambda request is used in an expression, an attempt is made to resolve the variable. If successful, the value is used as usual. If not, the whole expression is converted into a lambda after all other variables are evaluated as per their requests or lack thereof (see rule 1). That is, in order to compose a lambda expression, the variable must have a lambda request and have no value in the current frame.
  3. When a variable with an explicit value request is used in an expression, it must be resolved to a value, otherwise an exception is thrown. When resolved, the value is substituted into the expression and the engine checks the other variables as per the rules above. This allows you to tell the engine that at this point this variable must have a value and to fail the process if it does not.
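Here is a minimal sketch of all three rules at work, assuming $b has a value in the current frame and $c does not (the $v* names are made up):

$b = 10;
$v1 = $b + 1;    # rule 1: $b resolves, so $v1 = 11
$v2 = $c + 1;    # rule 1: $c is unresolved, so $v2 becomes a lambda
$v3 = $c? + $b;  # rule 2: lambda request on $c; $b's value is substituted, $v3 is a lambda
$v4 = $c! + 1;   # rule 3: value request on $c fails and an exception is thrown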

Tips and caveats

There are places where we can't continue to build lambdas and must have values anyway. Mostly such places are decision points like "if" statements or loop conditions, so, obviously, in

if($a? > 5): ...

the lambda request will be ignored, and in

if($a! > 5): ...

the value request is superfluous. Another example is a parameterized message: obviously we can't send a lambda to a remote system, so all variables used in message composition must have values at that point.
It is planned to extend the expression syntax to also support a more traditional "functional" style of lambda invocation (e.g. $a($b = 1, $c = 2) or similar). Until then, a simple naming convention can help achieve almost the same thing if desired:

$mylambda.val = $mylambda.arg1? + $mylambda.arg2?; # compose lambda

and then

$mylambda.arg1 = 1; # set arguments in environment
$mylambda.arg2 = 2;
$result = 3 + $mylambda.val; # use lambda in expression

Scoping vs. framing

We are already familiar with variable scoping in BML. So far we've learned that nested code blocks automatically create nested scopes by default. Thus, in comparison with other programming languages, a BML process can be compared to a function or method call: the process can "see" all the local variables in all its scopes and cannot see any local variables of other processes. (Let's forget for a moment about other variable stores and consider only the local store.) This works in simple cases, but what if we need to partition our process into several parts and isolate the parts from each other so they can be developed and/or reused independently? In other words, we want to be able to invoke a block of code without exposing our local variables, or better yet, have fine-grained control over what's exposed and what's not.
This is called framing. That is, the block of code executes in a nested frame, and the frame boundary prevents the nested code from accessing any variable defined outside the frame. Inside each nested frame we continue to create scopes as usual.
Most programming languages generate frames automatically when you invoke a method or a function, so the code of the function body runs in its own frame. The only way to request a frame is to define and call a function, and there is absolutely no way to penetrate frame boundaries even when we want or need to. Thus you either pass your variables as parameters or "go around" frames and pass your data as instance or static variables.
BML does not bind code structure to the way the code is used. It provides explicit scope and frame controls so you can choose how you want to invoke a block of code: in a frame, like a function; in a scope, like a block; or even simply "inlined" into the current scope.
The inline (or empty tag) instruction simply inlines the given block as if it were defined in place of the inlined entry:

inline: $x = 5;

This is the same as simply having $x = 5; without the inline instruction.
The scope instruction creates a new scope first and then executes the code:

scope: $x = 5;


The frame instruction creates a new frame with a new scope in it and then executes the code:

frame: $x = 5;

The code running within the frame will not be able to access any variables of this store defined before the frame was created.
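For example, a minimal sketch of frame isolation (names and values are made up; per the lambda rules above, an unresolved variable inside the frame presumably yields a lambda):

$x = 5;
frame:
    $y = $x + 1; # $x is not visible here, so $y becomes a lambda capturing the intent
    $x = 7;      # defines a new, frame-local $x; the outer $x is untouched
log = "x = " + $x; # back outside the frame: prints x = 5
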
By default, the frame and scope instructions operate on the local store only. There are also store-specific versions of the scope and frame instructions for each variable store, e.g. "scopeL", "frameL" for the local store, "scopeP", "frameP" for the process data store, etc. (The names are designed to resemble generic commands with parameters, e.g. scopeL(...) looks similar to scope(L, ...) and is therefore more easily recognized in code.)
All scope and frame instructions can also have arguments to explicitly copy some of the visible variables to the newly created scope, e.g.

frameL(var $x = $x, var $y = $a): $a = $x + $y;

Here we copy the local variable $x to the new scope under the same name and copy variable $a as $y. This has the effect of having these variables defined in the new scope right before the frame boundary is created. That is, a frame is created in two steps: first, a new scope is created and all variable definitions given as command parameters are executed; second, the frame boundary is established between the new scope and all parent scopes. Thus the only variables visible in the new frame are the ones defined as parameters to the frame command.
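Continuing this example, a sketch of the effect (assuming $x = 2 and $a = 3 before the frame; presumably the frame-local $a disappears with the frame, leaving the outer $a intact):

$x = 2;
$a = 3;
frameL(var $x = $x, var $y = $a):
    $a = $x + $y;  # creates a frame-local $a = 5; the outer $a is not visible here
log = "a = " + $a; # outside the frame the original $a = 3 is intact
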
The generic versions of the frame and scope instructions can also have a [default] type parameter so that multiple frames across different variable stores can be created in one operation:

frame(P|L): ...

Here new frames in both the local and process stores are created.
The parameter is a simple integer bit-mask where each bit corresponds to a scope/frame type, e.g. "L" corresponds to the local store, "P" to the process store, etc. The actual bit positions are irrelevant and implementation dependent, but it is possible to compute the mask with the bitwise OR operation. The following global constants are defined:

Note that there is no flag defined for the system store. It is impossible to create a system frame. A system scope can be created with the "scope$" command.

Code parameterization

Just like a variable path can be parameterized with other variables, the code itself can be parameterized. This is where active assignments come in handy:

$my_code: # store a piece of code in a variable
    $a = $b + $c;
...
inline = $my_code; # execute the variable's value in the current scope

This way parts of the code can be manipulated with, passed around as arguments etc. Since active assignments are universal, any BML construct can be parameterized this way, e.g.

$cb1: # store code block 1
    log = "in part 1: i= " + $i;
$cb2: # store code block 2
    log = "in part 2: i= " + $i;
...
for($i = 0; $i < 5; $i = $i + 1) = $cb1 + $cb2; # execute in loop


Recall that "if" statement has single-entry and multi-entry forms so you can choose which form is more convenient if you want to parameterize it.
Code parameterization is another way to capture the intent of the program rather than exact prescribed sequence of operations only this time it allows to do that not just with a single expression, but with whole [reusable] blocks of code.
Performance tip: The last "for" loop example can also be written in more "static" fashion:

for($i = 0; $i < 5; $i = $i + 1):
    inline = $cb1;
    inline = $cb2;

From a performance point of view the latter is preferable, as the engine does not have to combine both blocks into a single structure (i.e. evaluate the $cb1 + $cb2 expression) before handing it to the "for" statement for execution.
Code parameterization brings a great deal of flexibility and dynamism to your program and lets you write elegant code, but, like everything else in life, it may come with some costs. Obviously, the cost of combining two structures is proportional to their sizes, so if they are large you might prefer the latter pattern to avoid this cost. It is not prohibitively expensive by any means (unless the structures are megabytes in size or so), but it is not zero nonetheless.
As a general rule of thumb, statically defined code obviously executes faster than dynamically generated code, so you might prefer to use static code as much as possible.

Advanced template processing

The concept of template processing was introduced earlier. This chapter describes in detail what you can actually do with TP and how it may help to avoid some of the costs of purely dynamic code composition described in the previous chapter.
As the name suggests, the template processor takes a data structure called a "template" as input, combines or processes it with some data, and yields another result data structure. That is, it always operates on structures, even if a structure consists of a single entry. Another way of thinking about TP is that it's another programming language which generates structures as its output.
Important: Only variables from the template store ($T) and system store ($$) are resolved and substituted during template processing in guarded entries. However, any variables can be used in TP instructions.

TP instructions

First, we need a way to tell the template processor what we want it to do. Just like we have instructions for regular code execution, we have TP instructions for template processing. TP instructions "mirror" regular BML instructions and are designated by the '@' (at sign) prefix. (For keywords the prefix is part of the instruction name, so no space is allowed between '@' and the name):

@$T.t1 = 5;
@while($T.t1 >= 0):
    @log = "@log: $T.t1 = " + $T.t1; # log during TP
    @$T.t1 = $T.t1 - 1;

Here we have a TP "while" loop which produces no TP output, as it contains only TP instructions and no templates. You can see the similarities with regular code. You can turn virtually any "normal" BML instruction into a TP instruction by placing the '@' prefix in front of it. TP instructions perform the same operations as regular ones, except they do so during the TP phase. Now let's add some templates:

@$T.t1 = 3;
@while($T.t1 >= 0):
    @if($T.t1 > 1):
        log = "pi1: $T.t1 = " + $T.t1;
    @else:
        log = "pi2: $T.t1 = " + $T.t1;
    @$T.t1 = $T.t1 - 1;

This would generate the following sequence:

log = "pi1: $T.t1 = " + $T.t1;log = "pi1: $T.t1 = " + $T.t1;log = "pi2: $T.t1 = " + $T.t1;log = "pi2: $T.t1 = " + $T.t1;

So when we execute generated code sometime later, we'd get the following output:

pi1: $T.t1 = -1
pi1: $T.t1 = -1
pi2: $T.t1 = -1
pi2: $T.t1 = -1

We get the same -1 value of the $T.t1 variable because by the time we execute the code, the TP loop has long since completed and left $T.t1 with the value -1 (unless we set it to another value somewhere between TP and execution). You can also see that our "@if" TP instruction within the loop works as expected: the first two log statements are generated as log = "pi1: $T.t1 = " + $T.t1; and the last two as log = "pi2: $T.t1 = " + $T.t1;. So we know "@if" is working, but overall this is not very interesting. We need to do better.
One of the challenges of template processing is that the same template can be processed multiple times throughout overall program execution, or one template can generate another template (and so on, perhaps in multiple stages) before yielding the result at some point. We need a way to distinguish between entries we want to process now and entries we want to leave alone in the current TP pass. For that we need TP guards.

Guarded TP entries

You can think of TP guards as a way to paint different lines of your program in different colors and then, on each TP pass, execute only the lines of the selected color. This is exactly what guarding means: some instructions are allowed to be executed while others are just left alone. Guarded entries have the following syntax:

@<guard>@<entry>

Here <guard> is the name of a guard variable which must evaluate to true or false during the TP pass. For example:

@$g@log = "tp1: $T.t1 = " + $T.t1;

Here $g is the guard variable and log = "tp1: $T.t1 = " + $T.t1; is the guarded entry.
A guarded entry is processed by TP if and only if its [leftmost] guard variable evaluates to true. By processing we mean not merely copying the entry to the result structure, as we saw in the previous example, but also substituting all variables from the template and system stores. The [leftmost] guard is removed from the result entry and the entry is recompiled by the BML compiler.
Before we move further let's introduce some rules and shortcuts:

  • You can invert the guard variable's meaning by using the logical negation operator '!' (e.g. @!x@log = ...). Logical negation is the only operator allowed in a guard expression. If you want to compute a complex condition, you have to store the result in a variable and use that variable as a guard. This is partly to improve performance, but mostly to avoid possible side effects of complex guard expressions when the same guard [expression] is used in more than one entry (e.g. a multi-entry "if"). Obviously, you can't negate an empty guard as it would mean "never process", which does not make any sense: @!@log = ... # syntax error.
  • Guard variable can be empty. That is, you can simply write @@log = .... In this case TP assumes it to be true (except when running in strict mode, explained later)
  • Guard variable can be undefined. In this case TP resolves it as false, unless inverted
  • You can have more than one guard (e.g. @g1@@g2@log = ...). TP will "peel away" one guard at a time as it processes the entry (see the sketch after this list)
  • TP instructions can also be guarded: @g@@log = .... You can easily distinguish between a guarded entry and a guarded TP instruction by counting the '@' symbols in front of the entry:
    • Guarded entries have an even number of '@' characters (e.g. two, four, six, etc.). Enabled guarded entries are processed by substituting variables from the template and system stores only (i.e. only $T... and $$...). Processed entries are re-compiled and copied into the result structure.
    • Guarded TP instructions have an odd number of '@' (e.g. one, three, five, etc.). Enabled TP instructions are never copied into the result (i.e. they are consumed by the TP pass). Instead they are executed like regular BML instructions and therefore have access to any variable (not only to $T)
  • Any content of a TP instruction which is not itself another TP instruction is considered template data and is either copied as is when not guarded or not enabled (i.e. static data), or evaluated as an enabled guarded entry (see above)
  • Guard variable names without a sigil are assumed to be from the template data store, so @x@log = ... is the same as @$T.x@log = ...
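To illustrate guard peeling, consider this hypothetical double-guarded entry (the guard names g1, g2 and variable $T.x are made up; assume $T.g1 = true and $T.x = 5 during the first pass):

@g1@@g2@log = "x = " + $T.x; # four '@': a guarded entry with leftmost guard g1

The first pass peels @g1@ away, substitutes $T.x, recompiles (folding the constants) and yields another guarded entry:

@g2@log = "x = 5"; # two '@': still a guarded entry, now guarded by g2

A second pass with $T.g2 = true peels the remaining guard and produces the final instruction log = "x = 5";.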


Now let's redo our example with guarded entries in templates:

@$T.t1 = 3;
@while($T.t1 >= 0):
    @if($T.t1 > 1):
        @@log = "pi1: $T.t1 = " + $T.t1;
    @else:
        @@log = "pi2: $T.t1 = " + $T.t1;
    @$T.t1 = $T.t1 - 1;

This would yield a different result:

log = "pi1: $T.t1 = 3";log = "pi1: $T.t1 = 2";log = "pi2: $T.t1 = 1";log = "pi2: $T.t1 = 0";

As you can see, the $T.t1 variable has been substituted into each template entry, the log expression compiled and the constants folded. Thus the log message computations are no longer required at run time, as all log expressions are just constants. Now we're getting somewhere!
If you intend to execute a piece of dynamically composed code more than once, you might consider generating that piece ahead of time using TP instead of composing the same piece multiple times at run time (i.e. through code parameterization). This can save significant resources over multiple runs.
To summarize:

  • Any entry ever touched by TP in any way must start with the '@' character
  • Entries with an odd number of '@' are TP instructions which do the actual TP job when enabled by an optional guard. They are consumed by TP and never show up in TP results
  • Entries with an even number of '@' are TP entries (i.e. templates) which get processed by TP instructions when enabled by the guard. They get TP variables substituted, the guard consumed, and the result copied into the result structure
  • All other entries are simply copied to the result as is
  • No-sigil guard variables are assumed to be from the $T store
  • You can invert a guard with the negation operator '!'
  • Complex logical expressions are not allowed as guards

Default vs. strict template processing

TP can exhibit a different attitude towards empty or undefined guard resolution, depending on the mode:

  • In default mode, the behavior is as explained above: empty guards are accepted as true and undefined guards as false, unless inverted by the '!' operator. In other words, in default mode TP is eager and tries to process as many entries as it possibly can
  • In strict mode, only guards explicitly resolved to true are processed. In this mode TP is reluctant and tries to preserve as many entries as it can, as is

Strict mode can be enabled by setting the $T.$Strict variable to true (note that this is a system variable, so its name starts with a dollar sign; in this case the dollar sign is not a sigil, but part of the name itself). When [re]set during processing, it takes effect in the next nested structure. That is, you cannot change strictness in the middle of the current level of processing. The strictness of the current level is determined by the value the $T.$Strict variable had right before entering this level, and any changes made at this level can only affect nested levels. This is both for performance and for consistency.
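A hypothetical sketch of the difference (assuming the nested "if" block counts as a nested structure for strictness purposes; the guard name g is made up):

@$T.g = true;       # this guard will resolve explicitly to true
@$T.$Strict = true; # strict mode takes effect in the next nested structure
if(true):
    @@log = "default only"; # empty guard: no longer assumed true, left as is
    @g@log = "always";      # guard explicitly true: processed in both modes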

TP inline instruction

Just like the inline instruction allows executing a block of code in the current scope, the TP inline instruction includes its argument in the result. In this case the argument can also be a filename, which is resolved in the local file system. For example:

@$T.file: = "sff.auto.config" + ".cdm" # compose file name
@@$cfg: # generate passive assignment
    @: $T.file # of included file content

After TP, the RHS of the $cfg passive assignment instruction will contain the content of the local file sff.auto.config.cdm, if such a file exists in the current or any parent directory. Obviously, when an absolute file path is given, only the given location is checked.
Recall that an active assignment of a structural value means that the structure will be deeply TP-ed. This is usually the intention, but if you want to avoid such processing you can use the pattern above, as it does not perform any TP on the included file.

Behavior composition

BML takes some of its roots from, and builds upon, behavior tree concepts. Although never intended for building large, distributed and highly concurrent systems, behavior trees provide a solid conceptual basis for the composition of complex tasks from simple ones. The concept plays very well with data flow-based communications as well as three-way notifications.

Sequence

This is the standard way of executing BML code. Every block executes its instructions in sequence, in order of appearance. From an external perspective, the process which runs the code appears to be in the running state until it either succeeds, is discarded, or fails.
In other words, a sequence as a whole succeeds when all its children succeed. Discard or failure notifications simply bubble up. Synonyms are "inline", "all" and "sequence".
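For example, a minimal sketch using the "all" synonym:

all:
    log(level: info, message: step 1); # children execute in order of appearance
    log(level: info, message: step 2); # the sequence succeeds once both succeed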

FirstSuccess (aka selector, fallback)

Please don't confuse the selector instruction with the selector expression syntax.
The "FirstSuccess", or simply "first" or "any" (a.k.a. selector or fallback), instruction executes its children one by one until success. The only difference from the original behavior tree is that instead of a single fail notification we consider both failure and discard notifications. A failure is considered a "hard failure" which usually constitutes an error, while a discard (or cancel) notification signals the selector to try the next child. For example:

any:
    if(true):
        log(level: info, message: step 1 before discard);
        discard: ds1
        log(level: info, message: step 1 after discard);
    if(true; onFailure: discard): # convert failure to discard
        log(level: info, message: step 2 before failure);
        failure: fs2
        log(level: info, message: step 2 after failure);
    if(true):
        log(level: info, message: step 3 success);
    if(true):
        log(level: info, message: step 4 before discard);
        discard: ds4
        log(level: info, message: step 4 after discard);

Here the selector executes each "if" one by one until success (the third "if"), so the fourth "if" will never be executed.
In other words, the selector as a whole succeeds when one of its children succeeds (i.e. the first successful child in order of appearance). The instruction raises a discard notification when none of its children are successful and raises a failure notification on any child's failure.

Concurrent processing

Behavior composition primitives can be trivially extended with asynchronous concurrent capabilities.
Most concurrent instructions begin with the "WaitFor" prefix.

WaitForAll

This is the concurrent equivalent of a sequence node. All children are executed concurrently and the whole construct succeeds when all children succeed. Any discard or failure causes all other children to be discarded and the notification to be propagated up.

WaitForFirst or WaitForAny

This is the concurrent equivalent of a fallback node. All children are executed concurrently until the first success. After that, all other children are discarded. Child discards which happen before the first success are ignored, except for the very last one, which causes a discard of the whole construct. Any child failure results in the failure propagating up and the discarding of all remaining children.
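A minimal sketch (assuming the instruction is spelled waitForAny in code, by analogy with "any", and that $a and $b already hold values):

waitForAny:
    if($a > 0): # both children start concurrently
        log(level: info, message: branch 1 wins);
    if($b > 0): # whichever child succeeds first discards the other
        log(level: info, message: branch 2 wins);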