Lexical structure

This chapter describes the lexical structure of the Workflow language.

Programs

A program, also known as the compilation unit, consists of a single stream of Unicode characters. Compilation proceeds in three steps:

  1. Lexical analysis, which converts the stream of characters from the input to a stream of tokens.
  2. Syntactic analysis, which translates the stream of tokens into an AST (Abstract Syntax Tree).
  3. Validation of the AST, which applies the strong typing rules, validates data statements against the data schema, resolves imports and API calls, and then produces the executable code.
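
The three steps can be pictured as a small pipeline. The Python sketch below is illustrative only and is not the Workflow compiler: it handles just a `workflow Name;` header followed by import directives, and the function names (`lex`, `parse`, `validate`, `compile_unit`) as well as the duplicate-alias check are assumptions made for the example.

```python
import re

# Assumption: only identifiers, keywords and ';' occur in the input.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|;)")

def lex(text):
    """Step 1: turn the character stream into a list of tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(tokens):
    """Step 2: turn the tokens into an AST (a plain dict in this sketch)."""
    stream = iter(tokens)

    def expect(value=None):
        # With a value: require exactly that token. Without: require a name.
        token = next(stream, None)
        matches = token == value if value else token not in (None, ";")
        if not matches:
            raise SyntaxError(f"expected {value or 'identifier'}, got {token!r}")
        return token

    expect("workflow")
    ast = {"name": expect(), "imports": []}
    expect(";")
    for token in stream:
        if token != "import":
            raise SyntaxError(f"{token!r} is outside the scope of this sketch")
        target = expect()
        expect("alias")
        ast["imports"].append({"workflow": target, "alias": expect()})
        expect(";")
    return ast

def validate(ast):
    """Step 3: check the AST. The duplicate-alias rule is hypothetical."""
    aliases = [entry["alias"] for entry in ast["imports"]]
    if len(aliases) != len(set(aliases)):
        raise ValueError("duplicate import alias")
    return ast

def compile_unit(text):
    """Run the three steps in order."""
    return validate(parse(lex(text)))
```

For instance, `compile_unit("workflow Orders; import Billing alias b;")` yields a dict with the workflow name and one import entry. A real implementation would produce executable code in the last step rather than return the tree.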

Grammars

The Workflow language syntax is described by using two grammars:

  1. The lexical grammar defines how the input characters are combined to form white space, comments, and tokens.
  2. The syntactic grammar defines how the tokens resulting from the lexical grammar are combined to form Workflow programs.

Both grammars are presented in the Extended Backus-Naur Form (EBNF) notation used by the ANTLR grammar tool.

Lexical grammar

The lexical grammar defines how the input characters are combined to form white space, comments, and tokens.

White space

White space is defined as the space character, the tab character, the carriage return, and the line feed.

WS
    : ' '
    | '\t'
    | '\r'
    | '\n'
    ;

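Comments

Two forms of comment are supported. A delimited comment starts with /* and ends at the first following */, so delimited comments do not nest. A line comment starts with // and runs to the end of the line.

COMMENT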
    : '/*' .*? '*/'
    ;
LINE_COMMENT
    : '//' ~[\r\n]*
    ;
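
As a rough illustration of how a scanner might skip white space and comments between tokens, the WS, COMMENT and LINE_COMMENT rules can be modelled with one regular expression. This is a sketch in Python, not the actual lexer; the names `SKIP` and `skip_ignorable` are made up for the example.

```python
import re

# One alternative per lexer rule.
SKIP = re.compile(r"""
    [ \t\r\n]+        # WS
  | /\*.*?\*/         # COMMENT: non-greedy, stops at the first closing marker
  | //[^\r\n]*        # LINE_COMMENT: runs to the end of the line
""", re.VERBOSE | re.DOTALL)

def skip_ignorable(text, pos=0):
    """Return the index of the first character at or after pos that is
    neither white space nor part of a comment."""
    while True:
        match = SKIP.match(text, pos)
        if match is None:
            return pos
        pos = match.end()
```

Because the delimited-comment pattern is non-greedy, `skip_ignorable("/* a */ */x")` stops at the second `*/`, which is then left for the rest of the scanner to reject.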

Tokens

There are four types of tokens: identifiers, keywords, literals and operators.

Identifiers

Identifiers may contain only the basic Latin letters in lower or upper case (a-z, A-Z), the digits 0-9, and the underscore character ("_"). An identifier cannot start with a digit.

ID  : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;
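
The ID rule translates directly into a regular expression. A minimal Python sketch (the helper name `is_identifier` is hypothetical):

```python
import re

# Same shape as the ID rule: a letter or underscore first,
# then any number of letters, digits or underscores.
ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_identifier(text):
    """True if the whole text has the shape of an identifier."""
    return ID.fullmatch(text) is not None
```

Note that this checks the shape only: reserved words such as `var` also match and have to be told apart from identifiers separately.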

Keywords

A keyword is an identifier-like sequence of characters that is reserved, and cannot be used as an identifier except when enclosed by backticks (`).

KEYWORD
    : 'workflow'  | 'on error'  | 'break'     | 'continue'
    | 'var'       | 'if'        | 'else'      | 'foreach'
    | 'in'        | 'and'       | 'or'        | 'null'
    | 'function'  | 'method'    | 'isolated'  | 'atomic'
    | 'alias'     | 'as'        | 'return'    | 'type'
    | 'import'    | 'of'

    /* data keywords */
    | 'ENTITY'    | 'GET'       | 'CREATE'    | 'PUT'
    | 'ORIGINAL'  | 'CHILD'     | 'DELETED'   | 'DELETE'

    /* fetch keywords */
    | 'FETCH'     | 'FILTER'    | 'GROUP'     | 'ORDER'
    | 'BY'        | 'AS'        | 'IN'        | 'NEXT'
    | 'LAST'      | 'LIMIT'     | 'TO'        | 'FROM'
    | 'AND'       | 'OR'        | 'DESC'      | 'ASC'
    | 'SUM'       | 'AVG'       | 'MIN'       | 'MAX'
    | 'MATCH'     | 'LINK'      | 'TYPE'      | 'LEFT'
    | 'RIGHT'     | 'INNER'
    ;
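
The following Python sketch shows one way a scanner might tell keywords from identifiers, including the backtick escape. It rests on two assumptions that the text above does not state: keyword matching is case-sensitive (the list contains both `in` and `IN`), and the text between backticks must itself have the shape of an identifier. The name `classify` is hypothetical.

```python
import re

# General keywords from the KEYWORD rule. "on error" (which contains a
# space) and the upper-case data and fetch keywords are left out to keep
# the sketch short.
KEYWORDS = {
    "workflow", "break", "continue", "var", "if", "else", "foreach",
    "in", "and", "or", "null", "function", "method", "isolated",
    "atomic", "alias", "as", "return", "type", "import", "of",
}

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def classify(token):
    """Return (kind, name), where kind is "keyword" or "identifier"."""
    if len(token) >= 2 and token[0] == "`" and token[-1] == "`":
        name = token[1:-1]
        if WORD.fullmatch(name) is None:
            raise ValueError(f"invalid escaped identifier: {token!r}")
        # Backticks turn any word, reserved or not, into an identifier.
        return ("identifier", name)
    if WORD.fullmatch(token) is None:
        raise ValueError(f"not a word: {token!r}")
    # Assumption: the comparison is case-sensitive.
    return ("keyword" if token in KEYWORDS else "identifier", token)
```

With this sketch, `type` is classified as a keyword, while `` `type` `` is classified as an identifier named `type`.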

Literals

Integer

Integer literals are used to write values of type int.

INT : '0'..'9'+
    ;

Floating point

Floating point literals are used to write values of type decimal.

FLOAT
    : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    | '.' ('0'..'9')+ EXPONENT?
    | ('0'..'9')+ EXPONENT
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
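
To make the boundary between INT and FLOAT concrete, the two rules can be transcribed into regular expressions. The Python sketch below is an illustration rather than the real lexer; `classify_number` is a hypothetical helper.

```python
import re

EXPONENT = r"[eE][+-]?[0-9]+"

# The three FLOAT alternatives, in the order given by the grammar.
FLOAT = re.compile(
    rf"[0-9]+\.[0-9]*(?:{EXPONENT})?"   # digits '.' optional digits
    rf"|\.[0-9]+(?:{EXPONENT})?"        # '.' digits
    rf"|[0-9]+{EXPONENT}"               # digits with a mandatory exponent
)
INT = re.compile(r"[0-9]+")

def classify_number(text):
    """Return "FLOAT", "INT", or None if text is not a numeric literal."""
    if FLOAT.fullmatch(text):
        return "FLOAT"
    if INT.fullmatch(text):
        return "INT"
    return None
```

Under these rules `1.`, `.5` and `1e10` are all floating point literals, `42` is an integer literal, and neither rule includes a sign, so `-1` is not a single literal.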

String

String literals are used to write values of type string.

STRING
    : '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

fragment
ESC_SEQ
    : '\\' ('t'|'n'|'r'|'"'|'\\')
    ;
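
A short Python sketch of how a string literal could be recognised and decoded. It assumes the supported escapes are `\t`, `\n`, `\r`, `\"` and `\\`, which is the apparent intent of ESC_SEQ; the names `STRING`, `ESCAPES` and `string_value` are made up for the example.

```python
import re

# A quote, then any mix of escape sequences and characters other than
# backslash and quote, then the closing quote.
STRING = re.compile(r'"(?:\\[tnr"\\]|[^\\"])*"')

# Assumption: each escape maps to the usual character.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}

def string_value(literal):
    """Validate a string literal and return the text it denotes."""
    if STRING.fullmatch(literal) is None:
        raise ValueError(f"not a valid string literal: {literal!r}")
    body = literal[1:-1]
    # After validation every backslash is followed by a known escape letter.
    return re.sub(r"\\(.)", lambda m: ESCAPES[m.group(1)], body)
```

An unknown escape such as `\q` or an unterminated literal is rejected with a `ValueError` instead of being passed through.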

Boolean

Boolean literals are used to write values of type bool.

BOOLEAN
    : 'true'
    | 'false'
    ;

Null

Null literals are used to write the null value.

NULL : 'null'
     ;

Syntactic grammar

The syntactic grammar presented here is a summary that describes the most important elements of the Workflow language.

Workflow

A workflow defines the compilation unit. The compilation unit contains imports, types, methods, and functions.

workflow
	: 'workflow' ID ';' (importDirective)* (method | function | type )* EOF
	;

Imports

Imports are declared at program level.

importDirective
	:	'import' wf=ID 'alias' alias=ID ';'
	;

Types

Types are declared at program level.

type
	: 'type' typeName=ID '{' t1=typeVar ';' (t2=typeVar ';')* '}'
	;

typeVar
	: 'var' n=ID 'as' t=scalarTypeInstance
	;

Inline types

Inline types are declared at the same time as the variable that uses them.

scalarTypeInstance
	:	ID ('of' scalarTypeInstanceArguments)?
	|	'ENTITY' ID
	|	scalarTypeMembers
	;

scalarTypeMembers
	:	'{' t1=scalarTypeMember (',' t2=scalarTypeMember)* '}'
	;

Example

var coords as { x as int, y as int };
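
To show how the braces, commas and `as` pairs of an inline type fit together, here is a small Python sketch that reads the type part of the example above. The scalarTypeMember rule is not shown in this summary, so the sketch assumes each member has the form `name as TypeName` with a plain identifier as the type, as in the example. Nested inline types, `of` arguments and ENTITY types are not handled, and `parse_inline_type` is a hypothetical name.

```python
import re

# Assumed member shape, taken from the example: ID 'as' ID
MEMBER = re.compile(
    r"\s*([A-Za-z_][A-Za-z0-9_]*)\s+as\s+([A-Za-z_][A-Za-z0-9_]*)\s*"
)

def parse_inline_type(text):
    """Return the members of a flat inline type as (name, type) pairs."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise SyntaxError("an inline type must be enclosed in braces")
    members = []
    for part in text[1:-1].split(","):
        match = MEMBER.fullmatch(part)
        if match is None:
            raise SyntaxError(f"invalid member: {part.strip()!r}")
        members.append(match.groups())
    return members
```

Applied to `{ x as int, y as int }` this yields the pairs `("x", "int")` and `("y", "int")`. An empty pair of braces is rejected, in line with scalarTypeMembers requiring at least one member.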

Methods and functions

method
	: 'method' 'elevated'? 'isolated'? 'atomic'? ID a=arguments s=block ('on error' e=block)?
	;

function
	: 'function' 'elevated'? 'isolated'? 'atomic'? name=ID arguments 'as' scalarTypeInstance s=block ('on error' e=block)?
	;

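A method or function declaration may include any of the elevated, isolated, and atomic modifiers. When more than one modifier is present, the grammar requires them to appear in that order.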
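
Read literally, the method and function rules accept each modifier at most once and only in the order elevated, isolated, atomic. The Python sketch below checks a list of modifiers against that reading; `ORDER` and `modifiers_valid` are names invented for the example.

```python
# Modifier order as written in the method and function rules.
ORDER = ["elevated", "isolated", "atomic"]

def modifiers_valid(modifiers):
    """True if modifiers is an in-order selection from ORDER, no repeats."""
    positions = [ORDER.index(m) for m in modifiers if m in ORDER]
    if len(positions) != len(modifiers):
        return False  # an unknown modifier was present
    # Strictly increasing positions rule out both repeats and reordering.
    return all(a < b for a, b in zip(positions, positions[1:]))
```

So `["isolated", "atomic"]` passes, while `["atomic", "isolated"]` and `["atomic", "atomic"]` do not.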

Statements

See Statements.