# DSL higher-order functions

A higher-order function is one that takes another function as an argument. As of Miller 6 you can use `select`, `apply`, `reduce`, `fold`, `sort`, `any`, and `every` to express flexible, intuitive operations on arrays and maps, as an alternative to things that would otherwise require for-loops.

See also the `get_keys` and `get_values` functions which, when given a map, return an array of its keys or an array of its values, respectively.
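
As a minimal sketch of how these two functions combine with the ones on this page (the map contents are made up for illustration): since `get_keys` and `get_values` return arrays, the function you pass takes the array form of its arguments rather than the map form.

```shell
mlr -n put '
end {
  # Illustrative data, not read from any input file.
  my_map = {"cubit": 823, "dale": 13, "apple": 199};

  # get_values returns an array, so reduce takes the two-argument (acc, e) form.
  print reduce(get_values(my_map), func (acc,e) { return acc + e });

  # get_keys returns an array, so select takes the one-argument form.
  print select(get_keys(my_map), func (k) { return strlen(k) > 4 });
}
'
```

The first `print` should show 1035, the sum of the three values. The second should keep only the keys longer than four characters, `cubit` and `apple`.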

## select

The `select` function takes a map or array as its first argument and a function as second argument. It includes each input element in the output if the function returns true.

For arrays, that function should take one argument, the array element; for maps, it should take two, the map element's key and value. In either case it should return a boolean.

A perhaps helpful analogy: the `select` function is to arrays and maps as the `filter` verb is to records.

Array examples:

```mlr -n put '
end {
my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];

print "Original:";
print my_array;

print;
print "Evens:";
print select(my_array, func (e) { return e % 2 == 0});

print;
print "Odds:";
print select(my_array, func (e) { return e % 2 == 1});
print;
}
'
```
```Original:
[2, 9, 10, 3, 1, 4, 5, 8, 7, 6]

Evens:
[2, 10, 4, 8, 6]

Odds:
[9, 3, 1, 5, 7]

```

Map examples:

```mlr -n put '
end {
my_map = {"cubit": 823, "dale": 13, "apple": 199, "ember": 191, "bottle": 107};
print "Original:";
print my_map;

print;
print "Keys with an 'o' in them:";
print select(my_map, func (k,v) { return k =~ "o"});

print;
print "Values with last digit >= 5:";
print select(my_map, func (k,v) { return v % 10 >= 5});
}
'
```
```Original:
{
"cubit": 823,
"dale": 13,
"apple": 199,
"ember": 191,
"bottle": 107
}

Keys with an o in them:
{
"bottle": 107
}

Values with last digit >= 5:
{
"apple": 199,
"bottle": 107
}
```

## apply

The `apply` function takes a map or array as its first argument and a function as second argument. It applies the function to each element of the array or map.

For arrays, the function should take one argument, the array element, and return a new element. For maps, it should take two, the map element's key and value, and return a new key-value pair (i.e. a single-entry map).

A perhaps helpful analogy: the `apply` function is to arrays and maps as the `put` verb is to records.

Array examples:

```mlr -n put '
end {
my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];
print "Original:";
print my_array;

print;
print "Squares:";
print apply(my_array, func(e) { return e**2 });

print;
print "Cubes:";
print apply(my_array, func(e) { return e**3 });

print;
print "Sorted cubes:";
print sort(apply(my_array, func(e) { return e**3 }));
}
'
```
```Original:
[2, 9, 10, 3, 1, 4, 5, 8, 7, 6]

Squares:
[4, 81, 100, 9, 1, 16, 25, 64, 49, 36]

Cubes:
[8, 729, 1000, 27, 1, 64, 125, 512, 343, 216]

Sorted cubes:
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
```

Map examples:

```mlr -n put '
end {
my_map = {"cubit": 823, "dale": 13, "apple": 199, "ember": 191, "bottle": 107};
print "Original:";
print my_map;

print;
print "Squared values:";
print apply(my_map, func(k,v) { return {k: v**2} });

print;
print "Cubed values, sorted by key:";
print sort(apply(my_map, func(k,v) { return {k: v**3} }));

print;
print "Same, with upcased keys:";
print sort(apply(my_map, func(k,v) { return {toupper(k): v**3} }));
}
'
```
```Original:
{
"cubit": 823,
"dale": 13,
"apple": 199,
"ember": 191,
"bottle": 107
}

Squared values:
{
"cubit": 677329,
"dale": 169,
"apple": 39601,
"ember": 36481,
"bottle": 11449
}

Cubed values, sorted by key:
{
"apple": 7880599,
"bottle": 1225043,
"cubit": 557441767,
"dale": 2197,
"ember": 6967871
}

Same, with upcased keys:
{
"APPLE": 7880599,
"BOTTLE": 1225043,
"CUBIT": 557441767,
"DALE": 2197,
"EMBER": 6967871
}
```

## reduce

The `reduce` function takes a map or array as its first argument and a function as second argument. It accumulates entries into a final output -- for example, sum or product.

For arrays, the function should take two arguments, the accumulated value and the array element; for maps, it should take four: the accumulated key and value, and the map element's key and value. In either case it should return the updated accumulator (for maps, a single key-value pair).

The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps.

```mlr -n put '
end {
my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];

print "Original:";
print my_array;

print;
print "First element:";
print reduce(my_array, func (acc,e) { return acc });

print;
print "Last element:";
print reduce(my_array, func (acc,e) { return e });

print;
print "Sum of values:";
print reduce(my_array, func (acc,e) { return acc + e });

print;
print "Product of values:";
print reduce(my_array, func (acc,e) { return acc * e });

print;
print "Concatenation of values:";
print reduce(my_array, func (acc,e) { return acc . "," . e });
}
'
```
```Original:
[2, 9, 10, 3, 1, 4, 5, 8, 7, 6]

First element:
2

Last element:
6

Sum of values:
55

Product of values:
3628800

Concatenation of values:
2,9,10,3,1,4,5,8,7,6
```
```mlr -n put '
end {
my_map = {"cubit": 823, "dale": 13, "apple": 199, "ember": 191, "bottle": 107};
print "Original:";
print my_map;

print;
print "First key-value pair:";
print reduce(my_map, func (acck,accv,ek,ev) { return {acck: accv}});

print;
print "Last key-value pair:";
print reduce(my_map, func (acck,accv,ek,ev) { return {ek: ev}});

print;
print "Concatenate keys and values:";
print reduce(my_map, func (acck,accv,ek,ev) { return {acck . "," . ek: accv . "," . ev}});

print;
print "Sum of values:";
print reduce(my_map, func (acck,accv,ek,ev) { return {"sum": accv + ev }});

print;
print "Product of values:";
print reduce(my_map, func (acck,accv,ek,ev) { return {"product": accv * ev }});

print;
print "String-join of values:";
print reduce(my_map, func (acck,accv,ek,ev) { return {"joined": accv . "," . ev }});
}
'
```
```Original:
{
"cubit": 823,
"dale": 13,
"apple": 199,
"ember": 191,
"bottle": 107
}

First key-value pair:
{
"cubit": 823
}

Last key-value pair:
{
"bottle": 107
}

Concatenate keys and values:
{
"cubit,dale,apple,ember,bottle": "823,13,199,191,107"
}

Sum of values:
{
"sum": 1333
}

Product of values:
{
"product": 43512437137
}

String-join of values:
{
"joined": "823,13,199,191,107"
}
```

## fold

The `fold` function is the same as `reduce`, except that instead of the starting value for the accumulation being taken from the first entry of the array/map, you specify it as the third argument.

```mlr -n put '
end {
my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];

print "Original:";
print my_array;

print;
print "Sum with reduce:";
print reduce(my_array, func (acc,e) { return acc + e });

print;
print "Sum with fold and 0 initial value:";
print fold(my_array, func (acc,e) { return acc + e }, 0);

print;
print "Sum with fold and 1000000 initial value:";
print fold(my_array, func (acc,e) { return acc + e }, 1000000);
}
'
```
```Original:
[2, 9, 10, 3, 1, 4, 5, 8, 7, 6]

Sum with reduce:
55

Sum with fold and 0 initial value:
55

Sum with fold and 1000000 initial value:
1000055
```
```mlr -n put '
end {
my_map = {"cubit": 823, "dale": 13, "apple": 199, "ember": 191, "bottle": 107};
print "Original:";
print my_map;

print;
print "First key-value pair -- note this is the starting accumulator:";
print fold(my_map, func (acck,accv,ek,ev) { return {acck: accv}}, {"start": 999});

print;
print "Last key-value pair:";
print fold(my_map, func (acck,accv,ek,ev) { return {ek: ev}}, {"start": 999});

print;
print "Sum of values with fold and 0 initial value:";
print fold(my_map, func (acck,accv,ek,ev) { return {"sum": accv + ev} }, {"sum": 0});

print;
print "Sum of values with fold and 1000000 initial value:";
print fold(my_map, func (acck,accv,ek,ev) { return {"sum": accv + ev} }, {"sum": 1000000});
}
'
```
```Original:
{
"cubit": 823,
"dale": 13,
"apple": 199,
"ember": 191,
"bottle": 107
}

First key-value pair -- note this is the starting accumulator:
{
"start": 999
}

Last key-value pair:
{
"bottle": 107
}

Sum of values with fold and 0 initial value:
{
"sum": 1333
}

Sum of values with fold and 1000000 initial value:
{
"sum": 1001333
}
```
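
One situation where the explicit starting value matters is when the accumulator is not the same kind of thing as the elements, such as a count. The following is a sketch in the style of the array examples above: with `reduce` the accumulator starts out as the first element, 2, rather than as a count of zero, so only `fold` gives the intended answer.

```shell
mlr -n put '
end {
  my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];

  # Intended: count the even elements. The accumulator is a count, not an element.
  print "Count of evens with fold and 0 initial value:";
  print fold(my_array, func (acc,e) { return acc + (e % 2 == 0 ? 1 : 0) }, 0);

  # With reduce, the accumulator starts as the first element (2), which the
  # function then treats as if it were a count, so the result is off.
  print "Same function with reduce:";
  print reduce(my_array, func (acc,e) { return acc + (e % 2 == 0 ? 1 : 0) });
}
'
```

There are five even elements, so the `fold` call should print 5. The `reduce` call starts from 2 and adds one for each of the four even elements after the first, so it should print 6.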

## sort

The `sort` function takes a map or array as its first argument and an optional second argument. Unlike the other higher-order functions, the second argument can be omitted when the natural ordering is desired -- ordered by array element for arrays, or by key for maps.

As a second option, character flags such as `r` for reverse or `c` for case-folded lexical sort can be supplied as the second argument.

As a third option, a function can be supplied as the second argument.

For arrays, that function should take two arguments `a` and `b`, returning a negative, zero, or positive number as `a<b`, `a==b`, or `a>b` respectively. For maps, the function should take four arguments `ak`, `av`, `bk`, and `bv`, again returning negative, zero, or positive, using `a` and `b`'s keys and values.

Array examples:

```mlr -n put '
end {
my_array = [2, 9, 10, 3, 1, 4, 5, 8, 7, 6];

print "Original:";
print my_array;

print;
print "Ascending:";
print sort(my_array);
print sort(my_array, func (a,b) { return a <=> b });

print;
print "Descending:";
print sort(my_array, "r");
print sort(my_array, func (a,b) { return b <=> a });
}
'
```
```Original:
[2, 9, 10, 3, 1, 4, 5, 8, 7, 6]

Ascending:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Descending:
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```
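
Two further array sketches, using made-up word lists: the `c` flag described above for a case-folded sort, and a comparator function that orders strings by length using the built-in `strlen` function.

```shell
mlr -n put '
end {
  # Illustrative data, not read from any input file.
  words = ["delta", "Bravo", "alpha", "Charlie"];

  print "Natural ordering:";
  print sort(words);

  print "Case-folded:";
  print sort(words, "c");

  fruits = ["kiwi", "fig", "banana", "apple"];
  print "By length, shortest first:";
  # Negative, zero, or positive according to the two string lengths.
  print sort(fruits, func (a,b) { return strlen(a) <=> strlen(b) });
}
'
```

With the `c` flag the words should come out alphabetically regardless of case (`alpha`, `Bravo`, `Charlie`, `delta`), and the length comparator should give `fig`, `kiwi`, `apple`, `banana`.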

Map examples:

```mlr -n put '
end {
my_map = {"cubit": 823, "dale": 13, "apple": 199, "ember": 191, "bottle": 107};

print "Original:";
print my_map;

print;
print "Ascending by key:";
print sort(my_map);
print sort(my_map, func(ak,av,bk,bv) { return ak <=> bk });

print;
print "Descending by key:";
print sort(my_map, "r");
print sort(my_map, func(ak,av,bk,bv) { return bk <=> ak });

print;
print "Ascending by value:";
print sort(my_map, func(ak,av,bk,bv) { return av <=> bv });

print;
print "Descending by value:";
print sort(my_map, func(ak,av,bk,bv) { return bv <=> av });
}
'
```
```Original:
{
"cubit": 823,
"dale": 13,
"apple": 199,
"ember": 191,
"bottle": 107
}

Ascending by key:
{
"apple": 199,
"bottle": 107,
"cubit": 823,
"dale": 13,
"ember": 191
}
{
"apple": 199,
"bottle": 107,
"cubit": 823,
"dale": 13,
"ember": 191
}

Descending by key:
{
"ember": 191,
"dale": 13,
"cubit": 823,
"bottle": 107,
"apple": 199
}
{
"ember": 191,
"dale": 13,
"cubit": 823,
"bottle": 107,
"apple": 199
}

Ascending by value:
{
"dale": 13,
"bottle": 107,
"ember": 191,
"apple": 199,
"cubit": 823
}

Descending by value:
{
"cubit": 823,
"apple": 199,
"ember": 191,
"bottle": 107,
"dale": 13
}
```

Please see the sorting page for more examples.

## any and every

The `any` and `every` functions compute a logical OR and AND, respectively, of a function's boolean results over the elements of an array or map, without explicit `||`/`&&` chains and without a `for`-loop. They are a keystroke-saving convenience.

```mlr --c2p cat example.csv
```
```color  shape    flag  k  index quantity rate
yellow triangle true  1  11    43.6498  9.8870
red    square   true  2  15    79.2778  0.0130
red    circle   true  3  16    13.8103  2.9010
red    square   false 4  48    77.5542  7.4670
purple triangle false 5  51    81.2290  8.5910
red    square   false 6  64    77.1991  9.5310
purple triangle false 7  65    80.1405  5.8240
yellow circle   true  8  73    63.9785  4.2370
yellow circle   true  9  87    63.5058  8.3350
purple square   false 10 91    72.3735  8.2430
```
```mlr --c2p --from example.csv filter 'any({"color":"red","shape":"square"}, func(k,v) {return $[k] == v})'
```
```color  shape  flag  k  index quantity rate
red    square true  2  15    79.2778  0.0130
red    circle true  3  16    13.8103  2.9010
red    square false 4  48    77.5542  7.4670
red    square false 6  64    77.1991  9.5310
purple square false 10 91    72.3735  8.2430
```
```mlr --c2p --from example.csv filter 'every({"color":"red","shape":"square"}, func(k,v) {return $[k] == v})'
```
```color shape  flag  k index quantity rate
red   square true  2 15    79.2778  0.0130
red   square false 4 48    77.5542  7.4670
red   square false 6 64    77.1991  9.5310
```
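
For comparison, here is a sketch of the two filters above written with explicit `||` and `&&`. They should select the same records as the `any` and `every` versions; the higher-order form becomes more convenient as the list of field-value pairs grows.

```shell
# Equivalent to the any(...) filter: logical OR of the two comparisons.
mlr --c2p --from example.csv filter '$color == "red" || $shape == "square"'

# Equivalent to the every(...) filter: logical AND of the two comparisons.
mlr --c2p --from example.csv filter '$color == "red" && $shape == "square"'
```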
```mlr --c2p --from example.csv put '$is_red_square = every({"color":"red","shape":"square"}, func(k,v) {return $[k] == v})'
```
```color  shape    flag  k  index quantity rate   is_red_square
yellow triangle true  1  11    43.6498  9.8870 false
red    square   true  2  15    79.2778  0.0130 true
red    circle   true  3  16    13.8103  2.9010 false
red    square   false 4  48    77.5542  7.4670 true
purple triangle false 5  51    81.2290  8.5910 false
red    square   false 6  64    77.1991  9.5310 true
purple triangle false 7  65    80.1405  5.8240 false
yellow circle   true  8  73    63.9785  4.2370 false
yellow circle   true  9  87    63.5058  8.3350 false
purple square   false 10 91    72.3735  8.2430 false
```
```mlr --c2p --from example.csv filter 'any([16,51,61,64], func(e) {return $index == e})'
```
```color  shape    flag  k index quantity rate
red    circle   true  3 16    13.8103  2.9010
purple triangle false 5 51    81.2290  8.5910
red    square   false 6 64    77.1991  9.5310
```

This last example could also be done using a map:

```mlr --c2p --from example.csv filter '
begin {
@indices = {16:true, 51:true, 61:true, 64:true};
}
@indices[$index] == true;
'
```
```color  shape    flag  k index quantity rate
red    circle   true  3 16    13.8103  2.9010
purple triangle false 5 51    81.2290  8.5910
red    square   false 6 64    77.1991  9.5310
```

## Combined examples

Using a paradigm from the page on operating on all records, we can retain a column from the input data as an array, then apply some higher-order functions to it:

```mlr --c2p cat example.csv
```
```color  shape    flag  k  index quantity rate
yellow triangle true  1  11    43.6498  9.8870
red    square   true  2  15    79.2778  0.0130
red    circle   true  3  16    13.8103  2.9010
red    square   false 4  48    77.5542  7.4670
purple triangle false 5  51    81.2290  8.5910
red    square   false 6  64    77.1991  9.5310
purple triangle false 7  65    80.1405  5.8240
yellow circle   true  8  73    63.9785  4.2370
yellow circle   true  9  87    63.5058  8.3350
purple square   false 10 91    72.3735  8.2430
```
```mlr --c2p --from example.csv put -q '
begin {
@indexes = [] # So auto-extend will make an array, not a map
}
@indexes[NR] = $index;
end {

print "Original:";
print @indexes;

print;
print "Sorted:";
print sort(@indexes, "r");

print;
print "Sorted, then cubed:";
print apply(
sort(@indexes, "r"),
func(e) { return e**3 },
);

print;
print "Sorted, then cubed, then summed:";
print reduce(
apply(
sort(@indexes, "r"),
func(e) { return e**3 },
),
func(acc, e) { return acc + e },
)
}
'
```
```Original:
[11, 15, 16, 48, 51, 64, 65, 73, 87, 91]

Sorted:
[91, 87, 73, 65, 64, 51, 48, 16, 15, 11]

Sorted, then cubed:
[753571, 658503, 389017, 274625, 262144, 132651, 110592, 4096, 3375, 1331]

Sorted, then cubed, then summed:
2589905
```

## Caveats

### Remember return

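Coming from other languages, it's easy to accidentally omit `return`. The first command below does so and fails; the second, with an explicit `return`, works: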

```mlr -n put 'end { print select([1,2,3,4,5], func (e) { e >= 3 })}'
```
```mlr: select: function returned non-boolean "(absent)".
```

```mlr -n put 'end { print select([1,2,3,4,5], func (e) { return e >= 3 })}'
```
```[3, 4, 5]
```

### No IIFEs

As of September 2021, immediately invoked function expressions (IIFEs) are not part of the Miller DSL's grammar. For example, this doesn't work yet:

```mlr -n put '
end {
x = 3;
y = (func (e) { return e**7 })(x);
print y;
}
'
```
```mlr: cannot parse DSL expression.
Parse error on token "(" at line 4 column 35.
Expected one of:
; } > >> | ? || ^^ && =~ !=~ == != <=> >= < <= ^ & << >>> + - .+ .- * /
// % .* ./ .// . ?? ??? **

```

but this does:

```mlr -n put '
end {
x = 3;
f = func (e) { return e**7 };
y = f(x);
print y;
}
'
```
```2187
```

### Built-in functions currently unsupported as arguments

As of September 2021, built-in functions are handled somewhat separately from user-defined functions inside Miller, and can't be passed directly as arguments to higher-order functions.

For example, this doesn't work yet:

```mlr -n put '
end {
notches = [0,1,2,3];
radians = apply(notches, func (e) { return e * M_PI / 8 });
cosines = apply(radians, cos);
print cosines;
}
'
```
```mlr: apply: second argument must be a function; got absent.
```

but this does:

```mlr -n put '
end {
notches = [0,1,2,3];
radians = apply(notches, func (e) { return e * M_PI / 8 });
cosines = apply(radians, func (e) { return cos(e) });
print cosines;
}
'
```
```[1, 0.9238795325112867, 0.7071067811865476, 0.38268343236508984]