DSL built-in functions¶
These are functions in the Miller programming language that you can call when
you use mlr put and mlr filter. For example, when you type
mlr --icsv --opprint --from example.csv put ' $color = toupper($color); $shape = gsub($shape, "[aeiou]", "*"); '
color  shape    flag  k  index quantity rate
YELLOW tr**ngl* true  1  11    43.6498  9.8870
RED    sq**r*   true  2  15    79.2778  0.0130
RED    c*rcl*   true  3  16    13.8103  2.9010
RED    sq**r*   false 4  48    77.5542  7.4670
PURPLE tr**ngl* false 5  51    81.2290  8.5910
RED    sq**r*   false 6  64    77.1991  9.5310
PURPLE tr**ngl* false 7  65    80.1405  5.8240
YELLOW c*rcl*   true  8  73    63.9785  4.2370
YELLOW c*rcl*   true  9  87    63.5058  8.3350
PURPLE sq**r*   false 10 91    72.3735  8.2430
the toupper and gsub bits are functions.
Overview¶
At the command line, you can use mlr -f and mlr -F for information much
like what's on this page.
Each function takes a specific number of arguments, as shown below, except for
functions marked as variadic such as min and max. (The latter compute min
and max of any number of arguments.) There is no notion of optional or
default-on-absent arguments. All argument-passing is positional rather than by
name; arguments are passed by value, not by reference.
At the command line, you can get a list of all functions using mlr -f, with
details using mlr -F. (Or, mlr help usage-functions-by-class to get
details in the order shown on this page.) You can get detail for a given
function using mlr help function namegoeshere, e.g. mlr help function gsub.
Operators are listed here along with functions. In this case, the
argument-count is the number of items involved in the infix operator, e.g. we
say x+y so the details for the + operator say that its number of arguments
is 2. Unary operators such as ! and ~ show argument-count of 1; the ternary
? : operator shows an argument-count of 3.
Functions by class¶
- Arithmetic functions: bitcount, madd, mexp, mmul, msub, pow, %, &, *, **, +, -, .*, .+, .-, ./, /, //, <<, >>, >>>, ^, |, ~.
- Boolean functions: !, !=, !=~, &&, <, <=, <=>, ==, =~, >, >=, ?:, ??, ???, ^^, ||.
- Collections functions: append, arrayify, concat, depth, flatten, get_keys, get_values, haskey, json_parse, json_stringify, leafcount, length, mapdiff, mapexcept, mapselect, mapsum, unflatten.
- Conversion functions: boolean, float, fmtifnum, fmtnum, hexfmt, int, joink, joinkv, joinv, splita, splitax, splitkv, splitkvx, splitnv, splitnvx, string.
- Hashing functions: md5, sha1, sha256, sha512.
- Higher-order-functions functions: any, apply, every, fold, reduce, select, sort.
- Math functions: abs, acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, ceil, cos, cosh, erf, erfc, exp, expm1, floor, invqnorm, log, log10, log1p, logifit, max, min, qnorm, round, roundm, sgn, sin, sinh, sqrt, tan, tanh, urand, urand32, urandelement, urandint, urandrange.
- String functions: capitalize, clean_whitespace, collapse_whitespace, format, gssub, gsub, latin1_to_utf8, leftpad, lstrip, regextract, regextract_or_else, rightpad, rstrip, ssub, strip, strlen, sub, substr, substr0, substr1, tolower, toupper, truncate, unformat, unformatx, utf8_to_latin1, ..
- System functions: exec, hostname, os, system, version.
- Time functions: dhms2fsec, dhms2sec, fsec2dhms, fsec2hms, gmt2localtime, gmt2sec, hms2fsec, hms2sec, localtime2gmt, localtime2sec, sec2dhms, sec2gmt, sec2gmtdate, sec2hms, sec2localdate, sec2localtime, strftime, strftime_local, strptime, strptime_local, systime, systimeint, uptime.
- Typing functions: asserting_absent, asserting_array, asserting_bool, asserting_boolean, asserting_empty, asserting_empty_map, asserting_error, asserting_float, asserting_int, asserting_map, asserting_nonempty_map, asserting_not_array, asserting_not_empty, asserting_not_map, asserting_not_null, asserting_null, asserting_numeric, asserting_present, asserting_string, is_absent, is_array, is_bool, is_boolean, is_empty, is_empty_map, is_error, is_float, is_int, is_map, is_nan, is_nonempty_map, is_not_array, is_not_empty, is_not_map, is_not_null, is_null, is_numeric, is_present, is_string, typeof.
Arithmetic functions¶
bitcount¶
bitcount (class=arithmetic #args=1) Count of 1-bits.
madd¶
madd (class=arithmetic #args=3) a + b mod m (integers)
mexp¶
mexp (class=arithmetic #args=3) a ** b mod m (integers)
mmul¶
mmul (class=arithmetic #args=3) a * b mod m (integers)
msub¶
msub (class=arithmetic #args=3) a - b mod m (integers)
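To make the four modular functions above concrete, here is a small Python sketch of the arithmetic they describe. The function names mirror the Miller ones but are only stand-ins, and the sketch assumes integer inputs with a positive modulus m; it is not Miller's implementation.

```python
# Python stand-ins for the modular-arithmetic descriptions above.
# Assumption: integer inputs and a positive modulus m.

def madd(a, b, m):
    return (a + b) % m

def msub(a, b, m):
    return (a - b) % m

def mmul(a, b, m):
    return (a * b) % m

def mexp(a, b, m):
    # Three-argument pow does modular exponentiation without
    # building the full power first.
    return pow(a, b, m)

print(madd(5, 3, 7))   # 1
print(msub(5, 6, 7))   # 6
print(mmul(5, 3, 7))   # 1
print(mexp(2, 10, 7))  # 2
```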
pow¶
pow (class=arithmetic #args=2) Exponentiation. Same as **, but as a function.
%¶
% (class=arithmetic #args=2) Remainder; never negative-valued (pythonic).
&¶
& (class=arithmetic #args=2) Bitwise AND.
*¶
* (class=arithmetic #args=2) Multiplication, with integer*integer overflow to float.
**¶
** (class=arithmetic #args=2) Exponentiation. Same as pow, but as an infix operator.
+¶
+ (class=arithmetic #args=1,2) Addition as binary operator; unary plus operator.
-¶
- (class=arithmetic #args=1,2) Subtraction as binary operator; unary negation operator.
.*¶
.* (class=arithmetic #args=2) Multiplication, with integer-to-integer overflow.
.+¶
.+ (class=arithmetic #args=2) Addition, with integer-to-integer overflow.
.-¶
.- (class=arithmetic #args=2) Subtraction, with integer-to-integer overflow.
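The difference between * (overflow to float) and the dot operators (integer-to-integer overflow) can be modeled in Python as below. This is a sketch under two assumptions not stated on this page: that integers are 64-bit signed, and that integer-to-integer overflow means two's-complement wraparound. The helper names are hypothetical.

```python
MASK = (1 << 64) - 1

def wrap64(x):
    """Reduce a Python int to a signed 64-bit value (two's complement)."""
    x &= MASK
    return x - (1 << 64) if x >= (1 << 63) else x

def times(a, b):
    """Model of '*': switch to float when the product leaves int64 range."""
    p = a * b
    if -(1 << 63) <= p < (1 << 63):
        return p
    return float(a) * float(b)

def dot_times(a, b):
    """Model of '.*': stay integer, wrapping around on overflow."""
    return wrap64(a * b)

big = 1 << 62
print(times(big, 4))      # 1.8446744073709552e+19
print(dot_times(big, 4))  # 0
print(dot_times(big, 2))  # -9223372036854775808
```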
./¶
./ (class=arithmetic #args=2) Integer division, rounding toward zero.
/¶
/ (class=arithmetic #args=2) Division. Integer / integer is integer when exact, else floating-point: e.g. 6/3 is 2 but 6/4 is 1.5.
//¶
// (class=arithmetic #args=2) Pythonic integer division, rounding toward negative infinity.
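Since % and // are described as pythonic, Python is a convenient way to compare the division operators side by side. The sketch below models /, // and ./ for small integers; the function names are hypothetical and float rounding for very large operands is not handled.

```python
import math

def div(a, b):
    """Model of '/': exact integer quotients stay integers, else float."""
    if isinstance(a, int) and isinstance(b, int) and a % b == 0:
        return a // b
    return a / b

def floor_div(a, b):
    """Model of '//': rounds toward negative infinity."""
    return a // b

def trunc_div(a, b):
    """Model of './': rounds toward zero (fine for small operands)."""
    return math.trunc(a / b)

print(div(6, 3), div(6, 4))                # 2 1.5
print(floor_div(-7, 2), trunc_div(-7, 2))  # -4 -3
print(-7 % 5)                              # 3
```

The last line shows the never-negative remainder for a positive modulus: -7 is 3 more than -10.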
<<¶
<< (class=arithmetic #args=2) Bitwise left-shift.
>>¶
>> (class=arithmetic #args=2) Bitwise signed right-shift.
>>>¶
>>> (class=arithmetic #args=2) Bitwise unsigned right-shift.
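The signed and unsigned right-shifts differ only for negative inputs. A Python sketch, assuming 64-bit integers (an assumption, not stated on this page), with hypothetical helper names:

```python
MASK = (1 << 64) - 1

def signed_shift_right(x, n):
    # Python's >> on ints is arithmetic: the sign is preserved.
    return x >> n

def unsigned_shift_right(x, n):
    # Reinterpret as unsigned 64-bit first, so zeros fill from the left.
    return (x & MASK) >> n

print(signed_shift_right(-8, 1))    # -4
print(unsigned_shift_right(-8, 1))  # 9223372036854775804
print(unsigned_shift_right(8, 1))   # 4
```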
^¶
^ (class=arithmetic #args=2) Bitwise XOR.
|¶
| (class=arithmetic #args=2) Bitwise OR.
~¶
~ (class=arithmetic #args=1) Bitwise NOT. Beware '$y=~$x' since =~ is the regex-match operator: try '$y = ~$x'.
Boolean functions¶
!¶
! (class=boolean #args=1) Logical negation.
!=¶
!= (class=boolean #args=2) String/numeric inequality. Mixing number and string results in string compare.
!=~¶
!=~ (class=boolean #args=2) String (left-hand side) does not match regex (right-hand side), e.g. '$name !=~ "^a.*b$"'.
&&¶
&& (class=boolean #args=2) Logical AND.
<¶
< (class=boolean #args=2) String/numeric less-than. Mixing number and string results in string compare.
<=¶
<= (class=boolean #args=2) String/numeric less-than-or-equals. Mixing number and string results in string compare.
<=>¶
<=> (class=boolean #args=2) Comparator, nominally for sorting. Given a <=> b, returns <0, 0, >0 as a < b, a == b, or a > b, respectively.
==¶
== (class=boolean #args=2) String/numeric equality. Mixing number and string results in string compare.
=~¶
=~ (class=boolean #args=2) String (left-hand side) matches regex (right-hand side), e.g. '$name =~ "^a.*b$"'. Capture groups \1 through \9 are matched from (...) in the right-hand side, and can be used within subsequent DSL statements. See also "Regular expressions" at https://miller.readthedocs.io. Examples: With if-statement: if ($url =~ "http.*com") { ... } Without if-statement: given $line = "index ab09 file", and $line =~ "([a-z][a-z])([0-9][0-9])", then $label = "[\1:\2]", $label is "[ab:09]"
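The capture example above can be traced with Python's re module. This only mirrors the documented "[ab:09]" result; Python's regex dialect is not identical to Miller's, and the substitution of \1 and \2 is done by hand here.

```python
import re

line = "index ab09 file"
m = re.search(r"([a-z][a-z])([0-9][0-9])", line)

# After a successful =~ match, "\1" and "\2" in later string literals
# stand for the captured groups; here the same fill-in is explicit.
label = "[{}:{}]".format(m.group(1), m.group(2)) if m else None
print(label)  # [ab:09]
```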
>¶
> (class=boolean #args=2) String/numeric greater-than. Mixing number and string results in string compare.
>=¶
>= (class=boolean #args=2) String/numeric greater-than-or-equals. Mixing number and string results in string compare.
?:¶
?: (class=boolean #args=3) Standard ternary operator.
??¶
?? (class=boolean #args=2) Absent-coalesce operator. $a ?? 1 evaluates to 1 if $a isn't defined in the current record.
???¶
??? (class=boolean #args=2) Absent/empty-coalesce operator. $a ??? 1 evaluates to 1 if $a isn't defined in the current record, or has empty value.
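The two coalescing operators differ only in how they treat an empty value. A minimal Python model, using a dict as a stand-in for the current record (the helper names are hypothetical):

```python
def absent_coalesce(record, key, default):
    """Model of '$key ?? default': default only when the key is missing."""
    return record[key] if key in record else default

def absent_empty_coalesce(record, key, default):
    """Model of '$key ??? default': default when missing or empty."""
    value = record.get(key, "")
    return default if value == "" else value

record = {"a": "", "b": 7}
print(repr(absent_coalesce(record, "a", 1)))        # ''
print(repr(absent_empty_coalesce(record, "a", 1)))  # 1
print(repr(absent_coalesce(record, "c", 1)))        # 1
print(repr(absent_empty_coalesce(record, "b", 1)))  # 7
```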
^^¶
^^ (class=boolean #args=2) Logical XOR.
||¶
|| (class=boolean #args=2) Logical OR.
Collections functions¶
append¶
append (class=collections #args=2) Appends second argument to end of first argument, which must be an array.
arrayify¶
arrayify (class=collections #args=1) Walks through a nested map/array, converting any map with consecutive keys "1", "2", ... into an array. Useful to wrap the output of unflatten.
concat¶
concat (class=collections #args=variadic) Returns the array concatenation of the arguments. Non-array arguments are treated as single-element arrays. Examples: concat(1,2,3) is [1,2,3] concat([1,2],3) is [1,2,3] concat([1,2],[3]) is [1,2,3]
depth¶
depth (class=collections #args=1) Returns maximum depth of map/array. Scalars have depth 0.
flatten¶
flatten (class=collections #args=2,3) Flattens multi-level maps to single-level ones. Useful for nested JSON-like structures for non-JSON file formats like CSV. With two arguments, the first argument is a map (maybe $*) and the second argument is the flatten separator. With three arguments, the first argument is prefix, the second is the flatten separator, and the third argument is a map; flatten($*, ".") is the same as flatten("", ".", $*). See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information. Examples: flatten({"a":[1,2],"b":3}, ".") is {"a.1": 1, "a.2": 2, "b": 3}. flatten("a", ".", {"b": { "c": 4 }}) is {"a.b.c" : 4}. flatten("", ".", {"a": { "b": 3 }}) is {"a.b" : 3}.
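The flatten examples above can be reproduced with a short recursive Python sketch. Note the assumptions: the argument order here (value, separator, optional prefix) is chosen for Python convenience and differs from Miller's three-argument form, and empty maps/arrays are not treated specially.

```python
def flatten(value, sep, prefix=""):
    """Flatten nested dicts/lists into a single-level dict.

    List positions are numbered from 1, matching the a.1, a.2 keys in
    the examples above.
    """
    if isinstance(value, dict):
        items = list(value.items())
    elif isinstance(value, list):
        items = [(str(i), v) for i, v in enumerate(value, start=1)]
    else:
        return {prefix: value}
    out = {}
    for key, child in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(child, (dict, list)):
            out.update(flatten(child, sep, path))
        else:
            out[path] = child
    return out

print(flatten({"a": [1, 2], "b": 3}, "."))   # {'a.1': 1, 'a.2': 2, 'b': 3}
print(flatten({"b": {"c": 4}}, ".", "a"))    # {'a.b.c': 4}
```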
get_keys¶
get_keys (class=collections #args=1) Returns array of keys of map or array
get_values¶
get_values (class=collections #args=1) Returns array of values of map or array -- in the latter case, returns a copy of the array
haskey¶
haskey (class=collections #args=2) True/false if map has/hasn't key, e.g. 'haskey($*, "a")' or 'haskey(mymap, mykey)', or true/false if array index is in bounds / out of bounds. Error if 1st argument is not a map or array. Note -n..-1 alias to 1..n in Miller arrays.
json_parse¶
json_parse (class=collections #args=1) Converts value from JSON-formatted string.
json_stringify¶
json_stringify (class=collections #args=1,2) Converts value to JSON-formatted string. Default output is single-line. With optional second boolean argument set to true, produces multiline output.
leafcount¶
leafcount (class=collections #args=1) Counts total number of terminal values in map/array. For single-level map/array, same as length.
length¶
length (class=collections #args=1) Counts number of top-level entries in array/map. Scalars have length 1.
mapdiff¶
mapdiff (class=collections #args=variadic) With 0 args, returns empty map. With 1 arg, returns copy of arg. With 2 or more, returns copy of arg 1 with all keys from any of remaining argument maps removed.
mapexcept¶
mapexcept (class=collections #args=variadic) Returns a map with keys from remaining arguments, if any, unset. Remaining arguments can be strings or arrays of string. E.g. 'mapexcept({1:2,3:4,5:6}, 1, 5, 7)' is '{3:4}' and 'mapexcept({1:2,3:4,5:6}, [1, 5, 7])' is '{3:4}'.
mapselect¶
mapselect (class=collections #args=variadic) Returns a map with only keys from remaining arguments set. Remaining arguments can be strings or arrays of string. E.g. 'mapselect({1:2,3:4,5:6}, 1, 5, 7)' is '{1:2,5:6}' and 'mapselect({1:2,3:4,5:6}, [1, 5, 7])' is '{1:2,5:6}'.
mapsum¶
mapsum (class=collections #args=variadic) With 0 args, returns empty map. With >= 1 arg, returns a map with key-value pairs from all arguments. Rightmost collisions win, e.g. 'mapsum({1:2,3:4},{1:5})' is '{1:5,3:4}'.
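The four map-combining functions above (mapdiff, mapexcept, mapselect, mapsum) behave much like the following Python sketch over dicts. It is a model of the descriptions, not Miller code; the _keys helper is hypothetical.

```python
def mapsum(*maps):
    out = {}
    for m in maps:
        out.update(m)  # later arguments overwrite earlier ones
    return out

def mapdiff(*maps):
    if not maps:
        return {}
    drop = set().union(*maps[1:])  # keys of all remaining maps
    return {k: v for k, v in maps[0].items() if k not in drop}

def _keys(args):
    # Accept keys given one by one or bundled in lists.
    flat = []
    for a in args:
        flat.extend(a if isinstance(a, list) else [a])
    return flat

def mapselect(m, *keys):
    wanted = _keys(keys)
    return {k: v for k, v in m.items() if k in wanted}

def mapexcept(m, *keys):
    unwanted = _keys(keys)
    return {k: v for k, v in m.items() if k not in unwanted}

print(mapsum({1: 2, 3: 4}, {1: 5}))               # {1: 5, 3: 4}
print(mapexcept({1: 2, 3: 4, 5: 6}, 1, 5, 7))     # {3: 4}
print(mapselect({1: 2, 3: 4, 5: 6}, [1, 5, 7]))   # {1: 2, 5: 6}
```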
unflatten¶
unflatten (class=collections #args=2) Reverses flatten. Useful for nested JSON-like structures for non-JSON file formats like CSV. The first argument is a map, and the second argument is the flatten separator. See also arrayify. See "Flatten/unflatten: converting between JSON and tabular formats" at https://miller.readthedocs.io for more information. Example: unflatten({"a.b.c" : 4}, ".") is {"a": {"b": { "c": 4 }}}.
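Here is a minimal Python sketch of the unflatten idea: split each key on the separator and rebuild the nesting. It is only a model of the description above; it leaves keys like "1", "2" as map keys, which is the situation arrayify is described as handling.

```python
def unflatten(flat, sep):
    """Rebuild nested dicts from separator-joined keys."""
    out = {}
    for key, value in flat.items():
        parts = str(key).split(sep)
        node = out
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return out

print(unflatten({"a.b.c": 4}, "."))
# {'a': {'b': {'c': 4}}}
print(unflatten({"a.1": 1, "a.2": 2, "b": 3}, "."))
# {'a': {'1': 1, '2': 2}, 'b': 3}
```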
Conversion functions¶
boolean¶
boolean (class=conversion #args=1) Convert int/float/bool/string to boolean.
float¶
float (class=conversion #args=1) Convert int/float/bool/string to float.
fmtifnum¶
fmtifnum (class=conversion #args=2) Identical to fmtnum, except returns the first argument as-is if the output would be an error. Examples: fmtifnum(3.4, "%.6f") gives 3.400000. fmtifnum("abc", "%.6f") gives abc. $* = fmtifnum($*, "%.6f") formats numeric fields in the current record, leaving non-numeric ones alone.
fmtnum¶
fmtnum (class=conversion #args=2) Convert int/float/bool to string using printf-style format string (https://pkg.go.dev/fmt), e.g. '$s = fmtnum($n, "%08d")' or '$t = fmtnum($n, "%.6e")'. This function recurses on array and map values. Example: $x = fmtnum($x, "%.6f")
hexfmt¶
hexfmt (class=conversion #args=1) Convert int to hex string, e.g. 255 to "0xff".
int¶
int (class=conversion #args=1) Convert int/float/bool/string to int.
joink¶
joink (class=conversion #args=2) Makes string from map/array keys. First argument is map/array; second is separator string. Examples: joink({"a":3,"b":4,"c":5}, ",") = "a,b,c". joink([1,2,3], ",") = "1,2,3".
joinkv¶
joinkv (class=conversion #args=3) Makes string from map/array key-value pairs. First argument is map/array; second is pair-separator string; third is field-separator string. Mnemonic: the "=" comes before the "," in the output and in the arguments to joinkv. Examples: joinkv([3,4,5], "=", ",") = "1=3,2=4,3=5" joinkv({"a":3,"b":4,"c":5}, ":", ";") = "a:3;b:4;c:5"
joinv¶
joinv (class=conversion #args=2) Makes string from map/array values. First argument is map/array; second is separator string. Examples: joinv([3,4,5], ",") = "3,4,5" joinv({"a":3,"b":4,"c":5}, ",") = "3,4,5"
splita¶
splita (class=conversion #args=2) Splits string into array with type inference. First argument is string to split; second is the separator to split on. Example: splita("3,4,5", ",") = [3,4,5]
splitax¶
splitax (class=conversion #args=2) Splits string into array without type inference. First argument is string to split; second is the separator to split on. Example: splitax("3,4,5", ",") = ["3","4","5"]
splitkv¶
splitkv (class=conversion #args=3) Splits string by separators into map with type inference. First argument is string to split; second argument is pair separator; third argument is field separator. Example: splitkv("a=3,b=4,c=5", "=", ",") = {"a":3,"b":4,"c":5}
splitkvx¶
splitkvx (class=conversion #args=3) Splits string by separators into map without type inference (keys and values are strings). First argument is string to split; second argument is pair separator; third argument is field separator. Example: splitkvx("a=3,b=4,c=5", "=", ",") = {"a":"3","b":"4","c":"5"}
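The only difference between splitkv and splitkvx is whether values go through type inference. A rough Python model follows; the infer helper is hypothetical and much simpler than Miller's actual inference rules (it only tries int, then float).

```python
def infer(text):
    """Very rough type inference: int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def splitkvx(s, pair_sep, field_sep):
    out = {}
    for field in s.split(field_sep):
        key, _, value = field.partition(pair_sep)
        out[key] = value
    return out

def splitkv(s, pair_sep, field_sep):
    return {k: infer(v) for k, v in splitkvx(s, pair_sep, field_sep).items()}

print(splitkv("a=3,b=4,c=5", "=", ","))   # {'a': 3, 'b': 4, 'c': 5}
print(splitkvx("a=3,b=4,c=5", "=", ","))  # {'a': '3', 'b': '4', 'c': '5'}
```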
splitnv¶
splitnv (class=conversion #args=2) Splits string by separator into integer-indexed map with type inference. First argument is string to split; second argument is separator to split on. Example: splitnv("a,b,c", ",") = {"1":"a","2":"b","3":"c"}
splitnvx¶
splitnvx (class=conversion #args=2) Splits string by separator into integer-indexed map without type inference (values are strings). First argument is string to split; second argument is separator to split on. Example: splitnvx("3,4,5", ",") = {"1":"3","2":"4","3":"5"}
string¶
string (class=conversion #args=1) Convert int/float/bool/string/array/map to string.
Hashing functions¶
md5¶
md5 (class=hashing #args=1) MD5 hash.
sha1¶
sha1 (class=hashing #args=1) SHA1 hash.
sha256¶
sha256 (class=hashing #args=1) SHA256 hash.
sha512¶
sha512 (class=hashing #args=1) SHA512 hash.
Higher-order-functions functions¶
any¶
any (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for any array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean. Examples: Array example: any([10,20,30], func(e) {return $index == e}) Map example: any({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})
apply¶
apply (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, applies the function to each element of the array/map. For arrays, the function should take one argument, for array element; it should return a new element. For maps, it should take two arguments, for map-element key and value; it should return a new key-value pair (i.e. a single-entry map). Examples: Array example: apply([1,2,3,4,5], func(e) {return e ** 3}) returns [1, 8, 27, 64, 125]. Map example: apply({"a":1, "b":3, "c":5}, func(k,v) {return {toupper(k): v ** 2}}) returns {"A": 1, "B":9, "C": 25}.
every¶
every (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, yields a boolean true if the argument function returns true for every array/map element, false otherwise. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean. Examples: Array example: every(["a", "b", "c"], func(e) {return $[e] >= 0}) Map example: every({"a": "foo", "b": "bar"}, func(k,v) {return $[k] == v})
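The calling conventions for any, apply, and every (one-argument functions for arrays, two-argument functions for maps) can be modeled in Python as follows. The trailing underscore on any_ is only there to avoid shadowing Python's builtin; none of this is Miller's implementation.

```python
def any_(coll, f):
    if isinstance(coll, dict):
        return any(f(k, v) for k, v in coll.items())
    return any(f(e) for e in coll)

def every(coll, f):
    if isinstance(coll, dict):
        return all(f(k, v) for k, v in coll.items())
    return all(f(e) for e in coll)

def apply(coll, f):
    if isinstance(coll, dict):
        out = {}
        for k, v in coll.items():
            out.update(f(k, v))  # f returns a single-entry dict
        return out
    return [f(e) for e in coll]

print(apply([1, 2, 3, 4, 5], lambda e: e ** 3))
# [1, 8, 27, 64, 125]
print(apply({"a": 1, "b": 3, "c": 5}, lambda k, v: {k.upper(): v ** 2}))
# {'A': 1, 'B': 9, 'C': 25}
print(any_([10, 20, 30], lambda e: e == 20))  # True
print(every([10, 20, 30], lambda e: e > 15))  # False
```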
fold¶
fold (class=higher-order-functions #args=3) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is taken from the third argument. Examples: Array example: fold([1,2,3,4,5], func(acc,e) {return acc + e**3}, 10000) returns 10225. Map example: fold({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum": accv+ev**2}}, {"sum":10000}) returns {"sum": 10035}.
reduce¶
reduce (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, accumulates entries into a final output -- for example, sum or product. For arrays, the function should take two arguments, for accumulated value and array element, and return the accumulated element. For maps, it should take four arguments, for accumulated key and value, and map-element key and value; it should return the updated accumulator as a new key-value pair (i.e. a single-entry map). The start value for the accumulator is the first element for arrays, or the first element's key-value pair for maps. Examples: Array example: reduce([1,2,3,4,5], func(acc,e) {return acc + e**3}) returns 225. Map example: reduce({"a":1, "b":3, "c": 5}, func(acck,accv,ek,ev) {return {"sum_of_squares": accv + ev**2}}) returns {"sum_of_squares": 35}.
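The only difference between fold and reduce is where the accumulator starts. For the array case, this maps directly onto Python's functools.reduce with and without an initial value; the sketch below covers arrays only and is a model of the description, not Miller code.

```python
from functools import reduce as py_reduce

def fold(array, f, start):
    # Accumulator starts at the explicit third argument.
    return py_reduce(f, array, start)

def reduce(array, f):
    # Accumulator starts at the first element.
    return py_reduce(f, array)

def add_cube(acc, e):
    return acc + e ** 3

print(fold([1, 2, 3, 4, 5], add_cube, 10000))  # 10225
print(reduce([1, 2, 3, 4, 5], add_cube))       # 225
```

In the reduce case the first element (1) is used as-is rather than cubed, which happens to make no difference here because 1 cubed is 1.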
select¶
select (class=higher-order-functions #args=2) Given a map or array as first argument and a function as second argument, includes each input element in the output if the function returns true. For arrays, the function should take one argument, for array element; for maps, it should take two, for map-element key and value. In either case it should return a boolean. Examples: Array example: select([1,2,3,4,5], func(e) {return e >= 3}) returns [3, 4, 5]. Map example: select({"a":1, "b":3, "c":5}, func(k,v) {return v >= 3}) returns {"b":3, "c": 5}.
sort¶
sort (class=higher-order-functions #args=1-2) Given a map or array as first argument and string flags or function as optional second argument, returns a sorted copy of the input. With one argument, sorts array elements with numbers first numerically and then strings lexically, and map elements likewise by map keys. If the second argument is a string, it can contain any of "f" for lexical ("n" is for the above default), "c" for case-folded lexical, or "t" for natural sort order. An additional "r" in that string is for reverse. An additional "v" in that string means sort maps by value, rather than by key. If the second argument is a function, then for arrays it should take two arguments a and b, returning < 0, 0, or > 0 as a < b, a == b, or a > b respectively; for maps the function should take four arguments ak, av, bk, and bv, again returning < 0, 0, or > 0, using a and b's keys and values. Examples: Default sorting: sort([3,"A",1,"B",22]) returns [1, 3, 22, "A", "B"]. Note that this is numbers before strings. Default sorting: sort(["E","a","c","B","d"]) returns ["B", "E", "a", "c", "d"]. Note that this is uppercase before lowercase. Case-folded ascending: sort(["E","a","c","B","d"], "c") returns ["a", "B", "c", "d", "E"]. Case-folded descending: sort(["E","a","c","B","d"], "cr") returns ["E", "d", "c", "B", "a"]. Natural sorting: sort(["a1","a10","a100","a2","a20","a200"], "t") returns ["a1", "a2", "a10", "a20", "a100", "a200"]. Array with function: sort([5,2,3,1,4], func(a,b) {return b <=> a}) returns [5,4,3,2,1]. Map with function: sort({"c":2,"a":3,"b":1}, func(ak,av,bk,bv) {return bv <=> av}) returns {"a":3,"c":2,"b":1}. Map without function: sort({"c":2,"a":3,"b":1}) returns {"a":3,"b":1,"c":2}. Map without function: sort({"c":2,"a":3,"b":1}, "v") returns {"b":1,"c":2,"a":3}. Map without function: sort({"c":2,"a":3,"b":1}, "vnr") returns {"a":3,"c":2,"b":1}.
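The array orderings described above can be approximated with Python sort keys. This is a sketch of the ordering rules only (default, case-folded, natural, and comparator-function), with hypothetical helper names; it does not parse Miller's flag strings.

```python
import re
from functools import cmp_to_key

def default_key(x):
    # Numbers sort before strings; each group sorts in its own order.
    return (1, x) if isinstance(x, str) else (0, x)

def natural_key(s):
    # "a10" becomes ["a", 10, ""], so digit runs compare numerically.
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", s)]

def descending(a, b):
    # Same contract as 'b <=> a': negative, zero, or positive.
    return (b > a) - (b < a)

letters = ["E", "a", "c", "B", "d"]
print(sorted([3, "A", 1, "B", 22], key=default_key))   # [1, 3, 22, 'A', 'B']
print(sorted(letters))                                 # ['B', 'E', 'a', 'c', 'd']
print(sorted(letters, key=str.lower))                  # ['a', 'B', 'c', 'd', 'E']
print(sorted(letters, key=str.lower, reverse=True))    # ['E', 'd', 'c', 'B', 'a']
print(sorted(["a1", "a10", "a100", "a2", "a20", "a200"], key=natural_key))
# ['a1', 'a2', 'a10', 'a20', 'a100', 'a200']
print(sorted([5, 2, 3, 1, 4], key=cmp_to_key(descending)))  # [5, 4, 3, 2, 1]
```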
Math functions¶
abs¶
abs (class=math #args=1) Absolute value.
acos¶
acos (class=math #args=1) Inverse trigonometric cosine.
acosh¶
acosh (class=math #args=1) Inverse hyperbolic cosine.
asin¶
asin (class=math #args=1) Inverse trigonometric sine.
asinh¶
asinh (class=math #args=1) Inverse hyperbolic sine.
atan¶
atan (class=math #args=1) One-argument arctangent.
atan2¶
atan2 (class=math #args=2) Two-argument arctangent.
atanh¶
atanh (class=math #args=1) Inverse hyperbolic tangent.
cbrt¶
cbrt (class=math #args=1) Cube root.
ceil¶
ceil (class=math #args=1) Ceiling: nearest integer at or above.
cos¶
cos (class=math #args=1) Trigonometric cosine.
cosh¶
cosh (class=math #args=1) Hyperbolic cosine.
erf¶
erf (class=math #args=1) Error function.
erfc¶
erfc (class=math #args=1) Complementary error function.
exp¶
exp (class=math #args=1) Exponential function e**x.
expm1¶
expm1 (class=math #args=1) e**x - 1.
floor¶
floor (class=math #args=1) Floor: nearest integer at or below.
invqnorm¶
invqnorm (class=math #args=1) Inverse of normal cumulative distribution function. Note that invqnorm(urand()) is normally distributed.
log¶
log (class=math #args=1) Natural (base-e) logarithm.
log10¶
log10 (class=math #args=1) Base-10 logarithm.
log1p¶
log1p (class=math #args=1) log(1+x).
logifit¶
logifit (class=math #args=3) Given m and b from logistic regression, compute fit: $yhat=logifit($x,$m,$b).
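This page does not spell out the formula behind logifit. A plausible reading, assuming the usual logistic form 1 / (1 + exp(-m*x - b)), is sketched below in Python; treat the formula as an assumption rather than a statement of what Miller computes.

```python
import math

def logifit(x, m, b):
    # Assumed logistic form; m is the slope and b the intercept.
    return 1.0 / (1.0 + math.exp(-m * x - b))

print(logifit(0.0, 2.0, 0.0))  # 0.5
```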
max¶
max (class=math #args=variadic) Max of n numbers; null loses.
min¶
min (class=math #args=variadic) Min of n numbers; null loses.
qnorm¶
qnorm (class=math #args=1) Normal cumulative distribution function.
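The relationship between qnorm and invqnorm, and the note that invqnorm(urand()) is normally distributed, is the inverse-transform sampling trick. A Python sketch using the standard library's NormalDist as a stand-in for both functions (assuming the standard normal, mean 0 and standard deviation 1):

```python
import random
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

def qnorm(x):
    return std.cdf(x)

def invqnorm(p):
    return std.inv_cdf(p)  # defined for 0 < p < 1

print(qnorm(0.0))  # 0.5

# Inverse-transform sampling: push uniform draws through invqnorm.
rng = random.Random(0)
samples = [invqnorm(max(rng.random(), 1e-12)) for _ in range(1000)]
print(sum(samples) / len(samples))  # close to 0 for a large sample
```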
round¶
round (class=math #args=1) Round to nearest integer.
roundm¶
roundm (class=math #args=2) Round to nearest multiple of m: roundm($x,$m) is the same as round($x/$m)*$m.
sgn¶
sgn (class=math #args=1) +1, 0, -1 for positive, zero, negative input respectively.
sin¶
sin (class=math #args=1) Trigonometric sine.
sinh¶
sinh (class=math #args=1) Hyperbolic sine.
sqrt¶
sqrt (class=math #args=1) Square root.
tan¶
tan (class=math #args=1) Trigonometric tangent.
tanh¶
tanh (class=math #args=1) Hyperbolic tangent.
urand¶
urand (class=math #args=0) Floating-point numbers uniformly distributed on the unit interval. Int-valued example: '$n=floor(20+urand()*11)'.
urand32¶
urand32 (class=math #args=0) Integer uniformly distributed between 0 and 2**32-1 inclusive.
urandelement¶
urandelement (class=math #args=1) Random sample from the first argument, which must be a non-empty array.
urandint¶
urandint (class=math #args=2) Integer uniformly distributed between inclusive integer endpoints.
urandrange¶
urandrange (class=math #args=2) Floating-point numbers uniformly distributed on the interval [a, b).
String functions¶
capitalize¶
capitalize (class=string #args=1) Convert string's first character to uppercase.
clean_whitespace¶
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip.
collapse_whitespace¶
collapse_whitespace (class=string #args=1) Strip repeated whitespace from string.
format¶
format (class=string #args=variadic) Using first argument as format string, interpolate remaining arguments in place of each "{}" in the format string. Too-few arguments are treated as the empty string; too-many arguments are discarded. Examples: format("{}:{}:{}", 1,2) gives "1:2:". format("{}:{}:{}", 1,2,3) gives "1:2:3". format("{}:{}:{}", 1,2,3,4) gives "1:2:3".
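The too-few/too-many rule for format can be modeled in a few lines of Python. The name format_ (with trailing underscore, to avoid Python's builtin) is hypothetical, and only the "{}" placeholder described above is handled.

```python
def format_(fmt, *args):
    pieces = fmt.split("{}")
    out = [pieces[0]]
    for i, piece in enumerate(pieces[1:]):
        # Missing arguments become ""; surplus arguments are never visited.
        out.append(str(args[i]) if i < len(args) else "")
        out.append(piece)
    return "".join(out)

print(format_("{}:{}:{}", 1, 2))        # 1:2:
print(format_("{}:{}:{}", 1, 2, 3))     # 1:2:3
print(format_("{}:{}:{}", 1, 2, 3, 4))  # 1:2:3
```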
gssub¶
gssub (class=string #args=3) Like gsub but does no regexing. No characters are special. Example: gssub("ab.d.fg", ".", "X") gives "abXdXfg"
gsub¶
gsub (class=string #args=3) '$name = gsub($name, "old", "new")': replace all, with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to gsub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io. Examples: gsub("ababab", "ab", "XY") gives "XYXYXY" gsub("abc.def", ".", "X") gives "XXXXXXX" gsub("abc.def", "\.", "X") gives "abcXdef" gsub("abcdefg", "[ce]", "X") gives "abXdXfg" gsub("prefix4529:suffix8567", "(....ix)([0-9]+)", "[\1 : \2]") gives "[prefix : 4529]:[suffix : 8567]"
latin1_to_utf8¶
latin1_to_utf8 (class=string #args=1) Tries to convert Latin-1-encoded string to UTF-8-encoded string. If argument is array or map, recurses into it. Examples: $y = latin1_to_utf8($x) $* = latin1_to_utf8($*)
leftpad¶
leftpad (class=string #args=3) Left-pads first argument to at most the specified length (second, integer argument) using specified pad value (third, string argument). If the first argument is not a string, it will be stringified first. Examples: leftpad("abcdefg", 10 , "*") gives "***abcdefg". leftpad("abcdefg", 10 , "XY") gives "XYabcdefg". leftpad("1234567", 10 , "0") gives "0001234567".
lstrip¶
lstrip (class=string #args=1) Strip leading whitespace from string.
regextract¶
regextract (class=string #args=2) Extracts a substring (the first, if there are multiple matches), matching a regular expression, from the input. Does not use capture groups; see also the =~ operator which does. Examples: regextract("index ab09 file", "[a-z][a-z][0-9][0-9]") gives "ab09" regextract("index a999 file", "[a-z][a-z][0-9][0-9]") gives (absent), which will result in an assignment not happening.
regextract_or_else¶
regextract_or_else (class=string #args=3) Like regextract but the third argument is the return value in case the input string (first argument) doesn't match the pattern (second argument). Examples: regextract_or_else("index ab09 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "ab09" regextract_or_else("index a999 file", "[a-z][a-z][0-9][0-9]", "nonesuch") gives "nonesuch"
rightpad¶
rightpad (class=string #args=3) Right-pads first argument to at most the specified length (second, integer argument) using specified pad value (third, string argument). If the first argument is not a string, it will be stringified first. Examples: rightpad("abcdefg", 10 , "*") gives "abcdefg***". rightpad("abcdefg", 10 , "XY") gives "abcdefgXY". rightpad("1234567", 10 , "0") gives "1234567000".
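The "at most the specified length" wording in leftpad and rightpad explains the "XY" examples: only whole copies of the pad string are used, so the result can come up short. A Python sketch of that reading, assuming a non-empty pad string:

```python
def _copies(s, length, pad):
    # Whole copies of pad only, so the result never exceeds `length`.
    return max(0, (length - len(s)) // len(pad))

def leftpad(value, length, pad):
    s = str(value)  # non-strings are stringified first
    return pad * _copies(s, length, pad) + s

def rightpad(value, length, pad):
    s = str(value)
    return s + pad * _copies(s, length, pad)

print(leftpad("abcdefg", 10, "*"))    # ***abcdefg
print(leftpad("abcdefg", 10, "XY"))   # XYabcdefg
print(rightpad("abcdefg", 10, "XY"))  # abcdefgXY
print(rightpad(1234567, 10, "0"))     # 1234567000
```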
rstrip¶
rstrip (class=string #args=1) Strip trailing whitespace from string.
ssub¶
ssub (class=string #args=3) Like sub but does no regexing. No characters are special. Example: ssub("abc.def", ".", "X") gives "abcXdef"
strip¶
strip (class=string #args=1) Strip leading and trailing whitespace from string.
strlen¶
strlen (class=string #args=1) String length.
sub¶
sub (class=string #args=3) '$name = sub($name, "old", "new")': replace once (first match, if there are multiple matches), with support for regular expressions. Capture groups \1 through \9 in the new part are matched from (...) in the old part, and must be used within the same call to sub -- they don't persist for subsequent DSL statements. See also =~ and regextract. See also "Regular expressions" at https://miller.readthedocs.io. Examples: sub("ababab", "ab", "XY") gives "XYabab" sub("abc.def", ".", "X") gives "Xbc.def" sub("abc.def", "\.", "X") gives "abcXdef" sub("abcdefg", "[ce]", "X") gives "abXdefg" sub("prefix4529:suffix8567", "suffix([0-9]+)", "name\1") gives "prefix4529:name8567"
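The four replace functions (sub, gsub, ssub, gssub) form a two-by-two grid: regex versus literal, first match versus all matches. The Python sketch below reproduces the documented examples with re.sub and str.replace; Python's regex dialect is not identical to Miller's, so treat it as a model for these simple patterns only.

```python
import re

def gsub(s, pattern, repl):
    return re.sub(pattern, repl, s)            # regex, all matches

def sub(s, pattern, repl):
    return re.sub(pattern, repl, s, count=1)   # regex, first match only

def gssub(s, old, new):
    return s.replace(old, new)                 # literal, all matches

def ssub(s, old, new):
    return s.replace(old, new, 1)              # literal, first match only

print(gsub("abc.def", ".", "X"))   # XXXXXXX
print(sub("abc.def", ".", "X"))    # Xbc.def
print(gssub("ab.d.fg", ".", "X"))  # abXdXfg
print(ssub("abc.def", ".", "X"))   # abcXdef
print(gsub("prefix4529:suffix8567", r"(....ix)([0-9]+)", r"[\1 : \2]"))
# [prefix : 4529]:[suffix : 8567]
```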
substr¶
substr (class=string #args=3) substr is an alias for substr0. See also substr1. Miller is generally 1-up with all array and string indices, but, this is a backward-compatibility issue with Miller 5 and below. Arrays are new in Miller 6; the substr function is older.
substr0¶
substr0 (class=string #args=3) substr0(s,m,n) gives substring of s from 0-up position m to n inclusive. Negative indices -len .. -1 alias to 0 .. len-1. See also substr and substr1.
substr1¶
substr1 (class=string #args=3) substr1(s,m,n) gives substring of s from 1-up position m to n inclusive. Negative indices -len .. -1 alias to 1 .. len. See also substr and substr0.
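Since substr0 and substr1 both take inclusive end positions, they differ from Python slicing in two ways: the end index is included, and substr1 counts from 1. A Python sketch for in-range indices (out-of-range behavior is not modeled):

```python
def substr0(s, m, n):
    """0-up positions m..n inclusive; -len..-1 alias to 0..len-1."""
    length = len(s)
    if m < 0:
        m += length
    if n < 0:
        n += length
    return s[m:n + 1]

def substr1(s, m, n):
    """1-up positions m..n inclusive; -len..-1 alias to 1..len."""
    # Shift positive indices down by one; negative ones alias the same way.
    return substr0(s, m - 1 if m > 0 else m, n - 1 if n > 0 else n)

print(substr0("abcdefg", 1, 3))    # bcd
print(substr1("abcdefg", 1, 3))    # abc
print(substr0("abcdefg", -3, -1))  # efg
print(substr1("abcdefg", -3, -1))  # efg
```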
tolower¶
tolower (class=string #args=1) Convert string to lowercase.
toupper¶
toupper (class=string #args=1) Convert string to uppercase.
truncate¶
truncate (class=string #args=2) Truncates string first argument to max length of int second argument.
unformat¶
unformat (class=string #args=2) Using first argument as format string, unpacks second argument into an array of matches, with type-inference. On non-match, returns error -- use is_error() to check. Examples: unformat("{}:{}:{}", "1:2:3") gives [1, 2, 3]. unformat("{}h{}m{}s", "3h47m22s") gives [3, 47, 22]. is_error(unformat("{}h{}m{}s", "3:47:22")) gives true.
unformatx¶
unformatx (class=string #args=2) Same as unformat, but without type-inference. Examples: unformatx("{}:{}:{}", "1:2:3") gives ["1", "2", "3"]. unformatx("{}h{}m{}s", "3h47m22s") gives ["3", "47", "22"]. is_error(unformatx("{}h{}m{}s", "3:47:22")) gives true.
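One way to picture unformat and unformatx is as the reverse of format: turn each "{}" into a capture group and match the whole input. The Python sketch below does that with a regex. The infer helper is a hypothetical, simplified stand-in for Miller's type inference, and None stands in for Miller's error value.

```python
import re

def infer(text):
    """Rough type inference: int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def unformatx(fmt, text):
    # Each "{}" becomes a lazy capture group; literal parts are escaped.
    pattern = "(.*?)".join(re.escape(p) for p in fmt.split("{}"))
    m = re.fullmatch(pattern, text)
    return list(m.groups()) if m else None  # None models the error case

def unformat(fmt, text):
    fields = unformatx(fmt, text)
    return None if fields is None else [infer(f) for f in fields]

print(unformat("{}:{}:{}", "1:2:3"))        # [1, 2, 3]
print(unformat("{}h{}m{}s", "3h47m22s"))    # [3, 47, 22]
print(unformatx("{}h{}m{}s", "3h47m22s"))   # ['3', '47', '22']
print(unformat("{}h{}m{}s", "3:47:22"))     # None
```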
utf8_to_latin1¶
utf8_to_latin1 (class=string #args=1) Tries to convert UTF-8-encoded string to Latin-1-encoded string. If argument is array or map, recurses into it. Examples: $y = utf8_to_latin1($x) $* = utf8_to_latin1($*)
.¶
. (class=string #args=2) String concatenation. Non-strings are coerced, so you can do '"ax".98' etc.
System functions¶
exec¶
exec (class=system #args=variadic) '$output = exec( "command", ["arg1", "arg2"], {"env": ["ENV_VAR=ENV_VALUE", "ENV_VAR2=ENV_VALUE2"], "dir": "/tmp/run_command_here", "stdin_string": "this is input fed to program", "combined_output": true} )' Run a command via executable, path, args and environment, yielding its stdout minus final carriage return. Example: exec("echo", ["I don't do", "$SHELL things"], {"env": "SHELL=sh"}) outputs "I don't do $SHELL things"
hostname¶
hostname (class=system #args=0) Returns the hostname as a string.
os¶
os (class=system #args=0) Returns the operating-system name as a string.
system¶
system (class=system #args=1) Run command string, yielding its stdout minus final carriage return.
version¶
version (class=system #args=0) Returns the Miller version as a string.
Time functions¶
dhms2fsec¶
dhms2fsec (class=time #args=1) Recovers floating-point seconds as in dhms2fsec("5d18h53m20.250000s") = 500000.250000
dhms2sec¶
dhms2sec (class=time #args=1) Recovers integer seconds as in dhms2sec("5d18h53m20s") = 500000
fsec2dhms¶
fsec2dhms (class=time #args=1) Formats floating-point seconds as in fsec2dhms(500000.25) = "5d18h53m20.250000s"
fsec2hms¶
fsec2hms (class=time #args=1) Formats floating-point seconds as in fsec2hms(5000.25) = "01:23:20.250000"
gmt2localtime¶
gmt2localtime (class=time #args=1,2) Convert from a GMT-time string to a local-time string. Consults $TZ unless second argument is supplied. Examples: gmt2localtime("1999-12-31T22:00:00Z") = "2000-01-01 00:00:00" with TZ="Asia/Istanbul" gmt2localtime("1999-12-31T22:00:00Z", "Asia/Istanbul") = "2000-01-01 00:00:00"
gmt2sec¶
gmt2sec (class=time #args=1) Parses GMT timestamp as integer seconds since the epoch. Example: gmt2sec("2001-02-03T04:05:06Z") = 981173106
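The gmt2sec example value can be cross-checked outside Miller. This Python sketch, using only the standard datetime module, parses the same timestamp as UTC and converts it to integer epoch seconds. The name gmt2sec_sketch is hypothetical, and the sketch accepts only the exact YYYY-MM-DDTHH:MM:SSZ layout shown in the example.

```python
from datetime import datetime, timezone

def gmt2sec_sketch(stamp):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string as UTC; return integer epoch seconds."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

print(gmt2sec_sketch("2001-02-03T04:05:06Z"))  # 981173106
```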
hms2fsec¶
hms2fsec (class=time #args=1) Recovers floating-point seconds as in hms2fsec("01:23:20.250000") = 5000.250000
hms2sec¶
hms2sec (class=time #args=1) Recovers integer seconds as in hms2sec("01:23:20") = 5000
localtime2gmt¶
localtime2gmt (class=time #args=1,2) Convert from a local-time string to a GMT-time string. Consults $TZ unless second argument is supplied. Examples: localtime2gmt("2000-01-01 00:00:00") = "1999-12-31T22:00:00Z" with TZ="Asia/Istanbul" localtime2gmt("2000-01-01 00:00:00", "Asia/Istanbul") = "1999-12-31T22:00:00Z"
localtime2sec¶
localtime2sec (class=time #args=1,2) Parses local timestamp as integer seconds since the epoch. Consults $TZ environment variable, unless second argument is supplied. Examples: localtime2sec("2001-02-03 04:05:06") = 981165906 with TZ="Asia/Istanbul" localtime2sec("2001-02-03 04:05:06", "Asia/Istanbul") = 981165906
sec2dhms¶
sec2dhms (class=time #args=1) Formats integer seconds as in sec2dhms(500000) = "5d18h53m20s"
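The sec2dhms example is repeated division by 86400, 3600, and 60. A minimal Python sketch follows, with the hypothetical name sec2dhms_sketch. It assumes a non-negative input of at least one day, and zero-padding the smaller units to two digits is an assumption that the documented example does not exercise.

```python
def sec2dhms_sketch(seconds):
    """Format integer seconds (assumed >= 86400) as <d>d<hh>h<mm>m<ss>s."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    # Two-digit padding below is an assumption, not taken from the text above.
    return f"{days}d{hours:02d}h{minutes:02d}m{secs:02d}s"

print(sec2dhms_sketch(500000))  # 5d18h53m20s
```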
sec2gmt¶
sec2gmt (class=time #args=1,2) Formats seconds since epoch as GMT timestamp. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part. Examples: sec2gmt(1234567890) = "2009-02-13T23:31:30Z" sec2gmt(1234567890.123456) = "2009-02-13T23:31:30Z" sec2gmt(1234567890.123456, 6) = "2009-02-13T23:31:30.123456Z"
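To make the effect of the optional decimal-places argument concrete, here is a Python sketch of the documented sec2gmt behavior. The name sec2gmt_sketch is hypothetical. Truncating (rather than rounding) the fractional part is an assumption, and so is treating any non-int, non-float value as a "non-number".

```python
import math
from datetime import datetime, timezone

def sec2gmt_sketch(value, ndecimals=0):
    """Format epoch seconds as an ISO-8601 GMT timestamp with ndecimals
    fractional digits; return non-numbers unchanged."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return value  # "Leaves non-numbers as-is"
    whole = math.floor(value)
    stamp = datetime.fromtimestamp(whole, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    if ndecimals > 0:
        # Assumption: the fraction is truncated to ndecimals digits.
        frac_digits = int((value - whole) * 10**ndecimals)
        stamp += f".{frac_digits:0{ndecimals}d}"
    return stamp + "Z"

print(sec2gmt_sketch(1234567890))             # 2009-02-13T23:31:30Z
print(sec2gmt_sketch(1234567890.123456))      # 2009-02-13T23:31:30Z
print(sec2gmt_sketch(1234567890.123456, 6))   # 2009-02-13T23:31:30.123456Z
```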
sec2gmtdate¶
sec2gmtdate (class=time #args=1) Formats seconds since epoch (integer part) as GMT timestamp with year-month-date. Leaves non-numbers as-is. Example: sec2gmtdate(1440768801.7) = "2015-08-28".
sec2hms¶
sec2hms (class=time #args=1) Formats integer seconds as in sec2hms(5000) = "01:23:20"
sec2localdate¶
sec2localdate (class=time #args=1,2) Formats seconds since epoch (integer part) as local timestamp with year-month-date. Leaves non-numbers as-is. Consults $TZ environment variable unless second argument is supplied. Examples: sec2localdate(1440768801.7) = "2015-08-28" with TZ="Asia/Istanbul" sec2localdate(1440768801.7, "Asia/Istanbul") = "2015-08-28"
sec2localtime¶
sec2localtime (class=time #args=1,2,3) Formats seconds since epoch (integer part) as local timestamp. Consults $TZ environment variable unless third argument is supplied. Leaves non-numbers as-is. With second integer argument n, includes n decimal places for the seconds part. Examples: sec2localtime(1234567890) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul" sec2localtime(1234567890.123456) = "2009-02-14 01:31:30" with TZ="Asia/Istanbul" sec2localtime(1234567890.123456, 6) = "2009-02-14 01:31:30.123456" with TZ="Asia/Istanbul" sec2localtime(1234567890.123456, 6, "Asia/Istanbul") = "2009-02-14 01:31:30.123456"
strftime¶
strftime (class=time #args=2) Formats seconds since the epoch as timestamp. Format strings are as at https://pkg.go.dev/github.com/lestrrat-go/strftime, with the Miller-specific addition of "%1S" through "%9S" which format the seconds with 1 through 9 decimal places, respectively. ("%S" uses no decimal places.) See also https://miller.readthedocs.io/en/latest/reference-dsl-time/ for more information on the differences from the C library ("man strftime" on your system). See also strftime_local. Examples: strftime(1440768801.7,"%Y-%m-%dT%H:%M:%SZ") = "2015-08-28T13:33:21Z" strftime(1440768801.7,"%Y-%m-%dT%H:%M:%3SZ") = "2015-08-28T13:33:21.700Z"
strftime_local¶
strftime_local (class=time #args=2,3) Like strftime but consults the $TZ environment variable to get local time zone, unless third argument is supplied. Examples: strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%S %z") = "2015-08-28 16:33:21 +0300" with TZ="Asia/Istanbul" strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z") = "2015-08-28 16:33:21.700 +0300" with TZ="Asia/Istanbul" strftime_local(1440768801.7, "%Y-%m-%d %H:%M:%3S %z", "Asia/Istanbul") = "2015-08-28 16:33:21.700 +0300"
strptime¶
strptime (class=time #args=2) Parses timestamp as floating-point seconds since the epoch. See also strptime_local. Examples: strptime("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.000000 strptime("2015-08-28T13:33:21.345Z", "%Y-%m-%dT%H:%M:%SZ") = 1440768801.345000 strptime("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z") = 14400 strptime("1970-01-01 00:00:00 EET", "%Y-%m-%d %H:%M:%S %Z") = -7200
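The "%z" example above yields 14400 because midnight at UTC-04:00 is 04:00 UTC. This can be confirmed with Python's own datetime.strptime, used here only as an analogy. The name strptime_sketch is hypothetical, and Python's format codes do not match Miller's in every case (for instance, Python's "%S" does not accept fractional seconds).

```python
from datetime import datetime, timezone

def strptime_sketch(text, fmt):
    """Parse text with fmt and return float epoch seconds; timestamps that
    carry no UTC offset are treated as UTC."""
    parsed = datetime.strptime(text, fmt)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()

print(strptime_sketch("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ"))         # 1440768801.0
print(strptime_sketch("1970-01-01 00:00:00 -0400", "%Y-%m-%d %H:%M:%S %z"))  # 14400.0
```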
strptime_local¶
strptime_local (class=time #args=2,3) Like strptime but consults the $TZ environment variable to get local time zone, unless third argument is supplied. Examples: strptime_local("2015-08-28T13:33:21Z", "%Y-%m-%dT%H:%M:%SZ") = 1440758001 with TZ="Asia/Istanbul" strptime_local("2015-08-28T13:33:21.345Z","%Y-%m-%dT%H:%M:%SZ") = 1440758001.345 with TZ="Asia/Istanbul" strptime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S") = 1440758001 with TZ="Asia/Istanbul" strptime_local("2015-08-28 13:33:21", "%Y-%m-%d %H:%M:%S", "Asia/Istanbul") = 1440758001
systime¶
systime (class=time #args=0) Returns the system time in floating-point seconds since the epoch.
systimeint¶
systimeint (class=time #args=0) Returns the system time in integer seconds since the epoch.
uptime¶
uptime (class=time #args=0) Returns the time in floating-point seconds since the current Miller program was started.
Typing functions¶
asserting_absent¶
asserting_absent (class=typing #args=1) Aborts with an error if is_absent on the argument returns false, else returns its argument.
asserting_array¶
asserting_array (class=typing #args=1) Aborts with an error if is_array on the argument returns false, else returns its argument.
asserting_bool¶
asserting_bool (class=typing #args=1) Aborts with an error if is_bool on the argument returns false, else returns its argument.
asserting_boolean¶
asserting_boolean (class=typing #args=1) Aborts with an error if is_boolean on the argument returns false, else returns its argument.
asserting_empty¶
asserting_empty (class=typing #args=1) Aborts with an error if is_empty on the argument returns false, else returns its argument.
asserting_empty_map¶
asserting_empty_map (class=typing #args=1) Aborts with an error if is_empty_map on the argument returns false, else returns its argument.
asserting_error¶
asserting_error (class=typing #args=1) Aborts with an error if is_error on the argument returns false, else returns its argument.
asserting_float¶
asserting_float (class=typing #args=1) Aborts with an error if is_float on the argument returns false, else returns its argument.
asserting_int¶
asserting_int (class=typing #args=1) Aborts with an error if is_int on the argument returns false, else returns its argument.
asserting_map¶
asserting_map (class=typing #args=1) Aborts with an error if is_map on the argument returns false, else returns its argument.
asserting_nonempty_map¶
asserting_nonempty_map (class=typing #args=1) Aborts with an error if is_nonempty_map on the argument returns false, else returns its argument.
asserting_not_array¶
asserting_not_array (class=typing #args=1) Aborts with an error if is_not_array on the argument returns false, else returns its argument.
asserting_not_empty¶
asserting_not_empty (class=typing #args=1) Aborts with an error if is_not_empty on the argument returns false, else returns its argument.
asserting_not_map¶
asserting_not_map (class=typing #args=1) Aborts with an error if is_not_map on the argument returns false, else returns its argument.
asserting_not_null¶
asserting_not_null (class=typing #args=1) Aborts with an error if is_not_null on the argument returns false, else returns its argument.
asserting_null¶
asserting_null (class=typing #args=1) Aborts with an error if is_null on the argument returns false, else returns its argument.
asserting_numeric¶
asserting_numeric (class=typing #args=1) Aborts with an error if is_numeric on the argument returns false, else returns its argument.
asserting_present¶
asserting_present (class=typing #args=1) Aborts with an error if is_present on the argument returns false, else returns its argument.
asserting_string¶
asserting_string (class=typing #args=1) Aborts with an error if is_string on the argument returns false, else returns its argument.
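Every asserting_* function above follows one pattern: apply the matching is_* predicate, abort if it returns false, and otherwise pass the argument through unchanged. A minimal Python sketch of that pattern is below. The names make_asserting, is_int_sketch, and asserting_int are hypothetical, the sketch raises an exception where Miller aborts, and the error text is invented for illustration.

```python
def make_asserting(predicate, name):
    """Build a pass-through checker from a predicate (hypothetical helper)."""
    def checked(value):
        if not predicate(value):
            # Miller aborts the program here; the sketch raises instead.
            raise TypeError(f"{name} check failed for {value!r}")
        return value  # "else returns its argument"
    return checked

def is_int_sketch(value):
    """Stand-in predicate: true for ints, excluding booleans."""
    return isinstance(value, int) and not isinstance(value, bool)

asserting_int = make_asserting(is_int_sketch, "is_int")

print(asserting_int(17) + 1)  # the checked value is usable inline: prints 18
```

Because the argument is returned unchanged, a check of this shape can wrap any subexpression without restructuring the surrounding code.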
is_absent¶
is_absent (class=typing #args=1) False if field is present in input, true otherwise.
is_array¶
is_array (class=typing #args=1) True if argument is an array.
is_bool¶
is_bool (class=typing #args=1) True if field is present with boolean value. Synonymous with is_boolean.
is_boolean¶
is_boolean (class=typing #args=1) True if field is present with boolean value. Synonymous with is_bool.
is_empty¶
is_empty (class=typing #args=1) True if field is present in input with empty string value, false otherwise.
is_empty_map¶
is_empty_map (class=typing #args=1) True if argument is a map which is empty.
is_error¶
is_error (class=typing #args=1) True if argument is an error, such as taking string length of an integer.
is_float¶
is_float (class=typing #args=1) True if field is present with value inferred to be float.
is_int¶
is_int (class=typing #args=1) True if field is present with value inferred to be int.
is_map¶
is_map (class=typing #args=1) True if argument is a map.
is_nan¶
is_nan (class=typing #args=1) True if the argument is the NaN (not-a-number) floating-point value. Note that NaN has the property that NaN != NaN, so you need 'is_nan(x)' rather than 'x == NaN'.
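The NaN != NaN property is part of IEEE 754 floating-point arithmetic, not something specific to Miller, so it can be demonstrated in any language with IEEE floats. The Python sketch below uses the hypothetical name is_nan_sketch and the standard math.isnan function.

```python
import math

def is_nan_sketch(x):
    """True only for the floating-point NaN value."""
    return isinstance(x, float) and math.isnan(x)

nan = float("nan")
print(nan == nan)          # False: NaN never compares equal, even to itself
print(nan != nan)          # True
print(is_nan_sketch(nan))  # True: a dedicated predicate is needed
print(is_nan_sketch(1.5))  # False
```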
is_nonempty_map¶
is_nonempty_map (class=typing #args=1) True if argument is a map which is non-empty.
is_not_array¶
is_not_array (class=typing #args=1) True if argument is not an array.
is_not_empty¶
is_not_empty (class=typing #args=1) True if field is present in input with non-empty value, false otherwise.
is_not_map¶
is_not_map (class=typing #args=1) True if argument is not a map.
is_not_null¶
is_not_null (class=typing #args=1) False if argument is null (empty, absent, or JSON null), true otherwise.
is_null¶
is_null (class=typing #args=1) True if argument is null (empty, absent, or JSON null), false otherwise.
is_numeric¶
is_numeric (class=typing #args=1) True if field is present with value inferred to be int or float.
is_present¶
is_present (class=typing #args=1) True if field is present in input, false otherwise.
is_string¶
is_string (class=typing #args=1) True if field is present with string (including empty-string) value.
typeof¶
typeof (class=typing #args=1) Returns the type of the argument as a string (e.g. "str"). Intended for debugging.