In accord with the Unix philosophy, you can pipe data into or out of Miller. For example:
mlr cut --complement -f os_version *.dat | mlr sort -f hostname,uptime
You can, if you like, instead chain verbs together within a single command using the then keyword:
mlr cut --complement -f os_version then sort -f hostname,uptime *.dat
(You can precede the very first verb with then, if you like, for symmetry.)
Here's a performance comparison:
% cat piped.sh
mlr cut -x -f i,y data/big | mlr sort -n y > /dev/null

% time sh piped.sh
real 0m2.321s
user 0m4.878s
sys  0m1.564s

% cat chained.sh
mlr cut -x -f i,y then sort -n y data/big > /dev/null

% time sh chained.sh
real 0m2.070s
user 0m2.738s
sys  0m1.259s
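There are two reasons to use then-chaining. The first is performance, although I don't expect this to be a win in all cases. Then-chaining avoids redundant string parsing and string formatting at each pipeline step: input records are parsed once, fed through each pipeline stage in memory, and output records are formatted once. In the timings above, the chained form used about 2.7 seconds of user CPU time against about 4.9 seconds for the piped form, while wall-clock time improved only slightly, from about 2.3 to about 2.1 seconds.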
The other reason to use then-chaining is simplicity: you don't have to re-type formatting flags (e.g. --csv --fs tab) at every pipeline stage.
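To make the simplicity point concrete, here is a minimal sketch that runs the same two-verb job both ways on a tab-separated file. The file hosts.tsv and its contents are hypothetical sample data invented for illustration. The sketch assumes mlr is on your PATH and skips the work if it is not.

```shell
# Only run if Miller is installed.
if command -v mlr >/dev/null 2>&1; then
  # Hypothetical sample data: a small tab-separated file with a header line.
  tmp=$(mktemp -d)
  printf 'hostname\tuptime\tos_version\n'  > "$tmp/hosts.tsv"
  printf 'web2\t40\t12.1\n'               >> "$tmp/hosts.tsv"
  printf 'db1\t300\t11.4\n'               >> "$tmp/hosts.tsv"
  printf 'web1\t7\t12.1\n'                >> "$tmp/hosts.tsv"

  # Piped: two separate mlr processes, so the format flags
  # have to be repeated at each stage.
  piped=$(mlr --csv --fs tab cut -x -f os_version "$tmp/hosts.tsv" \
            | mlr --csv --fs tab sort -f hostname)

  # Chained: a single mlr process, so the format flags appear once.
  chained=$(mlr --csv --fs tab cut -x -f os_version then sort -f hostname "$tmp/hosts.tsv")

  # Print the chained result; the piped result should be identical.
  printf '%s\n' "$chained"
  rm -rf "$tmp"
fi
```

Both forms should produce the same records. The chained form states the I/O format once, so there is no risk of the stages disagreeing about separators.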