Skip to content
Quick links:   Flags   Verbs   Functions   Glossary   Release docs

Questions about then-chaining

How do I examine then-chaining?

Then-chaining found in Miller is intended to function the same as Unix pipes, but with less keystroking. You can print your data one pipeline step at a time, to see what intermediate output at one step becomes the input to the next step.

First, look at the input data:

cat data/then-example.csv

Next, run the first step of your command, omitting anything from the first then onward:

mlr --from data/then-example.csv --c2p count-distinct -f Status,Payment_Type
Status  Payment_Type count
paid    cash         2
pending debit        1
pending credit       1
paid    debit        1

After that, run it with the next then step included:

mlr --from data/then-example.csv --c2p count-distinct -f Status,Payment_Type \
  then sort -nr count
Status  Payment_Type count
paid    cash         2
pending debit        1
pending credit       1
paid    debit        1

Now if you use then to include another verb after that, the columns Status, Payment_Type, and count will be the input to that verb.

Note, by the way, that you'll get the same results using pipes:

mlr --from data/then-example.csv --csv count-distinct -f Status,Payment_Type \
| mlr --c2p sort -nr count
Status  Payment_Type count
paid    cash         2
pending debit        1
pending credit       1
paid    debit        1

NR is not consecutive after then-chaining

Given this input data:

cat data/small

why don't I see NR=1 and NR=2 here??

mlr --from data/small filter '$x > 0.5' then put '$NR = NR'

The reason is that NR is computed for the original input records and isn't dynamically updated. By contrast, NF is dynamically updated: it's the number of fields in the current record, and if you add/remove a field, the value of NF will change:

echo x=1,y=2,z=3 | mlr put '$nf1 = NF; $u = 4; $nf2 = NF; unset $x,$y,$z; $nf3 = NF'

NR, by contrast (and FNR as well), retains the value from the original input stream, and records may be dropped by a filter within a then-chain. To recover consecutive record numbers, you can use out-of-stream variables as follows:

mlr --opprint --from data/small put '
  begin{ @nr1 = 0 }
  @nr1 += 1;
  $nr1 = @nr1
' \
then filter '$x>0.5' \
then put '
  begin{ @nr2 = 0 }
  @nr2 += 1;
  $nr2 = @nr2
a   b   i x        y        nr1 nr2
eks pan 2 0.758679 0.522151 2   1
wye pan 5 0.573288 0.863624 5   2

Or, simply use mlr cat -n:

mlr filter '$x > 0.5' then cat -n data/small
Back to top