Oops, somehow I didn't see your messages until now. Very interesting read, thanks! While I very much appreciate a more formal lens, I thought of "size" very informally here: something like the number of primitives (var, app, abs) used.
Using your binary encoding I get:
Iota = 00 01 01 10 00 00 00 01 01 1110 10 01 110 10 00 00 110 (38 bits)
"Ioter" = 00 00 00 01 01 01 1110 00 00 110 10 01 110 10 (32 bits)
Hope I didn't mess up.
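For anyone who wants to double-check the counts, here is a minimal sketch. It assumes the usual binary lambda calculus convention (abs = 00, app = 01, variable with 1-based de Bruijn index n = n ones followed by a zero). The helper names (`var`, `lam`, `app`, `blc`, `size`) are made up for illustration, and the "Ioter" term is simply what the bit string above decodes to under that convention.

```python
# Terms are nested tuples with 1-based de Bruijn indices:
#   ("var", n), ("abs", body), ("app", f, a)

def var(n):
    return ("var", n)

def lam(body):
    return ("abs", body)

def app(f, *args):
    # Left-associative application: app(f, a, b) means (f a) b.
    for a in args:
        f = ("app", f, a)
    return f

def blc(t):
    """Assumed encoding: abs = 00, app = 01, var n = '1' * n + '0'."""
    if t[0] == "var":
        return "1" * t[1] + "0"
    if t[0] == "abs":
        return "00" + blc(t[1])
    return "01" + blc(t[1]) + blc(t[2])

def size(t):
    """Informal size: number of var/app/abs nodes."""
    if t[0] == "var":
        return 1
    if t[0] == "abs":
        return 1 + size(t[1])
    return 1 + size(t[1]) + size(t[2])

S = lam(lam(lam(app(var(3), var(1), app(var(2), var(1))))))  # \x \y \z x z (y z)
K = lam(lam(var(2)))                                         # \x \y x
IOTA = lam(app(var(1), S, K))                                # \x x S K
# Decoded from the "Ioter" bit string: \x \y \z x K z (y z)
IOTER = lam(lam(lam(app(var(3), K, var(1), app(var(2), var(1))))))

for name, term in [("Iota", IOTA), ("Ioter", IOTER)]:
    print(name, len(blc(term)), "bits,", size(term), "primitives")
# Iota 38 bits, 17 primitives
# Ioter 32 bits, 14 primitives
```

Under these assumptions both bit counts match the ones above, and the informal primitive count ranks the two terms the same way.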
It's way harder to judge how short (in number of applications) "realistic" programs (say, a primality check) could be written using Iota vs. "Ioter". I'd bet on Iota, but maybe that's just human bias, since it's so closely related to SKI, which is easy to reason about and compose.
A while ago I thought about minimalism in a sense other than one-point bases. One could say that
B = \x \y \z x (y z) (composition)
T = \x \y y x (reordering)
E = \x \y y (elimination)
M = \x x x (duplication)
are more "boiled down" versions of BCKW. It would be hard to convince a primary schooler who understands term rewriting that BCKW is so much handier than BTEM. You only figure out why once you try writing something useful. But it's sad that a smaller basis (by any encoding I can think of) makes life harder.
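To put a rough number on "smaller", here is a sketch that sums the encoded length of each basis. It assumes the binary lambda calculus convention (abs = 00, app = 01, variable n = n ones then a zero) and the standard definitions C = \x \y \z x z y, K = \x \y x, W = \x \y x y y, which are not spelled out above. The helper names are hypothetical.

```python
def blc(t):
    # t is an int (1-based de Bruijn index), ("lam", body), or ("app", f, a).
    if isinstance(t, int):
        return "1" * t + "0"
    if t[0] == "lam":
        return "00" + blc(t[1])
    return "01" + blc(t[1]) + blc(t[2])

def lam(n, body):
    # Wrap body in n nested abstractions.
    for _ in range(n):
        body = ("lam", body)
    return body

def app(f, *args):
    # Left-associative application.
    for a in args:
        f = ("app", f, a)
    return f

BTEM = {
    "B": lam(3, app(3, app(2, 1))),  # \x \y \z x (y z)
    "T": lam(2, app(1, 2)),          # \x \y y x
    "E": lam(2, 1),                  # \x \y y
    "M": lam(1, app(1, 1)),          # \x x x
}
BCKW = {
    "B": lam(3, app(3, app(2, 1))),  # \x \y \z x (y z)
    "C": lam(3, app(3, 1, 2)),       # \x \y \z x z y (assumed standard definition)
    "K": lam(2, 2),                  # \x \y x
    "W": lam(2, app(2, 1, 1)),       # \x \y x y y (assumed standard definition)
}

def total_bits(basis):
    return sum(len(blc(t)) for t in basis.values())

print("BTEM:", total_bits(BTEM), "bits")  # BTEM: 44 bits
print("BCKW:", total_bits(BCKW), "bits")  # BCKW: 60 bits
```

This only measures the size of the basis itself. It says nothing about how long programs written in each basis end up being, which is exactly where BCKW seems to win.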