Optimizing Opal output for size

Opal doesn't output the smallest code possible - that's not our goal. We want to output readable ES5 code and we have tools: JS minifiers (Terser and Google Closure Compiler), and tree shaking utilities (opal-optimizer) to bring the code size down.

JavaScript and Ruby certainly have some different semantics. Some things work similarly (like open classes), but some others don't - and those that don't require some boilerplate code. That not only reduces performance, but also increases load times, both crucial for JavaScript code.

In this article we will focus on Terser in particular, since it's the most widely used tool for Opal post-processing. Can Terser find every nook and cranny and optimize the resulting code to the minimal JavaScript version possible? Unfortunately not. It has no knowledge of which statements can produce side effects and which can't, so it only performs transformations that are guaranteed to be semantically equivalent. But while compiling, we may know something more.

So I attempted an exercise to reduce the size of the compiled JavaScript code. As a benchmark I took a real-world library, Asciidoctor (available in its Opal-compiled version as Asciidoctor.js), which is the one we already use to test for performance changes in the Opal CI.

I took a code golfing approach, with the exception that I wanted to produce readable code (Terser will take care of uglifying it), and set out to generate the smallest possible JavaScript output for Asciidoctor by improving the Opal compiler. The main goal is to reduce code size, not to increase performance. As I will sum up later, performance improves as well, but to a lesser extent. This exercise took me about 4 work days.

Those improvements will most probably land in the Opal 1.4 release, to happen in late December, along with Ruby 3.1 support. But for now, let me take you on a long journey during which you may pick up a couple of JavaScript optimization tricks and learn a bit about how Opal compiles Ruby to JavaScript.

Step 1. Do we need self?

In Opal, we always alias this to self at the beginning of method definitions with var self = this. But consider the following code: do we really need to define self in that case?

def loop
  while true
    yield
  end
end

And so, if a function doesn't reference self in any way (x-strings being a special case), let's not compile it in. So, what gains do we get for Asciidoctor?

Let's run bin/rake performance and find out:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.239 -> 6.123 (change: -1.86%)
                      Run time: 0.285 -> 0.284 (change: -0.10%)
                   Bundle size: 5257437 -> 5236087 (change: -0.41%)
          Minified bundle size: 1264503 -> 1254455 (change: -0.79%)

Not much, but at least it's something. Take note - we can't reliably compare the first two performance metrics, and the gains will differ for other software. The minified bundle size is the one created with terser -c. Do also note that all the following outputs of this kind refer to the entire patchset up to that step, as compared to Opal v1.3.2.

Step 2. Optimize methods that accept blocks

Ok, now for a very simple function:

def a(&block)
end

You may wonder, what does this function compile to?

  return (Opal.def(self, '$a', $a$1 = function $$a() {
    var $iter = $a$1.$$p, block = $iter || nil, self = this;

    if ($iter) $a$1.$$p = null;


    if ($iter) $a$1.$$p = null;;
    return nil;
  }, $a$1.$$arity = 0), nil) && 'a'

You can see $a$1, which is a reference to the method body, on which the block is attached as $$p when the method is called.

Hm, not so good. One statement is duplicated, and there are too many variable declarations. We will focus quite a lot on this method, optimizing it step-by-step in the following passes.

So, after some tinkering, this is the result:

  return (Opal.def(self, '$a', $a$1 = function $$a() {
    var block = $a$1.$$p || nil;

    if (block) $a$1.$$p = null;

    ;
    return nil;
  }, $a$1.$$arity = 0), nil) && 'a'

Much better, right? How about the numbers?

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.225 -> 6.074 (change: -2.43%)
                      Run time: 0.285 -> 0.283 (change: -0.68%)
                   Bundle size: 5257437 -> 5220199 (change: -0.71%)
          Minified bundle size: 1264503 -> 1244281 (change: -1.60%)

Okay! That's not bad!

Step 3. nil && 'a'?

Older versions of Ruby returned nil from a method definition, but later versions return Symbols (i.e. Strings in Opal). So why do we actually need this nil? Let's remove it:

  return (Opal.def(self, '$a', $a$1 = function $$a() {
    var block = $a$1.$$p || nil;

    if (block) $a$1.$$p = null;

    ;
    return nil;
  }, $a$1.$$arity = 0), 'a')

And the numbers:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.221 -> 6.097 (change: -1.98%)
                      Run time: 0.284 -> 0.283 (change: -0.21%)
                   Bundle size: 5257437 -> 5218997 (change: -0.73%)
          Minified bundle size: 1264503 -> 1243880 (change: -1.63%)

That's not much, even though Opal has a lot of method definitions. But let's go ahead.

Step 4. Helperize Opal.def and Opal.defs

What does "helperize" mean? Well - in the Opal compiler we can issue a helper :def statement to generate a per-file top-level statement that does var $def = Opal.def. That looks like more code, right? But most files have more than one use of Opal.def, often even a lot of them. And Terser can't reliably rename Opal.def to something shorter, but $def can safely become S or whatever it decides. So, our method (now with a broader context) will produce this:

  var $a$1, self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $def = Opal.def;

  return ($def(self, '$a', $a$1 = function $$a() {
    var block = $a$1.$$p || nil;

    if (block) $a$1.$$p = null;

    ;
    return nil;
  }, $a$1.$$arity = 0), 'a')

Is that much? Numbers:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.232 -> 6.103 (change: -2.08%)
                      Run time: 0.283 -> 0.283 (change: -0.19%)
                   Bundle size: 5257437 -> 5214022 (change: -0.83%)
          Minified bundle size: 1264503 -> 1238407 (change: -2.06%)

Yes. It's quite a lot.
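The effect can be reproduced outside Opal. The following is only a minimal sketch: Runtime is a hypothetical stand-in for the Opal runtime object and models nothing but a simplified def. The point is that a minifier has to leave the property name in Runtime.def alone, while the local $def alias can be renamed freely.

```javascript
// Hypothetical stand-in for the Opal runtime object; only `def` is modelled.
var Runtime = {
  def: function (target, name, body) {
    target[name] = body;
    return name;
  }
};

// "Helperized" form: one alias per file, then short local calls.
// A minifier may rename `$def`, but not the `.def` property above.
var $def = Runtime.def;

var obj = {};
$def(obj, '$a', function $$a() { return 1; });
$def(obj, '$b', function $$b() { return 2; });

console.log(obj.$a() + obj.$b()); // 3
```

The alias costs one declaration per file, so it only pays off when the helper is used more than once, which is the common case described above.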

Step 5. Optimize slice and splice calls.

We use those calls a lot for extracting Ruby rest arguments from the arguments array-like object in JavaScript, in methods like def a(arg, *restargs). Opal.slice is short for Array.prototype.slice.

Before this step, we used to output this:

    $post_args = Opal.slice.call(arguments, 0, arguments.length);

Now we output this, which is equivalent:

    $post_args = Opal.slice.call(arguments);

Numbers:

 Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.213 -> 6.107 (change: -1.71%)
                      Run time: 0.285 -> 0.282 (change: -0.81%)
                   Bundle size: 5257437 -> 5209571 (change: -0.91%)
          Minified bundle size: 1264503 -> 1234386 (change: -2.38%)

So, we are accelerating.
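The equivalence of the two forms is easy to check in plain JavaScript. In this sketch the function names are made up for illustration, and slice refers directly to Array.prototype.slice, which is what the text says Opal.slice is short for.

```javascript
var slice = Array.prototype.slice;

// Old output style: explicit start and end.
function explicitBounds() {
  return slice.call(arguments, 0, arguments.length);
}

// New output style: slice defaults to the whole array-like object.
function defaultBounds() {
  return slice.call(arguments);
}

console.log(explicitBounds(1, 2, 3)); // [ 1, 2, 3 ]
console.log(defaultBounds(1, 2, 3));  // [ 1, 2, 3 ]
```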

Step 6. Optimize $$arity and function variables.

Back to our def a(&block) empty method. Can we optimize this part even further?

  }, $a$1.$$arity = 0), 'a')

Also, why do we need this, isn't it wasteful? (do note that we also need to do var $a$1)

  return ($def(self, '$a', $a$1 = function $$a() {

Let's optimize those things out:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.1 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $def = Opal.def;

  return ($def(self, '$a', function $$a() {
    var block = $$a.$$p || nil;

    if (block) $$a.$$p = null;

    ;
    return nil;
  }, 0), 'a')
});

Note the 0 argument. This is a shorthand for {$$arity: 0}. We can't do this optimization for a minority of methods that need additional properties set. So, what are our gains now?

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.249 -> 6.058 (change: -3.05%)
                      Run time: 0.286 -> 0.279 (change: -2.17%)
                   Bundle size: 5257437 -> 5036072 (change: -4.21%)
          Minified bundle size: 1264503 -> 1089609 (change: -13.83%)

That's a serious optimization now, isn't it? Files almost 14% smaller! But let's not stop here - maybe we can get even smaller files!
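Two JavaScript features make this step possible: a named function expression can refer to itself by its own name, and the arity can travel as a plain argument. The sketch below illustrates both with a hypothetical, simplified def; the real Opal.def does considerably more, and the handling of the fourth argument here is an assumption based only on the shorthand described above.

```javascript
// Hypothetical, simplified `def`: a numeric 4th argument is treated as
// shorthand for { $$arity: n }; an object is copied property by property.
function def(target, name, body, meta) {
  if (typeof meta === 'number') {
    body.$$arity = meta;
  } else if (meta) {
    for (var key in meta) {
      if (Object.prototype.hasOwnProperty.call(meta, key)) body[key] = meta[key];
    }
  }
  target[name] = body;
}

var obj = {};
def(obj, '$a', function $$a() {
  // `$$a` is the function expression's own name, visible inside its body,
  // so no outer `$a$1` variable is needed to reach properties set on it.
  return $$a.$$arity;
}, 0);

console.log(obj.$a()); // 0
```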

Step 7. Delete $$p

Why do we use things like $$a.$$p? Well - you may not know, if you aren't too familiar with Opal. This is how we pass a block argument, by setting a $$p property on a called function. After a call, we unset it. But this statement: if (block) $$a.$$p = null;... why can't we just delete $$a.$$p;? Do we lose some performance then? Perhaps - but not noticeably. And we gain a lot of space. So, our def a(&block) method compiled now looks like this:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.1 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $def = Opal.def;

  return ($def(self, '$a', function $$a() {
    var block = $$a.$$p || nil;

    delete $$a.$$p;

    ;
    return nil;
  }, 0), 'a')
});

And the numbers aren't much better, but cumulatively they are better:

 Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.239 -> 6.057 (change: -2.92%)
                      Run time: 0.286 -> 0.280 (change: -2.03%)
                   Bundle size: 5257437 -> 5032881 (change: -4.27%)
          Minified bundle size: 1264503 -> 1087119 (change: -14.03%)
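To make the block-passing protocol concrete, here is a minimal sketch of both sides of a call. The method name $$each_twice and the nil stand-in are made up for illustration; only the $$p convention itself comes from the text above.

```javascript
var nil = {}; // stand-in for Opal.nil

function $$each_twice() {
  var block = $$each_twice.$$p || nil;
  delete $$each_twice.$$p; // clear the slot so a later call sees no stale block

  if (block === nil) return 'no block';
  return [block(1), block(2)];
}

// Caller side: attach the block as `$$p`, then call the method.
$$each_twice.$$p = function (x) { return x * 10; };
var withBlock = $$each_twice();

// A second call without a block must not see the previous one.
var withoutBlock = $$each_twice();

console.log(withBlock);    // [ 10, 20 ]
console.log(withoutBlock); // no block
```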

Step 8. Let's torture our method a little bit more

But why can't Opal.def itself return 'a'? If it does, we would be able to get to this form:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.1 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $def = Opal.def;

  return $def(self, '$a', function $$a() {
    var block = $$a.$$p || nil;

    delete $$a.$$p;

    ;
    return nil;
  }, 0)
});

We are getting much closer to plain JavaScript now.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.253 -> 6.145 (change: -1.73%)
                      Run time: 0.285 -> 0.280 (change: -1.61%)
                   Bundle size: 5257437 -> 5028381 (change: -4.36%)
          Minified bundle size: 1264503 -> 1086206 (change: -14.10%)

Step 9. $$($nesting, 'Opal')['$coerce_to!'](self.$a(), self.$b(), self.$c()) what?

Oh, of course. It's our representation of:

Opal.coerce_to!(a,b,c)

Can we make it into:

var $Opal = Opal.Opal;
$Opal['$coerce_to!'](self.$a(), self.$b(), self.$c())

Those calls happen a lot in our corelib (let's say that the corelib is those parts of Ruby we don't have to require, while the stdlib is those libraries that are provided with Ruby but that we have to require, like 'json'. We will now focus a lot on our corelib). What's the result then?

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.258 -> 6.068 (change: -3.04%)
                      Run time: 0.287 -> 0.280 (change: -2.47%)
                   Bundle size: 5257437 -> 5041275 (change: -4.11%)
          Minified bundle size: 1264503 -> 1083267 (change: -14.33%)

We also similarly optimized access to a few more constants, like Kernel, Object and BasicObject.

Also in this step we renamed Opal.defineProperty to just Opal.prop. Not much improvement on its own though.

Step 10. Top-level constant access optimization

What does ::String compile into? Of course into $$$("::", "String"). Why? "::" is a special value here. If we wrote something like ::A::B, we would get $$$($$$("::", "A"), "B").

But why can't it become just... $$$("String")? It can. And our numbers now are:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.269 -> 6.195 (change: -1.18%)
                      Run time: 0.287 -> 0.279 (change: -2.75%)
                   Bundle size: 5257437 -> 5020671 (change: -4.50%)
          Minified bundle size: 1264503 -> 1077085 (change: -14.82%)

Not too much, but... we also needed to replace a lot of calls in the corelib from String to ::String. This will follow in the next steps.

Step 11. Empty classes and modules.

Do empty classes happen a lot? Well - they do. Mostly when you define exceptions. Like we do:

class StandardError     < ::Exception; end
class EncodingError       < ::StandardError; end
class ZeroDivisionError   < ::StandardError; end
class NameError           < ::StandardError; end
class NoMethodError         < ::NameError; end
class RuntimeError        < ::StandardError; end
class FrozenError           < ::RuntimeError; end
class LocalJumpError      < ::StandardError; end
class TypeError           < ::StandardError; end
class ArgumentError       < ::StandardError; end
class UncaughtThrowError    < ::ArgumentError; end
class IndexError          < ::StandardError; end

The indentation denotes a hierarchy :)

This used to compile to this:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.1 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $klass = Opal.klass;


  (function($base, $super, $parent_nesting) {
    var self = $klass($base, $super, 'StandardError');

    var $nesting = [self].concat($parent_nesting);

    return nil
  })($nesting[0], $$$('::', 'Exception'), $nesting);
  (function($base, $super, $parent_nesting) {
    var self = $klass($base, $super, 'EncodingError');

    var $nesting = [self].concat($parent_nesting);

    return nil
  })($nesting[0], $$$('::', 'StandardError'), $nesting);
  (function($base, $super, $parent_nesting) {
    var self = $klass($base, $super, 'ZeroDivisionError');

    var $nesting = [self].concat($parent_nesting);

    return nil
  })($nesting[0], $$$('::', 'StandardError'), $nesting);
(...yeah and it goes like this...)

The closure is kind of... unneeded, isn't it? Let's make it disappear for this particular special situation:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.1 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $klass = Opal.klass;


  $klass($nesting[0], $$$('Exception'), 'StandardError');
  $klass($nesting[0], $$$('StandardError'), 'EncodingError');
  $klass($nesting[0], $$$('StandardError'), 'ZeroDivisionError');
  $klass($nesting[0], $$$('StandardError'), 'NameError');
  $klass($nesting[0], $$$('NameError'), 'NoMethodError');
  $klass($nesting[0], $$$('StandardError'), 'RuntimeError');
  $klass($nesting[0], $$$('RuntimeError'), 'FrozenError');
  $klass($nesting[0], $$$('StandardError'), 'LocalJumpError');
  $klass($nesting[0], $$$('StandardError'), 'TypeError');
  $klass($nesting[0], $$$('StandardError'), 'ArgumentError');
  $klass($nesting[0], $$$('ArgumentError'), 'UncaughtThrowError');
  return ($klass($nesting[0], $$$('StandardError'), 'IndexError'), nil);
});

Much better! And numbers?

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.202 -> 6.022 (change: -2.90%)
                      Run time: 0.285 -> 0.276 (change: -3.02%)
                   Bundle size: 5257437 -> 5011915 (change: -4.67%)
          Minified bundle size: 1264503 -> 1072799 (change: -15.16%)

Yay! 15%!

Step 12. unless becoming else?

Ok - let's take this expression:

true unless false

What will it compile to?

  if ($truthy(false)) {
  } else {
    true
  };

Well - makes some sense. Oh, you may ask: do we need this $truthy call here? Well - in this particular example we don't - but in general, JavaScript and Ruby have different truthiness semantics. In JavaScript "" and 0 are falsy, while in Ruby they are truthy. And our nil is truthy in JavaScript (yeah - our nil is not JS null), while in Ruby it must be falsy. But why emit both an if and an else branch? Let's do it better:

  if (!$truthy(false)) {
    true
  };

And the numbers are:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.164 -> 6.048 (change: -1.87%)
                      Run time: 0.284 -> 0.277 (change: -2.62%)
                   Bundle size: 5257437 -> 4994938 (change: -4.99%)
          Minified bundle size: 1264503 -> 1072746 (change: -15.16%)

Not much better in the minified bundle size - Terser did a good job here. But in the following it didn't...
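The truthiness mismatch mentioned in this step can be sketched in a few lines. This $truthy is a simplified, hypothetical version written for illustration, not the helper Opal ships, and nil is a plain object standing in for Opal's nil.

```javascript
var nil = {}; // stand-in for Opal's nil object; any JS object is truthy in JS

// Simplified helper: in Ruby only nil and false are falsy.
// `value != null` also rejects JS null and undefined.
function $truthy(value) {
  return value !== nil && value !== false && value != null;
}

console.log($truthy(''));  // true  (Ruby: "" is truthy)
console.log($truthy(0));   // true  (Ruby: 0 is truthy)
console.log($truthy(nil)); // false (Ruby: nil is falsy)
console.log(Boolean(nil)); // true  (plain JS would get this wrong)
```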

Step 13. Let's get out of the closure hell

a || b || c || d || e

Seems simple, right? Can we compile it to the same thing in JS? Oh well, we can't... we have different truthiness semantics, as mentioned above. And what if there's a next call? Did you know that a || continue is invalid JavaScript? So... this code compiles to the following monster:

  if ($truthy(($ret_or_1 = (function() {if ($truthy(($ret_or_2 = (function() {if ($truthy(($ret_or_3 = (function() {if ($truthy(($ret_or_4 = self.$a()))) {
    return $ret_or_4
  } else {
    return self.$b()
  }; return nil; })()))) {
    return $ret_or_3
  } else {
    return self.$c()
  }; return nil; })()))) {
    return $ret_or_2
  } else {
    return self.$d()
  }; return nil; })()))) {
    return $ret_or_1
  } else {
    return self.$e()
  }

Well. That's a lot of functions. And expressions like next don't happen here. Can't we at least use a ternary operator where we can:

  if ($truthy(($ret_or_1 = ($truthy(($ret_or_2 = ($truthy(($ret_or_3 = ($truthy(($ret_or_4 = self.$a())) ? ($ret_or_4) : (self.$b())))) ? ($ret_or_3) : (self.$c())))) ? ($ret_or_2) : (self.$d()))))) {
    return $ret_or_1
  } else {
    return self.$e()
  }

This is still ugly. But we don't abuse the functions. Numbers:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.269 -> 6.198 (change: -1.13%)
                      Run time: 0.286 -> 0.273 (change: -4.44%)
                   Bundle size: 5257437 -> 4868356 (change: -7.40%)
          Minified bundle size: 1264503 -> 1069576 (change: -15.42%)

We improved the performance quite a bit. And the un-Tersered code size - but Terser also gained a bit. We lost a bit of compiler performance though.
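A reduced version of the ternary pattern shows why the temporary variable is needed and how the result differs from a plain JavaScript ||. As before, $truthy and nil are simplified stand-ins assumed for this sketch, and a and b are made-up functions.

```javascript
var nil = {};
function $truthy(v) { return v !== nil && v !== false && v != null; }

var calls = 0;
function a() { calls += 1; return 0; } // 0 is truthy in Ruby, falsy in JS
function b() { return 'b'; }

// Ruby-style `a || b`: evaluate a() once, keep it in a temporary, test it.
var $ret_or_1;
var rubyOr = $truthy($ret_or_1 = a()) ? $ret_or_1 : b();
var callsForRubyOr = calls;

// Plain JS `||` falls through to b() because 0 is falsy in JS.
var jsOr = a() || b();

console.log(rubyOr);         // 0
console.log(callsForRubyOr); // 1
console.log(jsOr);           // b
```

Because the ternary is an expression, no wrapping function is needed; the closures remain necessary only when a branch contains a statement such as next.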

Step 14. Various small optimizations

Opal doesn't support mutable strings (we have a plan to support them in the near future!) - and so we alert developers if they try to use the mutating methods. But that takes a lot of method definitions. Let's compress them with a define_method loop.

We also sometimes compile empty files - called stubbing - so we can for example make require "yaml" not fail - even though we don't use YAML, but some compiled-in method does. Let's make those compiled files smaller.

Also, eval in JavaScript is considered harmful. Let's at least use a different facility to support Ruby instance_eval.

Result:

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.071 -> 5.965 (change: -1.76%)
                      Run time: 0.285 -> 0.271 (change: -4.75%)
                   Bundle size: 5259054 -> 4856724 (change: -7.65%)
          Minified bundle size: 1264953 -> 1067161 (change: -15.64%)

Not much, but it still gives us some headroom.

Step 15. || strikes again

Some libraries (like parser) happen to use || a lot. For each usage, we generate a new $ret_or_X variable (where X > 0), so that we can save the left-hand-side expression and possibly return it later. We don't reuse these variables, so we get a very large var $ret_or_1, $ret_or_2, $ret_or_3 ... $ret_or_42; definition. Let's reuse them.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.110 -> 5.972 (change: -2.25%)
                      Run time: 0.286 -> 0.271 (change: -5.20%)
                   Bundle size: 5259054 -> 4826837 (change: -8.22%)
          Minified bundle size: 1264953 -> 1058462 (change: -16.32%)

A nice improvement!

Step 16. More helperizing

In the compiled Asciidoctor I found a lot of dynamic regexps, which we define with Opal.regexp([a,b,c]). Let's make it just $regexp([a,b,c]), and let's shorten a lot of other definitions in the same way. At this point I noticed that we weren't running Terser with name mangling enabled. Let's change that right now. The numbers are compared to Opal 1.3.2.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.089 -> 6.006 (change: -1.36%)
                      Run time: 0.285 -> 0.269 (change: -5.39%)
                   Bundle size: 5259054 -> 4824589 (change: -8.26%)
          Minified bundle size: 1264953 -> 1054974 (change: -16.60%)
            Mangled & minified: 812275 -> 732066 (change: -9.87%)

That's fair enough.

Step 17. Optimize instance variable access

We set @variables to nil by default if they are referenced, for two reasons. The first one is obvious: @variable is compiled to self.variable and we don't want undefined values to creep in. undefined is not an object, and in Ruby everything is an object. We want to keep that impression, so in Opal undefined doesn't exist (if you get undefined somewhere, you have hit a bug or accessed some low-level interface). The second is to give objects a stable shape, so that JS engines can optimize the code better.

The problem is, that the code looks like this:

self.$$prototype.variable1 = self.$$prototype.variable2 = self.$$prototype.variable3 = self.$$prototype.variable4 = nil

Why not make it:

var $proto = self.$$prototype;
$proto.variable1 = $proto.variable2 = $proto.variable3 = $proto.variable4 = nil

Remember - $proto can be safely renamed. self.$$prototype can't.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.086 -> 6.014 (change: -1.19%)
                      Run time: 0.284 -> 0.269 (change: -5.28%)
                   Bundle size: 5259054 -> 4823880 (change: -8.27%)
          Minified bundle size: 1264953 -> 1053776 (change: -16.69%)
            Mangled & minified: 812275 -> 730112 (change: -10.12%)

Step 18. #method_missing stubs definition optimization

How does #method_missing work in Opal? In JavaScript there's no facility for that. Well - we define so-called stubs, which means that for every method name you call, we define a method on BasicObject that basically calls #method_missing. This way no method is ever missing and all calls succeed. If you use a call like #send, we have an easier job, but we don't want to use #send everywhere for performance reasons.

The stubs used to be defined this way, for every file:

Opal.add_stubs(["$hello", "$new", "$<"]);

Let's make it shorter:

Opal.add_stubs("hello,new,<");

This also helps the JS parsers. This is how Google Closure Compiler optimizes large arrays of Strings.

 Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.088 -> 6.008 (change: -1.30%)
                      Run time: 0.285 -> 0.271 (change: -5.08%)
                   Bundle size: 5259054 -> 4812284 (change: -8.50%)
          Minified bundle size: 1264953 -> 1044964 (change: -17.39%)
            Mangled & minified: 812275 -> 721296 (change: -11.20%)
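The stub mechanism can be illustrated with a small model. Everything here is hypothetical: BasicObject is a bare constructor, addStubs is a made-up function that only mimics the comma-separated format shown above, and the real Opal.add_stubs is more involved.

```javascript
// Hypothetical miniature of the stub mechanism; names are illustrative only.
function BasicObject() {}

BasicObject.prototype.$method_missing = function (name) {
  var args = Array.prototype.slice.call(arguments, 1);
  return 'missing: ' + name + '(' + args.join(', ') + ')';
};

// Accepts the compact form: one comma-separated string, no `$` prefixes.
function addStubs(list) {
  list.split(',').forEach(function (name) {
    var jsName = '$' + name;
    if (jsName in BasicObject.prototype) return; // keep real definitions
    BasicObject.prototype[jsName] = function () {
      var args = Array.prototype.slice.call(arguments);
      return this.$method_missing.apply(this, [name].concat(args));
    };
  });
}

addStubs('hello,new,<');

var obj = new BasicObject();
console.log(obj.$hello(1, 2)); // missing: hello(1, 2)
console.log(obj['$<'](3));     // missing: <(3)
```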

Step 19. Hiding $$ and $$$.

What $$$ is, I explained in one of the earlier steps. But I haven't explained $$ yet. It is the relative constant access function. It is a bit less performant, because we have to iterate through every class and module we are in and their ancestors - and Object and its ancestors as well. We store the list of enclosing modules and classes in a $nesting variable, and then we can call $$($nesting, "String") to find our String - because maybe it is an Array::String? Well, we know it isn't, so we had to change a lot more of our corelib. And then - suddenly - some files don't need $$ at all, so we don't need to helperize it.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.082 -> 5.964 (change: -1.94%)
                      Run time: 0.284 -> 0.270 (change: -4.92%)
                   Bundle size: 5259054 -> 4811653 (change: -8.51%)
          Minified bundle size: 1264953 -> 1041721 (change: -17.65%)
            Mangled & minified: 812275 -> 720161 (change: -11.34%)

Step 20. $nesting - do we need it?

Sometimes though, we don't even need $nesting to be computed. If our class is small, doesn't have classes defined in its namespace and we don't reference constants relatively, we may skip computing $nesting altogether.

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.090 -> 5.952 (change: -2.27%)
                      Run time: 0.284 -> 0.269 (change: -5.26%)
                   Bundle size: 5259054 -> 4806185 (change: -8.61%)
          Minified bundle size: 1264953 -> 1036887 (change: -18.03%)
            Mangled & minified: 812275 -> 717787 (change: -11.63%)

Step 21. Curry $$

I came up with the idea that the $$ function can be curried. Of course, this moves its definition from the top-level scope to the class/module scope, so it may get defined more often. But now we don't call $$($nesting, "String"); we can simply call $$("String"), because $$ is defined with $nesting already bound. Do we get any optimization from that, then?

 Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.097 -> 5.976 (change: -1.99%)
                      Run time: 0.285 -> 0.270 (change: -5.41%)
                   Bundle size: 5259054 -> 4795627 (change: -8.81%)
          Minified bundle size: 1264953 -> 1027837 (change: -18.75%)
            Mangled & minified: 812275 -> 716231 (change: -11.82%)

Yes. And quite a big one if we don't mangle variable names.
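The currying idea can be sketched as follows. constGet and constGetFor are hypothetical names, and the lookup is deliberately simplified: it only walks the nesting list, whereas the text explains that the real lookup also visits ancestors and Object.

```javascript
// Hypothetical lookup: search each scope in the nesting, innermost first.
function constGet(nesting, name) {
  for (var i = 0; i < nesting.length; i++) {
    if (Object.prototype.hasOwnProperty.call(nesting[i], name)) {
      return nesting[i][name];
    }
  }
  throw new Error('uninitialized constant ' + name);
}

// Curried form: bind the nesting once per class/module body.
function constGetFor(nesting) {
  return function (name) { return constGet(nesting, name); };
}

var topLevel = { String: 'top-level String' };
var arrayScope = { String: 'Array::String' };
var $nesting = [arrayScope, topLevel];
var $$ = constGetFor($nesting);

// Every call site now drops the `$nesting` argument.
console.log($$('String')); // Array::String
```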

Step 22. Interpolated strings optimization

What do we do with strings like "aaaa#{true}" (also called dstrs)? Of course, we compile them to:

"" + "aaaa" + (true)

Why does this make sense? And how come this can use the + operator? Well, let me explain. In JavaScript, "" + obj is, for our purposes, equivalent to "" + obj.toString(). And toString() for Opal objects calls #to_s - so this is exactly what "aaaa#{true}" does in Ruby.

And you may say - ok, for strings like "#{5}" (being compiled to "" + 5) this makes sense. But if the first part of a dstr is a string, we don't need this "". Yes - though, Terser applies this optimization, so there's 0 gain there.

 Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.087 -> 5.959 (change: -2.10%)
                      Run time: 0.284 -> 0.269 (change: -5.45%)
                   Bundle size: 5259054 -> 4783491 (change: -9.04%)
          Minified bundle size: 1264953 -> 1027837 (change: -18.75%)
            Mangled & minified: 812275 -> 716231 (change: -11.82%)
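The string conversion behind this step can be checked with an ordinary object. The object below is a made-up stand-in for an Opal object: it inherits the default valueOf, so string concatenation ends up calling its toString, which delegates to a Ruby-style $to_s.

```javascript
// Hypothetical object whose toString delegates to a Ruby-style $to_s.
var obj = {
  $to_s: function () { return 'true'; },
  toString: function () { return this.$to_s(); }
};

var withPrefix = '' + 'aaaa' + obj; // old output shape
var withoutPrefix = 'aaaa' + obj;   // leading "" dropped

console.log(withPrefix);                   // aaaatrue
console.log(withPrefix === withoutPrefix); // true
console.log('' + 5);                       // 5
```

The leading "" only matters when the first part is not already a string, as in the "#{5}" case.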

Step 23. Hide $parent_nesting if it's not needed

This is a small one. But this is the last one in this patch series. Let's conclude it with compilation of this Ruby code:

class A
  def x
  end
end

Opal 1.3.2 outputs this:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.2 */
  var self = Opal.top, $nesting = [], nil = Opal.nil, $$$ = Opal.$$$, $$ = Opal.$$, $klass = Opal.klass;

  return (function($base, $super, $parent_nesting) {
    var self = $klass($base, $super, 'A');

    var $nesting = [self].concat($parent_nesting), $A_x$1;

    return (Opal.def(self, '$x', $A_x$1 = function $$x() {
      var self = this;

      return nil
    }, $A_x$1.$$arity = 0), nil) && 'x'
  })($nesting[0], null, $nesting)
});

This patchset makes it output the following:

Opal.queue(function(Opal) {/* Generated by Opal 1.3.2 */
  var $nesting = [], nil = Opal.nil, $klass = Opal.klass, $def = Opal.def;

  return (function($base, $super) {
    var self = $klass($base, $super, 'A');


    return $def(self, '$x', function $$x() {

      return nil
    }, 0)
  })($nesting[0], null)
});

Therefore we skip one more variable. And while some JS minifiers may find this and optimize it out themselves, some don't.

Conclusion

After this patchset is merged, Opal will produce much cleaner and less complex code, which is much easier to read without knowing how Opal actually works under the hood. If you know Ruby, you are likely to know what $super means in this particular code (if you don't, it means the superclass, which A doesn't have set). So, to conclude: what are the total gains from this entire patchset?

Comparison of the Asciidoctor (a real-life Opal application) compile and run:
                  Compile time: 6.073 -> 5.956 (change: -1.92%)
                      Run time: 0.284 -> 0.269 (change: -5.46%)
                   Bundle size: 5259054 -> 4781496 (change: -9.08%)
          Minified bundle size: 1264953 -> 1026844 (change: -18.82%)
            Mangled & minified: 812275 -> 715972 (change: -11.86%)

Of course - the numbers will depend on what you compile with Opal and how you minify (or not). I tried compiling Opal-Parser and the size reduction reached about 15%. And you will get about 5% better performance (note - the performance gains are measured on a non-minified bundle, so if you minify you may get even better performance gains).

This doesn't end the optimization efforts we have - there are still some ideas that weren't realized in this patchset.

This patchset is located here. If you are interested in writing compilers, reading the source code of Opal compiler may prove useful - it's relatively lightweight, well organized and it's all Ruby! All despite the fact, that Ruby is one of the hardest to parse programming languages in existence (if not the hardest) - all lexing and parsing happens in a wonderful parser library which is also used by RuboCop and many other gems!

Opal 1.3

We have released Opal v1.3.0 a week ago and today we are releasing Opal v1.3.1 containing a few last minute fixes. We plan Opal v1.3 to be supported with a best-effort principle for a longer term, backporting bugfixes.

This is quite a big release with a special focus on developer tools and error reporting improvements.

Preliminary ES Modules support

Although we didn't yet fully integrate with Webpack as originally planned, Jan Biedermann (of the Isomorfeus project) contributed some preliminary ES Modules support.

If you use Isomorfeus you can take advantage of it with its ESBuild integration and the newly added autoload hooks, which finally work without requiring a patched version of Opal.

If you don't use it you can still roll your own integration, making autoload dynamically fetch modules from the server as they're imported.

We are planning to work on a proper import and npm package support for the next release, along with Ruby 3.1 support.

Newly supported language features

We already support a lot of Ruby 3.0 features. This release fills a few remaining gaps, most of which will only matter in very niche use cases.

Refinements

Refinements are now mostly supported. If you don't like monkey patching, this feature can improve your codebase.

binding

Binding allows you to take a peek at local variables in a given scope. You most likely know it from binding.irb or binding.pry. While we don't support those yet, it's now only a matter of time until they get implemented - all the prerequisites are already there.

autoload

Autoload statements will cause a file to be included inside your Opal bundle, but won't load it instantly. We moved a good deal of the corelib to autoload to benefit from faster load times.

This support can also be used to load files dynamically, like opal-zeitwerk does in Isomorfeus.

retry

Due to the asynchronous nature of JavaScript, it probably won't have much use here. But still, you can now use this statement in a rescue clause to restart the protected code.

super improvements

The following code used to call the super method with the arguments (a,b,c); now it correctly calls it with (4,b,c). A few more edge cases were corrected as well.

def method(a,b,c)
  a = 4
  super
end
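
A runnable version of the same idea, with hypothetical class names, showing what a bare super forwards in Ruby:

```ruby
class Base
  def call(a, b, c)
    [a, b, c]
  end
end

class Child < Base
  def call(a, b, c)
    a = 4
    super # bare super forwards the current values of a, b and c
  end
end

forwarded = Child.new.call(1, 2, 3)
```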

stdin and gets support

We now have broader synchronous IO support. In a browser, gets is implemented via prompt (aka that annoying alert popup with a text field); on other platforms, including headless Chrome, it bridges stdin properly.

Reminder: for basic support of platforms other than the browser, you need to require "opal/platform". For extended support of Node.js, you need to require "nodejs".

$ opal -ropal/platform -e 'p gets'
123
"123\n"

Implementing gets is all that's needed to make a runner usable in REPL mode.

Global variable aliases

Certainly a lesser known feature of Ruby. Did you know it's possible to write this:

alias $PROGRAM_NAME $0
$PROGRAM_NAME = "abc" # This changes $0 as well!

Opal has you covered.

Flip-flop

This Perl-inspired feature was almost removed in Ruby 3.0. But due to some interest, it was actually fixed. Opal now supports this as well! Have you never heard of this feature? That's fair, it mostly isn't present in production code, but can be used for writing unreadable code like this:

a=b=c=(1..100).each do |num|
  print num, ?\r,
    ("Fizz" unless (a = !a) .. (a = !a)),
    ("Buzz" unless (b = !b) ... !((c = !c) .. (c = !c))),
    ?\n
end
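
If the FizzBuzz above is too much, here is a smaller sketch of the same operator in plain Ruby. A flip-flop is a range of two conditions used as a condition itself: it turns on when the first one is true and off after the second one is true.

```ruby
# Two-dot form: the end condition is checked on the same iteration
# that switched the flip-flop on.
two_dot = (1..10).select { |i| true if (i % 3 == 0)..(i % 3 == 0) }

# Three-dot form: the end condition is first checked on the next iteration.
three_dot = (1..10).select { |i| true if (i % 3 == 0)...(i % 3 == 0) }

# The usual use: pick everything between a start and an end marker.
between = (1..10).select { |i| true if (i == 3)..(i == 6) }
```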

Toolkit improvements

opal-repl

The opal-repl command line tool gained a number of improvements:

  • Windows support
  • Colored output
  • Support for printing native JS Objects / null / undefined
  • Pry-like ls support
  • Support for everything that opal command line tool supports
  • Support for multiple runners: nodejs, chrome, gjs, quickjs, miniracer (used to be miniracer only)

$ opal-repl
>> ls 123
Comparable#methods: between?  clamp
Numeric#methods: __coerced__  clone  conj  conjugate  div  dup  i  imag  imaginary  polar  pretty_print  pretty_print_cycle  real  real?  rect  rectangular  step  to_c  to_json  to_n
Number#methods: %  &  *  **  +  +@  -  -@  /  <  <<  <=  <=>  ==  ===  >  >=  >>  []  ^  __id__  abs  abs2  allbits?  angle  anybits?  arg  bit_length  ceil  chr  coerce  denominator  digits  divmod  downto  eql?  equal?  even?  fdiv  finite?  floor  gcd  gcdlcm  infinite?  inspect  instance_of?  integer?  is_a?  kind_of?  lcm  magnitude  modulo  nan?  negative?  next  nobits?  nonzero?  numerator  object_id  odd?  ord  phase  positive?  pow  pred  quo  rationalize  remainder  round  size  succ  times  to_f  to_i  to_int  to_r  to_s  truncate  upto  zero?  |  ~
>>
$ opal-repl -Rchrome
>> require 'native'
=> true
>> $$[:navigator][:userAgent]
=> "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/95.0.4638.54 Safari/537.36"
>>

New runners

Runners are how you can use different environments to run Opal code on, using our opal command line tool. We added support for 3 new runners:

  • gjs - the GNOME JavaScript engine based on Gecko's JS engine
  • quickjs - an independently developed JavaScript engine with a good embedding support: https://bellard.org/quickjs/
  • miniracer - Ruby bindings to pure V8 (used to be supported only for repl)

$ opal -ropal/platform -Rchrome -e 'p OPAL_PLATFORM'
"headless-chrome"
$ opal -ropal/platform -Rnodejs -e 'p OPAL_PLATFORM'
"nodejs"
$ opal -ropal/platform -Rgjs -e 'p OPAL_PLATFORM'
"gjs"
$ opal -ropal/platform -Rquickjs -e 'p OPAL_PLATFORM'
"quickjs"
$ opal -ropal/platform -Rminiracer -e 'p OPAL_PLATFORM'
"opal-miniracer"

While support for these is preliminary and works only with stdio, the implementations can be extended when needed (patches welcome!). GJS is now one of the primary GNOME development platforms, powering software like Flatseal and allowing you to bridge all GNOME libraries, like GLib or GTK4 (making it a fully featured Node.js alternative, though much more low level and a lot less popular - but that's a digression).

Source map debugger

We have improved the source map support in this release. It should now much more accurately tell you which line/column is being executed. Which means - easier debugging! And that's not all, if source mapping goes wrong, you can always use our new source map debugging tool:

$ opal --debug-source-map -e '[1,2,3,4].each { |i| p i }'
https://sokra.github.io/source-map-visualization/#base64,T3BhbC5xdWV1ZShmdW5jdGlvbihPcGFsKSB7LyogR2VuZXJhdGVkIGJ5IE9wYWwgMS4zLjAgKi8KICB2YXIgJCQxLCBzZWxmID0gT3BhbC50b3AsICRuZXN0aW5nID0gW10sIG5pbCA9IE9wYWwubmlsLCAkJCQgPSBPcGFsLiQkJCwgJCQgPSBPcGFsLiQkLCAkc2VuZCA9IE9wYWwuc2VuZDsKCiAgT3BhbC5hZGRfc3R1YnMoWyckZWFjaCcsICckcCddKTsKICByZXR1cm4gJHNlbmQoWzEsIDIsIDMsIDRdLCAnZWFjaCcsIFtdLCAoJCQxID0gZnVuY3Rpb24oaSl7dmFyIHNlbGYgPSAkJDEuJCRzID09IG51bGwgPyB0aGlzIDogJCQxLiQkczsKCiAgICAKICAgIAogICAgaWYgKGkgPT0gbnVsbCkgewogICAgICBpID0gbmlsOwogICAgfTsKICAgIHJldHVybiBzZWxmLiRwKGkpO30sICQkMS4kJHMgPSBzZWxmLCAkJDEuJCRhcml0eSA9IDEsICQkMSkpCn0pOwo=,eyJ2ZXJzaW9uIjozLCJzb3VyY2VSb290IjoiIiwic291cmNlcyI6WyItZSJdLCJzb3VyY2VzQ29udGVudCI6WyJbMSwyLDMsNF0uZWFjaCB7IHxpfCBwIGkgfSJdLCJuYW1lcyI6WyI8bWFpbj4iLCJlYWNoIiwiMSIsIjIiLCIzIiwiNCIsImJsb2NrIGluIDxtYWluPiIsImkiLCJibG9jayAoMiBsZXZlbHMpIGluIDxtYWluPiIsInNlbGYiLCJwIl0sIm1hcHBpbmdzIjoiQUFBQUEsMkJBQUFBO0VBQUFBOztFQUFBQTtFQUFBQSxPQUFTQyxNQUFULENBQUNDLENBQUQsRUFBR0MsQ0FBSCxFQUFLQyxDQUFMLEVBQU9DLENBQVAsQ0FBU0osUUFBQUEsRUFBQUEsRUFBQUEsRUFBTUssZ0JBQUdDLENBQUhELEVBQUFFOzs7O0lBQUc7SUFBQTtJQUFBO0lBQUdBLE9BQUFDLElBQUFDLEdBQUFBLENBQUVILENBQUZHLEVBQU5KLGtCQUFBQSxpQkFBQUEsS0FBTkw7QUFBVEQ7In0=,WzEsMiwzLDRdLmVhY2ggeyB8aXwgcCBpIH0=

This utility can be also used to help you understand which Ruby code corresponds to which JavaScript output, for educational purposes.

Cache for Opal::Builder

Sprockets doesn't use Opal::Builder - it has its own cache, though it may need to be configured.

Upgrade to Opal 1.3 and forget the long compilation times when using the opal command line tool or Opal::Builder through any other means (like Opal::SimpleServer). This is implemented by caching the marshaled Compiler objects.
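
To give a feel for the idea (not for Opal's actual implementation, which we don't reproduce here), this is a minimal sketch of a Marshal-based disk cache keyed by a digest of the source. The cached_compile helper and the fake "compiler output" hash are assumptions made up for this example.

```ruby
require "digest"
require "tmpdir"

# Hypothetical helper: cache the result of an expensive block on disk,
# keyed by a digest of the source text.
def cached_compile(source, cache_dir)
  path = File.join(cache_dir, Digest::SHA256.hexdigest(source) + ".cache")
  return Marshal.load(File.binread(path)) if File.exist?(path)

  result = yield(source)
  File.binwrite(path, Marshal.dump(result))
  result
end

compilations = 0
outputs = Dir.mktmpdir do |dir|
  2.times.map do
    cached_compile("p 123", dir) do |src|
      compilations += 1
      # Stand-in for real compiler output.
      { source: src, js: "/* compiled #{src} */" }
    end
  end
end
```

The second call never reaches the block: the result is loaded back from the marshaled file, which is where the time savings come from.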

4x improvement for a simple script:

$ time opal _1.2.0_ -e 'p 123'
123

real    0m2.321s
user    0m2.213s
sys    0m0.107s
$ time opal _1.3.0_ -e 'p 123'
123

real    0m0.537s
user    0m0.450s
sys    0m0.090s

8x when including more code, e.g. opal-parser:

$ time opal _1.2.0_ -ropal-parser -e 'p eval("(0..1)")'
Object freezing is not supported by Opal
0..1

real    0m12.555s
user    0m12.250s
sys    0m0.349s
$ time opal _1.3.0_ -ropal-parser -e 'p eval("(0..1)")'
Object freezing is not supported by Opal
0..1

real    0m1.512s
user    0m1.405s
sys    0m0.249s

Async/await support

This big feature is experimental, and may totally change in future releases of Opal.

This complements the PromiseV2 feature introduced in Opal v1.2 and uses magic comments and a supporting library (aptly named await) to await on JavaScript native promises.

Check it out:

# await: sleep
require "await"

puts "Let's wait 2 seconds..."
sleep 2
puts "Done!"

Please consult documentation of this feature: https://opalrb.com/docs/guides/v1.3.0/async

We are leaving this as experimental because it is an Opal-specific feature. Although it's not supported by Ruby itself, the syntax is fully compatible, so maybe some unexpected breakthrough will happen, with a port to a gem or to the language itself.

Conclusion

Are you starting to make a new web application and don't want to be forced to write JavaScript on the frontend?

If you are adventurous, why not try Opal? You may be surprised how ready it is for your use case! :D



Opal 1.2

After about 11 years of development we are proud to announce the Opal 1.2 release!

Opal is a Ruby to JavaScript compiler, Ruby runtime and an idiomatic Ruby API for interfacing Browser/Node/anything that uses JS.

Notable new features

Full Ruby 3.0 support! (almost)

We missed a blog post about Opal 1.1, which was released with preliminary support for Ruby 3.0 features, but Opal 1.2 finalizes this support. Let me bring you up to date on them!

An interesting thing is that you can still run older versions of Ruby on the backend, but use the new Ruby 3.0 features when writing Opal code for your frontend.

Endless method definition (Opal 1.1)

No more writing code like this:

def ten
  10
end

Write it in one line!

def ten = 10

Beginless and endless ranges (Opal 1.2)

While those were introduced in earlier Ruby versions (2.6, 2.7), we didn't have those, and now we do! You can absolutely call a[1..] instead of a[1..-1] now!
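
A few more places where these ranges come in handy, as a plain Ruby sketch:

```ruby
a = [10, 20, 30, 40]

tail = a[1..]  # everything from index 1 onwards, same as a[1..-1]
head = a[..1]  # everything up to and including index 1

# Open-ended ranges also read nicely in case/when.
bucket =
  case 150
  when (..99)  then :small
  when (100..) then :large
  end
```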

Numblocks (Opal 1.1)

Also a Ruby 2.7 feature. Personally, I don't like this syntax. Nevertheless, you can use it in Opal now!

[1,2,3].map { _1 ** 2 }

Argument forwarding support def a(...); b(...); end (Opal 1.1)

We support 3.0 level of argument forwarding. No more writing code like

def example(arg1, arg2, *args, **kwargs, &block)
  @arg1, @arg2 = arg1, arg2
  Klass.new(:begin, *args, **kwargs, &block)
end

...when you can write it like this:

def example(arg1, arg2, ...)
  @arg1, @arg2 = arg1, arg2
  Klass.new(:begin, ...)
end

Pattern matching (Opal 1.2)

We support 3.0 level of pattern matching. All kinds of patterns have been fully implemented, including the find pattern.

Pattern matching is optional. If you want to use it, you will need to add require "corelib/pattern_matching" to your application.

Do note that the find pattern ([1, 2, 3, "needle", 4] => [*, String => a, *] to find a String and save it to a local variable a) and the one-line 1 in a syntax are deemed experimental in Ruby 3.0 and may change their behavior in the future.
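
Here is a short sketch of the three main pattern kinds in plain Ruby. Under Opal you would add the require mentioned above, and on Ruby 3.0 itself the find pattern prints an "experimental" warning. The Point struct and the config hash are made up for the example.

```ruby
# In Opal, per the text above, this would also need:
#   require "corelib/pattern_matching"

# Find pattern: locate the first String anywhere in the array.
found =
  case [1, 2, 3, "needle", 4]
  in [*, String => a, *]
    a
  end

# Hash pattern: destructure nested keys.
config = { name: "opal", version: { major: 1, minor: 2 } }
minor_version =
  case config
  in { version: { major: 1, minor: } }
    minor
  end

# Array pattern against a Struct, which works through Struct#deconstruct.
Point = Struct.new(:x, :y)
description =
  case Point.new(0, 5)
  in [0, y] then "on the y axis at #{y}"
  in { x:, y: } then "at #{x}, #{y}"
  end
```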

Methods

There are quite a lot of new methods, some of which got introduced in Ruby 3.0, some earlier.

Opal 1.2:

  • {Random,SecureRandom}#{hex,base64,urlsafe_base64,uuid,random_float,random_number,alphanumeric}
  • String#{+@,-@}
  • Hash#except!
  • {Array,Hash,Struct}#{deconstruct,deconstruct_keys}

Opal 1.1:

  • Array#{difference,intersection,union}
  • Enumerable#{filter_map,tally}
  • Kernel#then
  • {Proc,Method}#{<<,>>}
  • Integer#to_d
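
A quick tour of some of the Opal 1.1 additions, as a plain Ruby sketch with made-up sample data:

```ruby
words = %w[apple avocado banana blueberry cherry]

# Enumerable#tally counts occurrences, here of the first letters.
initial_counts = words.map { |w| w[0] }.tally

# Enumerable#filter_map maps and drops nil/false results in one pass.
long_upcased = words.filter_map { |w| w.upcase if w.length > 6 }

# Array#intersection accepts any number of arrays.
common = [1, 2, 3].intersection([2, 3, 4], [3, 4, 5])

# Proc#>> composes left to right: first add one, then double.
add_one_then_double = proc { |x| x + 1 } >> proc { |x| x * 2 }

# Kernel#then yields the receiver to the block.
piped = 5.then { |n| add_one_then_double.call(n) }
```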

Breaking changes

Since Opal 1.2, we now follow the Ruby 3.0 semantics for Array#{to_a,slice,[],uniq,*,difference,-,intersection,&,union,|} and therefore those don't return an Array subclass anymore, but simply an Array.
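
In practice, the change looks like this. The sketch assumes Ruby 3.0 or newer semantics (older Rubies returned the subclass for some of these), and the Stack class is hypothetical.

```ruby
class Stack < Array; end # hypothetical Array subclass

stack = Stack.new([1, 1, 2])

result_classes = {
  uniq:  stack.uniq.class,
  slice: stack[0, 2].class,
  union: (stack | [3]).class,
}
```

If your code relied on getting a Stack back, wrap the result explicitly, e.g. Stack.new(stack.uniq).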

What isn't there

We are still using keyword arguments in the Ruby 2.6 style. We want the future releases to warn on the incompatibilities, just like Ruby 2.7 did, to provide a smooth migration curve.

Ractor isn't implemented and likely won't be anytime soon. JavaScript has a very different threading model to Ruby.

ObjectSpace finalizer support and ObjectSpace::WeakMap implementation (Opal 1.2)

A destructor in Ruby? And also in a niche JavaScript Ruby implementation? Yes!

You need to do `require "corelib/objectspace"` to use those features.
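
A minimal sketch of both features using the standard Ruby API. The helper method is hypothetical, and when (or whether) a finalizer fires depends on the garbage collector of the engine you run on, so the example doesn't rely on it.

```ruby
# In Opal, per the text above, first: require "corelib/objectspace"

# WeakMap: attach data to an object without keeping the object alive.
metadata = ObjectSpace::WeakMap.new
key = Object.new
metadata[key] = "extra data attached without modifying the object"

looked_up = metadata[key]
present = metadata.key?(key)

# Finalizer: a proc called with the object id once the object is collected.
# It is built in a separate method so it doesn't capture the object itself.
def make_finalizer(log)
  proc { |id| log << id }
end

finalizer_log = []
tracked = Object.new
ObjectSpace.define_finalizer(tracked, make_finalizer(finalizer_log))
```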

begin ... end while ... (Opal 1.2)

Matz doesn't like this feature as it is inconsistent with how other Ruby postfix flow control constructs work. But it looks similar to do { ... } while(...) - and this is how it works. Now, also in Opal!
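
The inconsistency is easiest to see side by side, in plain Ruby:

```ruby
post_runs = 0
begin
  post_runs += 1
end while false # the body runs once before the condition is checked

pre_runs = 0
pre_runs += 1 while false # ordinary postfix while: the condition is checked first
```

After this runs, post_runs is 1 and pre_runs is still 0.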

Native Promises (Opal 1.2 - experimental)

When Opal was first conceived, JavaScript didn't have Promises (or they weren't so universally supported), so we made our own. Now that Promises in JavaScript are a big thing, well integrated into the language, we made a new kind of Promise (PromiseV2) that bridges the native JavaScript promise and is (mostly) compatible with the current Opal Promise (later called PromiseV1).

By bridges I mean that conversion between JavaScript and Opal values is seamless. Just as you can pass numbers, strings or arrays between those two ecosystems, Promises (only PromiseV2) will join that pack now.

There are some slight incompatibilities, which is why we include both versions and PromiseV2 is defined as experimental. This allows us to state that its behavior may change in the future (until deemed un-experimental). It is also meant to foster a discussion about how to best support async/await in the next release.

There's a (mostly done) CoffeeScript approach to async/await available as a patch (not in the 1.2 release though), but we may as well take a different approach.

To use PromiseV2, require "promise/v2" and call PromiseV2.

In Opal 2.0, Promise will become an alias of PromiseV2 instead of PromiseV1, as it is currently. You can require "promise/v1" and call PromiseV1 if you are certain that you want to stay with the old Promise. We will take a similar approach to redesigning other APIs (like NativeV2, BufferV2, etc.) in the upcoming minor releases.

The libraries in our ecosystem will soon be updated to use PromiseV2 if it's loaded instead of the PromiseV1. Do note, that PromiseV1 and PromiseV2 aren't to be mixed together, in your application you should pick one or the other. We suggest using Promises in your libraries this way:

module YourApp
  Promise = defined?(PromiseV2) ? PromiseV2 : Promise

  def using_promise
    Promise.value(true).then do
      ...
    end
  end
end

Other changes and statistics

Performance

Based on the statistics we made, Opal (both the runtime and the compiler) used to decrease in performance the more compatible with Ruby it became. This is not the case this time, we made quite a lot of performance improvements in this release. Opal 1.2 is more performant than 1.1 for many real world uses, and similar in performance to Opal 1.0. We will try to continue this trend in the future releases.

Issues

Opal 1.2 is mostly a bugfix release. We fixed a lot of long-standing bugs. Maybe it's time to try Opal again? Here we have a chart:

Issues chart

Roadmap

We didn't fully meet the expectations of the roadmap as described in the Opal 1.0 post. Rather than postpone the Opal 1.1 release, we decided to release what we had (and we had quite a lot).

We have partially accomplished the code minification with an external (and experimental) opal-optimizer gem.

There are solutions for Webpack integration, but those aren't upstream. We gladly invite anyone interested to contribute during the Opal 1.3 development phase, which we mainly want to dedicate to accomplishing that goal.

Opal 2.0 will hide Ruby methods behind a (JavaScript) symbol wall increasing compatibility with JavaScript libraries.

Strings are still immutable, Regexps are still somewhat incompatible (https://opalrb.com/blog/2021/06/26/webassembly-and-advanced-regexp-with-opal/), and asynchronous methods can't be implemented in a Ruby synchronous style. We plan those features for Opal 2.0 or later, as they will require a lot of changes.

Ecosystem

All Gems from the opal-* namespace are compatible with the Opal 1.2 release. The new opal-browser version is up for a release soon, until then, we recommend you use the master branch!

From the outside of the core ecosystem, we have seen quite a lot of development, there are at least a few very interesting things that were done using Opal:

  • Glimmer DSL for Opal - an (experimental) platform-agnostic (web with Opal, desktop with SWT or Tk) toolkit for graphical applications. Write once, run anywhere!
  • 18xx.games - a website where you can play async or real-time 18xx games
  • Hyperstack - a well documented and stable Ruby DSL for wrapping React and ReactRouter as well as any JavaScript library or component. No need to learn JavaScript!

Conclusion

If you are unsure yet, Opal doesn't just feel like Ruby, it IS Ruby. It is absolutely possible to create a website with pure Ruby and not a single line of JavaScript (and with good performance!). With the advent of solutions like Hotwire, Opal goes a step further and allows you to create fully isomorphic websites (with fully fledged support for human-language-like DSLs). Forget context switches from JavaScript, TypeScript, or even CoffeeScript to Ruby when you develop a feature, it's all Ruby!

If you are daring, you can even hook up the Opal compiler - written in a very clean Ruby - to add macro facilities to accommodate your application!

WebAssembly and advanced regular expressions with Opal

(This is a guest post from the people at Interscript, featuring an in-depth account of the work done around building a WebAssembly bridge and compiling Onigmo for Opal)


At Ribose Inc we develop Interscript, an open source Ruby implementation of interoperable transliteration schemes from ALA-LC, BGN/PCGN, ICAO, ISO, UN (by UNGEGN) and many, many other script conversion system authorities. The goal of this project is to achieve interoperable transliteration schemes allowing quality comparisons.

We decided to port our software to JavaScript using Opal (the Ruby to JavaScript compiler), so it can be also used in web browsers and Node environments. The problem is - Opal translates Ruby regular expressions (upon which we rely quite heavily) to JavaScript almost verbatim. This made our ported codebase incompatible on principle, so we searched for a better solution.

Unfortunately, Regexp is basically something like a programming language that has more than a dozen incompatible implementations - even across web browsers. For instance, we need lookbehind assertions, but even though there is a new ECMAScript standard that adds them, Safari doesn't implement it.

Given all this context let's dive into how we ported the original Ruby Regexp engine to the browser!

Onigmo

We started by trying to compile Onigmo with WebAssembly.

Onigmo is a Regexp engine that is used by Ruby. It is a fork of Oniguruma, which is also in use by PHP and a few more programming languages. Fortunately, it's possible to compile it to a static WebAssembly module which can be interfaced with the JavaScript land.

We tried compiling Onigmo using a simple handcrafted libc with no memory management so as to reduce the size, but this plan backfired, and rightfully so!

Now we use wasi-libc. WASI stands for WebAssembly System Interface, and is designed to provide "a wide array of POSIX-compatible C APIs".

The library is made to be able to work with both wasi-libc and the handcrafted libc, but use of wasi-libc is highly encouraged. As we are concerned about the output size of the resulting WASM binaries, we chose not to use Emscripten, just the upstream LLVM/Clang and its WASM target.

Opal-WebAssembly

After getting Onigmo to compile, we noted that the WebAssembly interface doesn't map 100% between C and JS. We can't pass strings verbatim and we need to do memory management (think: pointers). Is there a better solution for that than writing an Opal library to interface with WebAssembly libraries?

The library is divided in two parts: there's a simple WebAssembly interface and there's a Ruby-FFI compatible binding that works on everything memory related and brings the C functions to seamlessly work with the Ruby (Opal actually) workflow.

The library has advanced beyond just being usable for this project. It should be quite compatible with Ruby-FFI allowing C API bindings across all Ruby implementations. There are some minor incompatibilities though.

Ruby-FFI assumes a shared memory model. WebAssembly has different memory spaces for a calling process and each library (think about something like a segmented memory). This makes some assumptions false.

For instance, for the following code, we don't know which memory space to use:

FFI::MemoryPointer.new(:uint8, 1200)

This requires us to use a special syntax, like:

LibraryName.context do
  FFI::MemoryPointer.new(:uint8, 1200)
end

This context call makes it clear that we want this memory to be allocated in the "LibraryName" space.

Another thing is that a call like the following:

FFI::MemoryPointer.from_string("Test string")

Would not allocate the memory, but share the memory between the calling process and the library. In Opal-WebAssembly we must allocate the memory, as sharing is not an option in the WASM model. Now, another issue comes into play. In regular Ruby a call similar to this should allocate the memory and clear it later, once the object is destroyed. In our case, we can't really access JavaScript's GC. This means we always need to free the memory ourselves.

Due to some Opal inadequacies, we can't interface floating-point fields in structs. This doesn't happen in Onigmo, but if needed in the future, a pack/unpack implementation for those will be needed.

The Chromium browser doesn't allow us to load WebAssembly modules larger than 4KB synchronously. This means that we had to implement some methods for awaiting the load. It also means that in the browser we can't use the code in the following way:

<script src='file.js'></script>
<script>
  Opal.Library.$new();
</script>

This approach works in Node and possibly in other browsers, but Chromium requires us to use promises:

<script src='file.js'></script>
<script>
  Opal.WebAssembly.$wait_for("library-wasm").then(function() {
    Opal.Library.$new();
  });
</script>

There are certain assumptions of how a library should be loaded on Opal side - the FFI library creation depends on the WebAssembly module being already loaded, so we need to either move those definitions to a wait_for block or move require directives, like so:

WebAssembly.wait_for "onigmo/onigmo-wasm" do
  require 'interscript'
  require 'my_application_logic'
end

The source for opal-webassembly is available at https://github.com/interscript/opal-webassembly.

Opal-Onigmo

After having a nice library to bind with WebAssembly modules, writing an individual binding was very easy and the resulting code looks (in my opinion) very cool.

Our initial plan assumed upstreaming the code later on. I don't think it will be possible or healthy. This library should stay as a separate gem for a couple of reasons.

The first is that, due to the memory issues, we aren't able to make it work as a drop-in replacement. We need to manually call an #ffi_free method. E.g.:

re = Onigmo::Regexp.new("ab+")
# use the regular expression
re.ffi_free # free it afterwards and don't use it anymore

In the early stages of our implementation of Opal-Onigmo, we didn't consider memory a problem. When hit with a real-world scenario, we found out that it's a severe issue and needs to be dealt with. As far as we know, the library doesn't leak any memory if the regular expression memory is managed correctly.
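
One way to make "managed correctly" harder to get wrong is a block helper that always frees the regexp, even if the block raises. This is only a sketch: with_regexp is a hypothetical helper that is not part of opal-onigmo, and FakeRegexp is a stand-in for Onigmo::Regexp so the example runs without WebAssembly.

```ruby
# Stand-in for Onigmo::Regexp so the sketch runs anywhere. Only #ffi_free
# mirrors the method named in the text; the rest is made up.
class FakeRegexp
  attr_reader :freed

  def initialize(source)
    @source = source
    @freed = false
  end

  def ffi_free
    @freed = true
  end
end

# Hypothetical helper: yield a regexp and always free it afterwards.
def with_regexp(source, klass = FakeRegexp)
  re = klass.new(source)
  yield re
ensure
  re&.ffi_free
end

escaped = nil
value = with_regexp("ab+") do |re|
  escaped = re # kept only so we can inspect it afterwards
  :done
end
```

The trade-off is the one described below: a helper like this only works when the lifetime of the regexp is known, which is not the case for a long-lived cache.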

The second is that, after all, we don't really have a way of caching the compiled Regexps. Furthermore, Onigmo compiled to WASM may not be as performant as the highly optimized JS regexp engine. In this case it's much better to leave it as an opt-in replacement for those who need more correctness.

Opal-Onigmo doesn't implement all the methods for Ruby Regexp, it was mostly meant for completion of the Interscript project, but can be extended beyond. It implements a few methods it needs to implement for String (this is just an option - you need to load onigmo/core_ext manually), but most of the existing ones work without a problem. We implemented a Regexp.exec (JavaScript) method, and the rest of Opal happened to mostly interface with it. At the current time we know that String#split won't "just" work, but String#{index,rindex,partition,rpartition} should.

Opal-Onigmo depends on the strings being encoded as UTF-16. There are two reasons for that:

  1. Opal includes methods for getting the binary form of strings in various encodings, but only methods for UTF-16 are valid for characters beyond the Basic Multilingual Plane (Unicode 0x0000 to 0xffff) which are used in 2 maps.
  2. JavaScript uses UTF-16 strings internally.
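
To see why characters beyond the Basic Multilingual Plane need special care, here is a plain Ruby sketch that prints nothing but computes the UTF-16 code units of two sample characters (chosen arbitrarily for illustration):

```ruby
bmp    = "\u00E9"    # LATIN SMALL LETTER E WITH ACUTE, inside the BMP
astral = "\u{10348}" # GOTHIC LETTER HWAIR, beyond the BMP

# unpack("v*") reads the bytes as little-endian 16-bit code units.
bmp_units    = bmp.encode("UTF-16LE").unpack("v*")
astral_units = astral.encode("UTF-16LE").unpack("v*")
```

The first string is a single code unit (0x00E9), while the second becomes a surrogate pair (0xD800, 0xDF48). Any offset passed between JavaScript and the regexp engine has to agree on how such pairs are counted.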

The source for opal-onigmo is available at https://github.com/interscript/opal-onigmo.

Interscript

Using Opal-Onigmo, we made Interscript pass all the tests (not counting the transliteration of Thai scripts, which ultimately depends on an external process that relies on machine learning). To optimize it, we use Opal-Onigmo only when the regexp is more complex; otherwise we fall back to the (ultimately faster) JavaScript regexp engine:

def mkregexp(regexpstring)
  @cache ||= {}
  if s = @cache[regexpstring]
    if s.class == Onigmo::Regexp
      # Opal-Onigmo stores a variable "lastIndex" mimicking the JS
      # global regexp. If we want to reuse it, we need to reset it.
      s.reset
    else
      s
    end
  else
    # JS regexp is more performant than Onigmo. Let's use the JS
    # regexp wherever possible, but use Onigmo where we must.
    # Let's allow those characters to happen for the regexp to be
    # considered compatible: ()|.*+?{} ** BUT NOT (? **.
    if /[\\$^\[\]]|\(\?/.match?(regexpstring)
      # Ruby caches its regexps internally. We can't GC. We could
      # think about freeing them, but we really can't, because they
      # may be in use.
      @cache[regexpstring] = Onigmo::Regexp.new(regexpstring)
    else
      @cache[regexpstring] = Regexp.new(regexpstring)
    end
  end
end
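
To make the routing rule concrete, here is the same character test extracted into a standalone method (the name needs_onigmo? is ours, for illustration) and applied to a few sample patterns:

```ruby
# Same character test as in mkregexp above; the method name is hypothetical.
def needs_onigmo?(regexpstring)
  /[\\$^\[\]]|\(\?/.match?(regexpstring)
end

samples = ["ab+", "(a|b){2}", "^start", "[abc]", '\d+', "(?<=a)b"]
routing = samples.to_h { |s| [s, needs_onigmo?(s) ? :onigmo : :js] }
```

Only the first two samples stay on the JavaScript engine. Anchors, character classes, backslash escapes and (? groups are all sent to Onigmo, which is a conservative rule: it may route a pattern to Onigmo that JavaScript would have handled just fine.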

It also never frees the regexps (see a previous note about #ffi_free), because we never know if a Regexp won't be in use later on (and the Regexps are actually cached in a Hash for performance reasons). The issue about dangling Regexps can be worked out in the future, but the JS API will need to change again. We would need to do something like:

Opal.Interscript.$with_a_map("map-name", function() {
  // do some work with a map
});

This call would at the beginning allocate all the Regexps needed, and at the end, free them all. The good news is that we would be able to somehow integrate loading transliteration maps from the network (along with dependencies) with such a construct.

The future

After writing this article, we noted that JavaScript actually does implement a construct that would work like a destructor, allowing us to free the allocated memory dynamically. Unfortunately, it's one of the latest ECMAScript additions, which means there are still environments that don't support it (Safari) and there is one that needs an explicit flag (Node 13+).

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry

We could use it to implement some parts of ObjectSpace of Ruby and then use it in opal-webassembly to free memory on demand.

Afterword

This article was written long before it was published. Since then, Interscript was rewritten to a totally different architecture and doesn't use Opal anymore. We don't use Regexps directly anymore, but we created a higher-level (Ruby) DSL to describe the transliteration process that we compile directly to a highly-optimized pure Ruby/JavaScript code (and it can be extended to other languages as well). Ribose Inc still uses Opal in other projects, for example to build Latexmath, a library that converts LaTeX math expressions to MathML, as a JavaScript library. We also contribute fixes back to the upstream Opal project.

For the Opal project, all this effort serves as an interesting experiment to establish further guidelines should we decide to increase Regexp compatibility in the future, and it can serve as a useful tool for anyone wanting to port a Ruby codebase with heavy regexp use to JavaScript. It should also facilitate porting libraries that currently use Ruby-FFI.

The libraries we created are available under a 2-clause BSD license in the following repositories:

  • https://github.com/interscript/Onigmo - Onigmo port to WebAssembly
  • https://github.com/interscript/opal-onigmo - the Onigmo interface to Opal
  • https://github.com/interscript/opal-webassembly - the FFI-like interface to Opal, using WebAssembly
  • https://github.com/interscript/interscript/tree/v1 - the obsolete v1 branch of Interscript that used Opal and Opal-Onigmo

Opal 1.0

Dear Opalists of the world,
the time has come, 1.0 has been released!

This is, of course, a really important milestone, one that I've been waiting on for about seven years. I started advocating for releasing version 1.0 back in 2012 when Opal was still at version 0.3. After all, at that time I already had code in production, and according to Semver (which by then was still a new thing) that's one of the criteria for releasing 1.0:

How do I know when to release 1.0.0?

If your software is being used in production, it should probably already be 1.0.0. If you have a stable API on which users have come to depend, you should be 1.0.0. If you’re worrying a lot about backward compatibility, you should probably already be 1.0.0.

https://semver.org/#how-do-i-know-when-to-release-100

I was so proud and excited about this ability to use Ruby for frontend code! I was writing the logic for an in-page product filter and Ruby allowed me to solve the problem quickly, with its signature concise syntax, and leveraging the power of enumerable. That immensely reduced the lines of code I had to write and allowed me to concentrate on just the core of the problem.

Many years have passed, and I'm even more proud of the project that Opal has become, the maturity and the feature parity with MRI is astounding, the new features that are coming in version 1.0 are even more amazing, and the roadmap ahead makes Opal one of the best choices among compile-to-JS languages.


Before diving into each new feature I'd like to give immense credit for the MVP of this release, namely our good Ilia Bylich! Without his unrelenting work and deep understanding of Ruby, we wouldn't be where we are. Thanks man! 👏👏👏

Notable New Features


Opal-RSpec v0.7 has finally been released! ✨

It's been a deep work of refactoring and rewriting, adding specs and updating dependencies. Huge kudos to @wied03 for the foundational work and as usual to @iliabylich for the laser sharp refactoring.

For a full list of changes and updated instructions please check out the Changelog and the Readme.

New opal-rspec command

A new CLI executable has been added with basic functionality to aid writing and running specs. Type opal-rspec -h for the complete list of options.

One notable addition is the --init option that will help you initialize your project with opal-rspec.

New folder defaults 📂

With this update I'm also announcing the new standard paths for the Opal lib/ and spec/ folders, which are lib-opal/ and spec-opal/. This will avoid confusion about the role of the two folders and make them nicely stay near their MRI counterparts in your editor.

In this spirit the opal-rspec command and all rake tasks will look for spec files in spec-opal/.

Note: those defaults are intended as the current best practice to organize an Opal project.


🗓 More than a year has passed since the last post. If you missed Opal, that's because most of the news circulated on the Slack channel, and that's my fault! But rejoice! New posts are coming and the 1.0 release is around the corner 😄

Link to the official chat updated on 2020/04/28.

Opal-RSpec 0.5: Newer RSpec version, improved Rake task, better documentation

Opal-RSpec

If you're a Rubyist, you know about RSpec. It's the testing framework that most of us like. RSpec's code base also happens to use a lot of features of the Ruby language, which means getting it to work on Opal is a challenge, but to some extent, we're there! Even if you're writing code in ES5/ES6, you might enjoy using RSpec as an alternative to Jasmine. Opal's native JS functionality makes that fairly easy to do.

What's new?

There were 394 commits since opal-rspec 0.4.3 that covered a variety of areas:

  • Opal itself - 30+ pull requests went into Opal 0.9 to improve RSpec's stability on Opal. This obviously benefits anyone using opal, not just opal-rspec users.
  • RSpec specs - opal-rspec 0.5 now runs and passes 80%+ of RSpec's own specs. For the first time, limitations, including some present in prior opal-rspec versions, are documented.
  • New versions - Base RSpec version has been upgraded to 3.1 from the 3.0 beta (we know we're still behind, but read on) and the Rake task works with Phantom JS 1.9.8 and 2.0.
  • New features - Node runner support and improved Rake task configurability.

How do you get started?

First step is adding it to your Gemfile and bundle install.

gem 'opal-rspec'

Then you'll need to ensure that, for the default config, you have at least version 1.9.8 of PhantomJS installed.

Then you can start writing specs!

Put this in spec/42_spec.rb

describe 42 do
  subject { 43 }

  it { is_expected.to eq 43 }
end

Then in your Rakefile:

require 'opal/rspec/rake_task'
Opal::RSpec::RakeTask.new(:default)

After running bundle exec rake, you'll see:

Running phantomjs /usr/local/bundle/gems/opal-rspec-0.5.0/vendor/spec_runner.js "http://localhost:9999/"
Object freezing is not supported by Opal

.

Finished in 0.013 seconds (files took 0.163 seconds to load)
1 example, 0 failures

Right off the bat, you can see at least a few things Opal doesn't like about RSpec's code base (the freeze warning), but as of opal-rspec 0.5, those are either limited to this warning message or other documented limitations.

Formatter support works as well.

After running SPEC_OPTS="--format json" bundle exec rake:

Object freezing is not supported by Opal

{"examples":[{"description":"should eq 43", "full_description":"42 should eq 43", "status":"passed", "file_path":"http://localhost", "line_number":9999, "run_time":0.005}], "summary":{"duration":0.013, "example_count":1, "failure_count":0, "pending_count":0}, "summary_line":"1 example, 0 failures"}

Asynchronous testing on Opal

Since opal has a promise implementation built in, opal-rspec has some support for asynchronous testing. By default, any subject, it block, before(:each), after(:each), or around hook that returns a promise will cause RSpec to wait for promise resolution before continuing. In the subject case, any subject resolution in your specs will be against the promise result, not the promise, which should DRY up your specs.

Example: Create spec/async_spec.rb with this content:

describe 'async' do
  subject do
    promise = Promise.new
    delay 1 do
      promise.resolve 42
    end
    promise
  end

  it { is_expected.to be_a Promise }
  it { is_expected.to eq 42 }
end

Result:

Running phantomjs /usr/local/bundle/gems/opal-rspec-0.5.0/vendor/spec_runner.js "http://localhost:9999/"
Object freezing is not supported by Opal

.F.

Failures:

  1) async should be a kind of Promise
     Failure/Error: Unable to find matching line from backtrace
     Exception::ExpectationNotMetError:
       expected 42 to be a kind of Promise
     # ExpectationNotMetError: expected 42 to be a kind of Promise
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:4294
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:50378
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:33927
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:33950
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:33708
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:53200
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:3000
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:52274
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:52348
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:1055
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:14805
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:52554
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:52573
     #     at http://localhost:9999/assets/opal/rspec/sprockets_runner.js:52555
     #
     #   Showing full backtrace because every line was filtered out.
     #   See docs for RSpec::Configuration#backtrace_exclusion_patterns and
     #   RSpec::Configuration#backtrace_inclusion_patterns for more information.

Finished in 2.03 seconds (files took 0.156 seconds to load)
3 examples, 1 failure

Failed examples:

rspec http://localhost:9999 # async should be a kind of Promise

The failure is because we get a number back, not a Promise.

Other new tools in the ecosystem

JUnit & TeamCity/Rubymine formatter support

Chances are your CI tool supports at least one of those. Check out opal-rspec-formatter.

Karma support

If you want to tap into the browser runners that Karma supports but keep using Opal, you might want to check out karma-opal-rspec.

Work to be done

As mentioned above, we're still using RSpec 3.1, which, as of this writing, is almost 1.5 years old. Due to the number of changes necessary to make RSpec Opal-friendly, we cannot keep up with RSpec releases as fast as we would like. That said, several things we're doing should allow us to move faster in the future, including:

  • Constantly improving Opal code base that allows more Ruby features to work out of the box.
  • RSpec specs which helped improve Opal and allow us to easily identify far corners of RSpec's functionality.
  • Work on arity checking is already in-progress (opal-rspec currently doesn't run with arity checking enabled). This will tease out even more issues.

All of these will eventually lead to less monkey patching and more out of the box stuff that works.
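For readers unfamiliar with the term, arity checking means verifying that a method is called with the number of arguments it declares. The plain-Ruby sketch below shows the behavior CRuby/MRI enforces; the method names are made up for illustration. With arity checking disabled, compiled code is not expected to raise in this situation, which is how mismatches can go unnoticed.

```ruby
# A method that declares exactly two required arguments.
def add(a, b)
  a + b
end

# Ruby records how many arguments the method expects.
puts method(:add).arity

# Calling it with too few arguments raises ArgumentError in CRuby/MRI.
def call_with_wrong_arity
  add(1)
rescue ArgumentError => e
  e.message
end

puts call_with_wrong_arity
```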

How to help

Arity checking is the top priority right now. If you can assist with issues like this opal issue that are documented on the opal-rspec arity check issue, that will help move things along.

After that is complete, we can begin work on RSpec 3.4.

Opal 0.10: Rack 2 compatibility, improved keyword argument support, better source maps, and a whole lot more

Opal is now officially on the 0.10.x release path. Thanks to all the contributors to Opal who've worked hard to make this release possible. Some highlights from the latest release:

  • Improvements to source maps
  • Updates to methods in Array, Enumerable, and Module for RubySpec compliance
  • Rack v2 compatibility
  • Support for keyword arguments as lambda parameters, as well as keyword splats
  • Some IO enhancements to better work with Node.js
  • Marshalling support
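As a quick illustration of the keyword argument items in the list above, here is a small sketch in standard Ruby. The constant and method names are hypothetical and chosen only for this example.

```ruby
# Keyword arguments as lambda parameters:
# `name:` is required, `greeting:` has a default value.
GREET = ->(name:, greeting: "Hello") { "#{greeting}, #{name}!" }

# Keyword splat in a parameter list:
# **options collects every keyword that is not named explicitly.
def describe_options(required:, **options)
  "#{required} with #{options.keys.sort.join(', ')}"
end

# Keyword splat at the call site: the hash is expanded into keywords.
DEFAULTS = { greeting: "Hi" }

puts GREET.call(name: "Opal")               # Hello, Opal!
puts GREET.call(name: "Opal", **DEFAULTS)   # Hi, Opal!
puts describe_options(required: "core", source_maps: true, rack: 2)
# core with rack, source_maps
```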

Opal 0.9: direct JS calls, console.rb, numeric updates, better reflection, and fixes aplenty

Opal is now officially on the 0.9.x release path, having just released 0.9.2. Thanks to all the contributors to Opal, especially over the holidays, and we're excited for what the new year will bring to the Opal community. Some highlights from the latest release:

Direct calling of JavaScript methods

You can now make direct JavaScript method calls on native JS objects using the recv.JS.method syntax. It supports method calls, a final callback (as a block), property getters and setters (via #[] and #[]=), splats, JavaScript keywords (via the ::JS module) and global functions (after require "js").

Some examples, first a simple method call:

# javascript: foo.bar()
foo.JS.bar
foo.JS.bar()

Opal 0.7.0: require, kwargs, docs, testing, lots of fixes

It's been almost a year since our 0.6.0 release, and it has been an awesome time for the Opal community. Today I'm proud to announce we have released v0.7.0, which comes packed with lots of good stuff and countless bug fixes.

#require, #require_relative and #require_tree

The require system has been completely overhauled in Opal 0.7. The previous version was a rather smart wrapper around sprockets directives but had some limitations, especially when it came to interleaving require calls and code. Some gems couldn't be compiled with Opal just for that reason.

The new require system now relies on a module repository where each "module" actually corresponds to a Ruby file compiled with Opal. This means that #require calls aren't no-ops anymore.

In addition to that #require_relative support has been added and for feature parity with sprockets directives we're also introducing #require_tree. The latter will be particularly useful to require templates.
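To make the idea of a module repository concrete, here is a minimal plain-Ruby sketch. It is not Opal's actual runtime; ModuleRepository and its methods are hypothetical names used only to show the general shape: each file's code is registered under its path and runs the first time it is required, at the point of the call.

```ruby
# Hypothetical stand-in for a module repository; not Opal's real runtime API.
class ModuleRepository
  def initialize
    @modules = {} # name => callable holding the file's code
    @loaded  = {} # name => true once the code has run
  end

  # Store a file's code under its require path without running it.
  def register(name, &body)
    @modules[name] = body
  end

  # Run the code the first time a name is required; later calls do nothing.
  def require(name)
    return false if @loaded[name]

    body = @modules.fetch(name) { raise LoadError, "cannot load such file -- #{name}" }
    @loaded[name] = true
    body.call
    true
  end

  # Require every registered module located under the given directory prefix.
  def require_tree(prefix)
    @modules.keys.select { |name| name.start_with?("#{prefix}/") }.sort.each { |name| require(name) }
  end
end

LOG  = []
REPO = ModuleRepository.new
REPO.register("app/user")        { LOG << "user" }
REPO.register("templates/index") { LOG << "index" }
REPO.register("templates/show")  { LOG << "show" }

LOG << "before"
REPO.require("app/user")        # runs here, interleaved with surrounding code
LOG << "after"
REPO.require("app/user")        # already loaded, so nothing happens
REPO.require_tree("templates")  # loads both templates

p LOG # ["before", "user", "after", "index", "show"]
```

Because the code runs at the call site rather than being hoisted to the top of the file, a require can sit between two statements and still execute in order, which is the interleaving the older directive-based approach struggled with.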

Keyword Arguments

Keyword arguments have been a much-requested feature, and thanks to Adam Beynon they're now a reality. They still have some rough edges (as they did in their first CRuby/MRI incarnation) but the core is there for you all to enjoy.
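For anyone who has not used them yet, a minimal sketch of optional keyword arguments in standard Ruby follows. The method name and values are made up for illustration, and the post does not say which forms were already supported in Opal 0.7, so treat this as plain Ruby rather than a compatibility claim.

```ruby
# `draft:` and `author:` are optional keywords with default values.
def publish(title, draft: false, author: "anonymous")
  status = draft ? "draft" : "published"
  "#{title} by #{author} (#{status})"
end

puts publish("Opal 0.7")                            # Opal 0.7 by anonymous (published)
puts publish("Notes", author: "Adam", draft: true)  # Notes by Adam (draft)
```

Keywords can be passed in any order, which is the main readability gain over positional option arguments.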


Promises: Handling asynchronous code in Opal

When using Opal, one large omission from the stdlib is Threads. This is because JavaScript does not have threads, which makes asynchronous programming difficult in Ruby. As JavaScript increased in popularity with DOM libraries and web frameworks, callback hell became the standard way to handle asynchronous events in JavaScript. This was also the way events were handled in Opal applications. Until now.

What is so great about promises?

When looking at a simple example, the benefits of promises may not be obvious:

# old callback method
HTTP.get("url") do |response|
  puts "got response"
end

# using promises
HTTP.get("url").then do |response|
  puts "got response"
end

Initially, the method call using a Promise looks like just a more verbose version of the standard callback approach. The benefit of promises comes through when promises are chained together. The #then call actually returns a new Promise instance, which will not be resolved until the result of the first block resolves itself.
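The chaining behavior can be sketched in plain Ruby. MiniPromise below is a hypothetical, heavily simplified stand-in written for this illustration; it is not Opal's Promise class and omits errors and rejection entirely. It only shows that #then returns a new promise, and that this new promise waits when the block itself returns a promise.

```ruby
# Hypothetical, minimal stand-in for a promise; not Opal's Promise class.
class MiniPromise
  def initialize
    @resolved  = false
    @value     = nil
    @callbacks = []
  end

  # Fulfil the promise and run every callback registered so far.
  def resolve(value)
    return nil if @resolved

    @resolved = true
    @value    = value
    @callbacks.each { |callback| callback.call(value) }
    nil
  end

  # Register a block and return a NEW promise for the block's result.
  def then(&block)
    next_promise = MiniPromise.new
    callback = lambda do |value|
      result = block.call(value)
      if result.is_a?(MiniPromise)
        # The block returned a promise: wait for it before resolving the next one.
        result.then { |inner| next_promise.resolve(inner) }
      else
        next_promise.resolve(result)
      end
    end

    if @resolved
      callback.call(@value)
    else
      @callbacks << callback
    end
    next_promise
  end
end

RESULTS         = []
USER_REQUEST    = MiniPromise.new # stands in for a first HTTP request
DETAILS_REQUEST = MiniPromise.new # stands in for a second HTTP request

CHAIN = USER_REQUEST
  .then { |user| RESULTS << "user: #{user}"; DETAILS_REQUEST }
  .then { |details| RESULTS << "details: #{details}" }

USER_REQUEST.resolve("alice")    # first block runs; the chain now waits
DETAILS_REQUEST.resolve("admin") # second block runs only at this point

p RESULTS # ["user: alice", "details: admin"]
```

The second block is attached to the promise returned by the first #then, so it does not run until the promise returned from the first block has been resolved.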

Callback hell

Let's take a slightly more complex example where an initial HTTP request is made for some user details, and then a second request is made using the result of the first JSON response: