Ogma v0.5 Release
ogma is a scripting language written in Rust focused on ergonomically and efficiently processing tabular data. Mixing aspects of terminal shells and functional programming, the ogma project lets one interact with data in a refreshing way.
ogma version 0.5 has been released, bringing many bug fixes on top of the type inference update.
Some of the major updates were behind the scenes, reworking the compiler to be more robust with
type inference and variable resolution, which should reduce the internal compilation errors seen
in the wild. This post delves into the introduction of TypesSets and the variable sealing rework.
Find the release binaries and notes at the GitHub repo.
TypesSets
Since the introduction of type inference, the compiler could encounter code which would fail to
type check but could be trivially reasoned about. Much of the issue lay in how the compiler
trialled different types, and in the limited deductions it could make without a concrete set of
constraints.
To solve this issue, the notion of a TypesSet was introduced. The set begins as a master
superset of all known types and shrinks as more information is gathered and constraints are
placed on each node.
The set replaced the single type held by the Inferred variant of a node within the type graph, so
effectively each node maintains a set of inferred types. The actual change wasn't that large:
 enum Knowledge {
     // No longer represented, since the types set will dictate ambiguity
-    Unknown,
     Any,
     Known(Type),
     Obliged(Type),
-    Inferred(Type),
+    Inferred(TypesSet)
 }

 struct TypesSet(Rc<HashSet<Type>>);
Notice that the Unknown variant is removed; it is now represented by a set that is ambiguous
(holds more than one type). The TypesSet structure is simply a reference-counted HashSet. The
reference counting reduces the memory burden: since the type graph is initialised with the master
superset, there is no need to keep clones of it in memory. When a set is reduced, it is lazily
cloned (copy-on-write), and not all sets go this way; some are immediately overwritten with known
knowledge.
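The copy-on-write behaviour described above can be sketched with `Rc::make_mut` from the standard library. This is a minimal illustration, not ogma's actual implementation: the `Type` variants and the `full`, `remove` and `is_ambiguous` helpers are assumptions made up for the example.

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Hypothetical stand-in for ogma's type representation.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Type {
    Num,
    Str,
    Bool,
    Table,
}

#[derive(Debug, Clone)]
struct TypesSet(Rc<HashSet<Type>>);

impl TypesSet {
    /// The master superset of every known type (assumed list).
    fn full() -> Self {
        TypesSet(Rc::new(HashSet::from([
            Type::Num,
            Type::Str,
            Type::Bool,
            Type::Table,
        ])))
    }

    /// Remove a candidate type. `Rc::make_mut` clones the inner set only
    /// when it is shared, so untouched nodes keep pointing at the master set.
    fn remove(&mut self, ty: &Type) -> bool {
        Rc::make_mut(&mut self.0).remove(ty)
    }

    /// More than one candidate left: the state the old `Unknown` variant covered.
    fn is_ambiguous(&self) -> bool {
        self.0.len() > 1
    }
}
```

Cloning a `TypesSet` only bumps the reference count; the underlying `HashSet` is duplicated the first time a shared set is reduced.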
Using sets allows for faster deductions to be made. When the type flow occurs, sets that flow
between one another can leverage set intersection to reduce each of them. Checking for a valid
intersection is simple: the Rust standard library's HashSet has a great API.
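As a rough sketch of that deduction step, the standard `HashSet::is_disjoint` and `HashSet::intersection` methods are enough to reduce two candidate sets and detect a conflict. The `Type` variants and the `flow` function are hypothetical names for illustration; the real compiler's flow logic is not shown in this post.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for ogma's type representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Type {
    Num,
    Str,
    Bool,
    Table,
}

/// Reduce two candidate sets to the types they share.
/// `None` means the sets have nothing in common, which would be a type error.
fn flow(a: &HashSet<Type>, b: &HashSet<Type>) -> Option<HashSet<Type>> {
    if a.is_disjoint(b) {
        return None;
    }
    Some(a.intersection(b).copied().collect())
}
```

Both nodes connected by a flow can then adopt the returned set, so each flow can only shrink the candidates.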
The change to types sets also heralded a more constrained way of implementing intrinsic commands.
The compiler framework already existed; this release leverages it to unleash the power of set
deductions.
Moving to types sets for the inference fixed many of the outstanding type inference bugs, and
provides a robust foundation to get polymorphism out of your ogma code.
Variable shadowing
Whilst working through the bugs, a subtle variable shadowing bug arose: in uncommon cases, the shadowed variable would be the one referenced at runtime.
ls | let $x | grp type | map value
{:Table \ $x | let $row.key:Str $x | fold {Table $x} append-row $row.size:Num }
                                |                ^ $x should be a string but was compiling as a table
                                ^ $x gets reassigned here
To solve this issue, the assumptions the compiler made about sealing nodes from variable introduction had to be reworked. The new method leverages the locals graph to strictly define the lexical parents which can introduce variables, and which need to be sealed before any concrete questions about a variable's existence can be answered. This was especially tricky since most commands do not introduce variables; however, applying that as a default assumption led the compiler to eagerly compile blocks with stale variables. The new system makes the compiler more pessimistic, but allows fine-grained control to tell the compiler when a command will not introduce any more variables and can be sealed.
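The idea of refusing to answer until the lexical parents are sealed can be illustrated with a toy model. Everything here (`Locals`, `Scope`, `Lookup` and the method names) is a hypothetical sketch for explanation only; it is not the structure of ogma's locals graph.

```rust
use std::collections::HashMap;

/// Answer to "does this variable exist at this node?"
#[derive(Debug, PartialEq)]
enum Lookup {
    Found(&'static str), // name of the variable's type
    Missing,
    NotYetKnown, // an unsealed scope may still introduce or shadow it
}

struct Scope {
    parent: Option<usize>,
    vars: HashMap<&'static str, &'static str>,
    sealed: bool,
}

#[derive(Default)]
struct Locals {
    scopes: Vec<Scope>,
}

impl Locals {
    fn add_scope(&mut self, parent: Option<usize>) -> usize {
        self.scopes.push(Scope { parent, vars: HashMap::new(), sealed: false });
        self.scopes.len() - 1
    }

    fn define(&mut self, scope: usize, name: &'static str, ty: &'static str) {
        self.scopes[scope].vars.insert(name, ty);
    }

    /// Declare that `scope` will introduce no further variables.
    fn seal(&mut self, scope: usize) {
        self.scopes[scope].sealed = true;
    }

    /// Walk outwards through the lexical parents. Pessimistic by design:
    /// any unsealed scope on the way blocks a concrete answer.
    fn lookup(&self, scope: usize, name: &str) -> Lookup {
        let mut current = Some(scope);
        while let Some(id) = current {
            let s = &self.scopes[id];
            if !s.sealed {
                return Lookup::NotYetKnown;
            }
            if let Some(ty) = s.vars.get(name) {
                return Lookup::Found(*ty);
            }
            current = s.parent;
        }
        Lookup::Missing
    }
}
```

In this model, a block whose own scope is still unsealed gets `NotYetKnown` for `x` even though an outer scope already defines `x` as a table, so it cannot be compiled against the stale binding. Once the inner `x` is defined and the scope sealed, the lookup resolves to the inner, shadowing definition.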
Looking forward
ogma's type system has reached enough power to cover most polymorphic needs.
Feature-wise, there are lots of commands to implement, along with the milestone of partitions,
which is key to creating more modular code bases.
Open source support through code contributions, sponsorship, adoption and sharing is much
appreciated!