Liveblogging OCaml Workshop 2015

Good morning! I’m here bright and early—though a little bit late—for the OCaml Workshop 2015. Judging from the program, it’s going to be a great one. Watch this page for updates, or subscribe to the RSS feed to get notified automatically of new posts. So, time to caffeinate and play a little catch up. The post for the first session will be up in a little bit.

Core.Time_stamp_counter - A fast high resolution time source

Roshan James, Christopher Hardin

The Time_stamp_counter module exposes a “local” time source that’s fast to query, monotonic, and accurate to nanoseconds, which is exactly what you need for fine-grained benchmarking. It does this by using the RDTSC x86 instruction, which reads a register that counts CPU cycles since it was last reset. Fetching a value from this register can take as little as 2ns and gives you nanosecond resolution, whereas the gettimeofday() system call can take as much as 800ns on some systems and only gives you microsecond resolution.

The time recovered from this register does need to be calibrated against system time occasionally. If you’re using Async and the global calibrator, this will happen automatically for you every few seconds. If there’s a discontinuity with system time, Time_stamp_counter will smooth it out while maintaining monotonicity. I’ll see if I can find the graphs from the talk and put them up here.
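To give a flavor of how this looks in code, here’s a minimal sketch of timing a single call with Time_stamp_counter. I’m reconstructing the API from memory, so treat the function names as approximate and check the mli; in particular, newer versions of Core may want an explicit calibrator when converting cycle counts to nanoseconds.

```ocaml
open Core.Std

(* Rough sketch, reconstructed from memory -- check Time_stamp_counter's
   mli for the exact signatures before copying this. *)
let time_ns f =
  let start = Time_stamp_counter.now () in   (* a ~2ns RDTSC read *)
  let result = f () in
  let stop = Time_stamp_counter.now () in
  (* [diff] is a span measured in cycles; converting it to nanoseconds
     relies on the calibrator, which (under Async) is re-synced against
     system time every few seconds. *)
  let cycles = Time_stamp_counter.diff stop start in
  result, Time_stamp_counter.Span.to_ns cycles
```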

Session 2 - Low-level concerns

Already falling behind a bit, and the late start didn’t help… it was a long night. Apologies to the presenters, who gave great talks, but I’m going to go with the one-line summaries for this session.

Specialization of Generic Array Accesses After Inlining

Ryohei Tokuda, Eijiro Sumii, Akinori Abe

Unboxed arrays of ints and floats are much faster, but the compiler doesn’t specialize generic array accesses automatically. They had to modify the compiler to add type abstraction and application to its intermediate language, and they got some pretty big performance improvements on numerical workloads.
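To make concrete what “specialization after inlining” buys you, here’s a small example of the kind of code this targets (my own illustration, not code from the talk). A generic a.(i) on a polymorphic array has to dispatch at runtime on whether the array is a flat float array or an array of boxed values; once the polymorphic function is inlined at a call site where the element type is known, the access can be rewritten as a direct unboxed load.

```ocaml
(* A polymorphic sum over arrays: the element access is generic, so the
   compiled code must check at runtime whether the array is a flat float
   array or an array of boxed values. *)
let sum_with (add : 'a -> 'a -> 'a) (zero : 'a) (a : 'a array) : 'a =
  let acc = ref zero in
  for i = 0 to Array.length a - 1 do
    acc := add !acc a.(i)          (* generic Array.get *)
  done;
  !acc

(* At this call site the element type is known to be [float].  After
   inlining [sum_with], a specialization pass can turn the generic
   accesses into direct unboxed float loads, which is where the big
   numerical speedups come from. *)
let sum_floats (a : float array) : float = sum_with ( +. ) 0.0 a
```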

Inline Assembly in OCaml

Vladimir Brankov

Pull request #162 implements inline assembly for OCaml, with a syntax similar to gcc’s inline assembly. Unfortunately, the pull request will not be merged into master, but you can try it out using opam switch 4.03.0+pr162.

Towards A Debugger for Native-Code OCaml

Fabrice Le Fessant, Pierre Chambart

There’s currently support for debugging OCaml programs in gdb. However, that support is somewhat limited. The most obvious limitation is the inability to introspect values: you can see where you are in an execution, but not what’s currently being processed.

The guys at OCamlPro are working on fixing that. But instead of working on gdb, they’re building out support for lldb, the debugger from the LLVM project and the default debugger on OS X. Among the things they currently support (which, mind you, are huge improvements over the gdb support) are:

What needs to be done:

This will be open-sourced, pending code cleanup.

Operf - Benchmarking the OCaml Compiler

Pierre Chambart, Fabrice Le Fessant, Vincent Bernardoff

This is a cool idea for ensuring quality in the compiler in the face of new and rapid development, at least with respect to performance. Users can submit microbenchmarks along with assumptions about the results of those benchmarks, which can then be used during the development of compiler patches to watch out for regressions. Not only that, but if you’re a user who submitted a microbenchmark, you can be notified at a future date if a merged compiler patch invalidates it. Not the best thing in the world when stuff slows down, but at least you’ll know about it. You can find the collection of microbenchmarks here if you’d like to contribute.
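I didn’t catch the exact interface operf uses for declaring a benchmark and its expected result, so here’s only a hypothetical sketch of the idea; none of these type or function names are operf’s actual API.

```ocaml
(* Hypothetical sketch only -- NOT operf's real interface.  The idea: a
   microbenchmark is a function to run plus an assumption about its
   behavior that future compiler patches can be checked against. *)

type assumption =
  | Runs_in_under_ns of int        (* e.g. "stays under this many ns"  *)
  | Linear_in_input_size           (* e.g. "cost grows linearly in n"  *)

type micro_benchmark = {
  name : string;
  run : int -> unit;               (* run the benchmark at size [n]    *)
  assumption : assumption;
}

let list_rev_bench = {
  name = "List.rev";
  run = (fun n -> ignore (List.rev (Array.to_list (Array.init n (fun i -> i)))));
  assumption = Linear_in_input_size;
}
```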

But to take it a step further, the OCaml community can also collect a set of “macro” benchmarks that are bigger, longer-running programs. OCamlPro has set up a repository to collect these and runs the benchmarks with almost every combination of compiler flags imaginable. Once a benchmark goes through this system, you can look at a table of results that will tell you how much of a performance speedup (or slowdown) enabling or disabling a single flag will produce. This has a very similar feel to the opam publishing process! Here is the repository that collects those benchmarks.