Euan's Blog

Generating strings at compile time in Nim

When working with C libraries, you often find yourself passing a list of values to a function as a single string. One example of an API like this is the pledge(2) system call in OpenBSD, where the first parameter is a string listing the promises to pledge to.

Working with these string arguments can be brittle and may lead to runtime errors due to mistyped parameters. It would be much nicer if we could pass strongly typed pre-defined values via enums instead, and let the compiler worry about validating the arguments.

Luckily, Nim makes this extremely easy using templates. Let's walk through an example. We'll take the pledge(2) system call and run with it here.

Defining the available values

To start with, we need to define the available values. The man page for pledge(2) contains a list of allowable promises. At the time of writing these include stdio, rpath, wpath, cpath and dpath, among others (several are omitted here for brevity).

We'll now define an enum that contains these values. I tend to prefer to use pure enums in Nim, meaning I have to write out the full Promise.Stdio rather than just Stdio. This is primarily due to familiarity with other languages where pure enums are the default (and often only) choice.

type Promise* {.pure.} = enum
  ## The possible operation sets that a program can pledge to be limited to.
  Stdio = "stdio",
  Rpath = "rpath",
  Wpath = "wpath",
  Cpath = "cpath",
  Dpath = "dpath"

This enum defines the string value for each of its members, which will come in handy later on.
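As a quick sketch of what that buys us (using a cut-down copy of the enum so the snippet stands on its own), the `$` operator turns each member back into the string it was declared with:

```nim
# Cut-down copy of the enum above, just for this illustration.
type Promise {.pure.} = enum
  Stdio = "stdio",
  Rpath = "rpath"

# `$` on an enum member returns the string value it was declared with.
echo $Promise.Stdio
echo $Promise.Rpath
```

This is the conversion we'll lean on later when gluing promises together into a single string.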

Wrapping the base system call

The next task is to import and wrap the base system call from the C API.

Looking at the man page for pledge(2), we can see the system call has the following synopsis:

#include <unistd.h>

int pledge(const char *promises, const char *execpromises);

We can easily translate this into Nim so that we can interface to it:

proc pledge_c(promises: cstring, execpromises: cstring): cint {.importc: "pledge", header: "<unistd.h>".}
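pledge(2) only exists on OpenBSD, so if you want to try the same importc pattern on another POSIX system, here is a minimal sketch that binds getpid(2) from the same header instead. The name getpid_c is made up for this example, and I'm assuming a platform where pid_t fits in a C int:

```nim
# Hypothetical binding for illustration only: same pattern as pledge_c,
# but for a call that is available on any POSIX system.
proc getpid_c(): cint {.importc: "getpid", header: "<unistd.h>".}

echo "current process id: ", getpid_c()
```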

We now want to write a wrapper function that our Nim library will expose publicly. It should provide error handling and accept standard Nim string values rather than raw cstring values, so let's have a look at the description and return values of the system call. I've picked out the interesting bits:

A promises value of "" restricts the process to the _exit(2) system call. This can be used for pure computation operating on memory shared with another process.

Passing NULL to promises or execpromises specifies to not change the current value.

So we know that an empty string ("") is treated differently to a null value, meaning our Nim library needs to allow passing either - we'll do this using an Option as strings cannot be nil in Nim.
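To make the three cases concrete, here is a small sketch of how an Option[string] separates them. The describePromises helper is purely illustrative and not part of the library:

```nim
import std/options

proc describePromises(promises: Option[string]): string =
  ## Illustrative helper: reports what pledge(2) would be asked to do.
  if promises.isNone:
    "NULL: leave the current promises unchanged"
  elif promises.get().len == 0:
    "\"\": restrict the process to _exit(2)"
  else:
    "\"" & promises.get() & "\": restrict to the listed promises"

echo describePromises(none(string))
echo describePromises(some(""))
echo describePromises(some("stdio rpath"))
```

A plain string parameter could only express the last two cases, which is why the wrapper takes an Option.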

Upon successful completion, the value 0 is returned; otherwise the value -1 is returned and the global variable errno is set to indicate the error.

So we know that if the system call returns anything but 0, we need to raise an exception.
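The usual way to turn errno into a Nim exception is raiseOSError(osLastError()) from the os module. As a rough sketch of that pattern on a call that fails on any POSIX system (chdir(2) to a path I'm assuming doesn't exist; the checked helper is made up for this example):

```nim
import std/[os, posix]

proc checked(rc: cint) =
  ## Illustrative helper: raises an OSError built from errno when a
  ## C call reports failure with a non-zero return value.
  if rc != 0:
    raiseOSError(osLastError())

try:
  # Assumed not to exist, so chdir returns -1 and sets errno.
  checked(chdir("/this/path/should/not/exist"))
except OSError as e:
  echo "call failed: ", e.msg
```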

Now let's write a wrapper function!

import std/[options, os, strformat]

proc pledge*(promises: Option[string], execPromises: Option[string] = none(string)) =
    ## Pledge to use only the defined functions.
    ##
    ## `some("")` restricts the process to the `_exit(2)` system call;
    ## `none(string)` passes NULL and leaves the current promises unchanged.

    # first check if pledge is available at all
    if (osVersion.major == 5 and osVersion.minor != 9) or osVersion.major < 5:
      raise newException(PledgeNotAvailableError, &"pledge(2) system call is not available on OpenBSD {osVersion.major}.{osVersion.minor}")

    # now check if execPromises is set - it's only available from OpenBSD 6.3+
    if (osVersion.major < 6 or (osVersion.major == 6 and osVersion.minor <= 2)) and execPromises.isSome():
      raise newException(PledgeExecPromisesNotAvailableError, &"cannot use execpromises with pledge(2) on OpenBSD {osVersion.major}.{osVersion.minor}")

    let promisesValue: cstring = if promises.isSome(): cstring(promises.get()) else: nil
    var execPromisesValue: cstring = nil

    # if running on OpenBSD <= 6.2, execpromises should be passed as NULL
    if osVersion.major > 6 or (osVersion.major == 6 and osVersion.minor > 2):
      execPromisesValue = if execPromises.isSome(): cstring(execPromises.get()) else: nil

    if pledge_c(promisesValue, execPromisesValue) != 0:
      raiseOSError(osLastError())

This is slightly more involved! The API for the pledge(2) system call has changed a little over OpenBSD releases, and it wasn't available at all until OpenBSD 5.9. We check which version of OpenBSD is running (the code that populates osVersion and defines the two exception types is omitted from this post for the sake of brevity) and either bail out of the function with an exception, or alter the arguments that we pass to the system call.
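For the curious, here is one way the version checks could be sketched. This is not the library's actual code: the OsVersion type, parseOsVersion, pledgeAvailable and execPromisesAvailable are all names invented for this example, and I'm assuming the release string has the major.minor form (for example "6.3"):

```nim
import std/strutils

type OsVersion = tuple[major, minor: int]

proc parseOsVersion(release: string): OsVersion =
  ## Assumes a release string of the form "major.minor", e.g. "6.3".
  let parts = release.split('.')
  (major: parseInt(parts[0]), minor: parseInt(parts[1]))

proc pledgeAvailable(v: OsVersion): bool =
  ## pledge(2) first appeared in OpenBSD 5.9.
  v.major > 5 or (v.major == 5 and v.minor >= 9)

proc execPromisesAvailable(v: OsVersion): bool =
  ## The execpromises argument is usable from OpenBSD 6.3.
  v.major > 6 or (v.major == 6 and v.minor >= 3)

let v = parseOsVersion("6.2")
echo "pledge available: ", pledgeAvailable(v)
echo "execpromises available: ", execPromisesAvailable(v)
```

Keeping the checks in small pure procs like these makes them easy to test on a machine that isn't running OpenBSD.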

Adding a template to allow passing enum values

The wrapper function we defined above takes string optionals, meaning you call it as follows:

pledge(some("stdio rpath"))

Or, to pass a null promise set and leave the current promises unchanged:

pledge(none(string))

We want to make that a little more friendly by using our enum, meaning it can be used as follows:

pledge(Promise.Stdio, Promise.Rpath)

Luckily, Nim's templates make this particularly easy. For this case, we're only going to handle the first parameter to the pledge(2) system call: promises. This allows us to pass a varargs[Promise] to the template. We then want to de-duplicate the promises and glue them into a string.

We'll start by writing a compile time function that takes a list of promises and returns a string:

proc getPromisesString(promises: openarray[Promise]): string {.compiletime.} =
  var
    promiseSet: set[Promise] = {}
    sep = ""

  for p in promises:
    if p notin promiseSet:
      promiseSet.incl(p)
      result.add(sep)
      result.add($p)

      sep = " "

This procedure loops through the provided promises. If a promise isn't already in promiseSet, the set of promises processed so far, we add it to the result (preceded by a space for every promise after the first) and insert it into the set.
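To see the de-duplication in action, here is a self-contained sketch of the same loop. It uses a cut-down copy of the enum and a plain proc, which I've called joinPromises for this example, without the {.compiletime.} pragma so that it can be called both at run time and from a const:

```nim
# Cut-down copy of the enum, just for this illustration.
type Promise {.pure.} = enum
  Stdio = "stdio",
  Rpath = "rpath",
  Wpath = "wpath"

proc joinPromises(promises: openArray[Promise]): string =
  ## Same logic as getPromisesString, minus the compile-time-only pragma.
  var seen: set[Promise] = {}
  for p in promises:
    if p notin seen:
      seen.incl(p)
      # Separate entries with a space, but don't lead with one.
      if result.len > 0:
        result.add(' ')
      result.add($p)

# Assigning to a const forces evaluation in the compiler's VM.
const joined = joinPromises([Promise.Stdio, Promise.Rpath, Promise.Stdio])
echo joined
```

The duplicate Promise.Stdio is dropped and the order of first appearance is kept, so joined holds "stdio rpath".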

Now for the template that makes use of this procedure:

template pledge*(promises: varargs[Promise]) =
  ## Pledge to use only the defined functions.
  ##
  ## This template takes a list of `Promise`, creates the required promise string and emits a call to the `pledge` proc.
  if len(promises) > 0:
    const promisesString = getPromisesString(promises)
    pledge(some(promisesString))
  else:
    pledge(none(string))

If the list of passed in promises is empty (for example, if just pledge() is called), then we emit a call to the wrapper procedure with a null promises string which results in the promises not being changed. If the list is not empty, we build the promises string at compile time, then emit a call to the wrapper procedure with this constant string value.

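We can be sure that the promisesString is built at compile time: it is declared as a const, whose value must be known to the compiler, and we marked the getPromisesString procedure with the {.compiletime.} pragma, so it can only be run by the compiler.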

Verifying the magic happens at compile time

You don't have to take my word for it that this happens when we compile the program - we can verify it pretty easily!

Let's write a simple program that makes use of our little library. This program will pledge the stdio and rpath promises only (only stdio is required, but we want to use multiple promises to see how our template works), then print Hello, world! to standard out:

import pledge

proc main() =
  pledge(Promise.Stdio, Promise.Rpath)

  echo "Hello, world!"

when isMainModule:
  main()

Since Nim compiles to C as an intermediate step, we can generate the C code for this program and then look at it to see what gets emitted. Let's generate it first:

nim c -c -d:danger --nimcache:./nimcache test_pledge.nim

This compiles the program to its intermediate C code without building or linking it (the -c flag). The -d:danger flag selects the danger preset for the compiler's optimisation and safety levels, which keeps extra debug information out of the generated C and makes it easier to read. The --nimcache:./nimcache flag sets the compiler's cache directory to a local nimcache directory.

Running this command will create a ./nimcache directory with a few files. The one we're interested in will be named something like ./nimcache/@mtest_pledge.nim.c. Let's open it and have a look.

We're looking for our main() procedure first. It's easiest to do a find in the file for this, as Nim's name mangling will have added a suffix to the procedure name. In my case, it's actually called main__uHr7ijCDNVBz6LCPCElMnQ and it ends up looking like this:

N_LIB_PRIVATE N_NIMCALL(void, main__uHr7ijCDNVBz6LCPCElMnQ)(void) { {
    tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A T5_;
    tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A T6_;
    if (!NIM_TRUE) goto LA3_;
    nimZeroMem((void*)(&T5_), sizeof(tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A));
    some__CxbVJ9cqvyc5uz0Ex4RuIKw(((NimStringDesc*) &TM__gNQ9bK0v9b5sOikIs4BrpGvQ_2), (&T5_));
    nimZeroMem((void*)(&T6_), sizeof(tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A));
    none__sxVpCqMBELxuy9cW9boFAqHw((&T6_));
    pledge__lgjpkV9aHfXG7wdFAJ4YwwA(T5_, T6_);
  }
  goto LA1_;
  LA3_: ;
  {
    tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A T8_;
    tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A T9_;
    nimZeroMem((void*)(&T8_), sizeof(tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A));
    none__sxVpCqMBELxuy9cW9boFAqHw((&T8_));
    nimZeroMem((void*)(&T9_), sizeof(tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A));
    none__sxVpCqMBELxuy9cW9boFAqHw((&T9_));
    pledge__lgjpkV9aHfXG7wdFAJ4YwwA(T8_, T9_);
  }
  LA1_: ;
  echoBinSafe(TM__gNQ9bK0v9b5sOikIs4BrpGvQ_3, 1);
}

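There's quite a lot of extra code in here, but the bits we're interested in are the two calls to the pledge procedure, one for each branch of the if statement in our template. The first call passes T5_ as the promises argument and the second passes T8_. The condition has already been folded to a constant (if (!NIM_TRUE) goto LA3_;), so only the first branch, the one that uses T5_, can ever run.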

If we look for other mentions of T5_ in the code, we can see that T5_ is defined as a tyObject_Option__vK1KzfYf1DGLiUIpLm9cS0A which is an Option type in Nim. We can see that the nimZeroMem procedure is called with T5_ as an argument to zero the memory, then some__CxbVJ9cqvyc5uz0Ex4RuIKw is called to set the value of the option to the some value. The actual value itself is defined as TM__gNQ9bK0v9b5sOikIs4BrpGvQ_2.

If we look for TM__gNQ9bK0v9b5sOikIs4BrpGvQ_2 in the code, we see it defined at the top of the file:

STRING_LITERAL(TM__gNQ9bK0v9b5sOikIs4BrpGvQ_2, "stdio rpath", 11);

We can clearly see here that this is defined as a string literal containing stdio rpath - hey that's the two promises we pledged!

Making life easy: an existing wrapper library for pledge(2)

Luckily, I've already written a wrapper library that's published in the Nimble packages collection to wrap the pledge(2) system call, as well as the unveil(2) system call. This library is available on GitHub if you want to see more.

I hope this is helpful to others looking to wrap C APIs and give them a little more of a Nim feel.

#Nim