
Extempore LLVM 3.8 to LLVM 10 upgrade log

Extempore is a “full-stack” (not in the web dev sense) live coding environment. It uses LLVM to JIT compile XTLang code. Currently it depends on LLVM 3.8 and I’m trying to get it up to LLVM 10. This page is an unorganized set of notes and stream of thoughts written while working on this task.

The story as of 2020-05-25

Refactoring

I’ve done some refactoring work in this branch: llvm-refactor. You can also view the diff. Mostly I’ve been trying to isolate LLVM code so it’s easier to swap it out. What will probably happen is I’ll update the build to link against LLVM10 and then hack away at it until it compiles, but the dream would be to refactor in a way that I could link against one or the other by changing some config.

Ben Swift has added CI to the main extempore repo which I’ve merged into llvm-refactor. As expected it shows that my code is totally broken and I need to fix that.

DONE get the builds working on llvm-refactor

The build works for me, so I think I need to recreate one of the failing environments and git bisect.

I got the builds working on <2020-06-27 Sat> by commenting out code until I could see what was broken. It was the order of headers.

ORCv2

ORC is the LLVM API for building JIT compilers. Here is a good overview. It works differently to the old MCJIT API that Extempore currently uses. I’ve been playing with the API trying to understand how I can compile the LLVM IR that Extempore emits with it.

<2020-05-25 Mon>

About two weeks ago I started digging into Extempore’s LLVM IR and that’s when I realised I needed to start writing some stuff down so here we are. I had emailed Ben saying I was looking into function definition and he said:

I can’t even remember, do we currently allow function redefinition? Or do we just use a single struct for the “closure” (which includes pointers to the let-bound variables, the memory zone, and the actual function pointer). Because in that case you just need to swap the old function pointer for the new one.

This made me realise I actually have no idea what the IR that comes out of Extempore looks like, which is probably something I should understand if I’m trying to compile it with ORC. So I hacked up the compile function to print it out:

static llvm::Module *jitCompile(std::string asmcode) {
  // ...

  static std::ofstream compiles("/tmp/compiles.txt");
  compiles << asmcode;
  compiles << std::endl << "-----------------" << std::endl;

  // ...
}

I compiled this:

(bind-func add1
  (lambda (x)
    (+ x 1)))

And then this:

(bind-func add1
  (lambda (x)
    (+ x 2)))

Which spat out a lot of IR. From the first compile: txt/add1.ll. From the second: txt/add1again.ll.

The interesting bits:

define i64 @add1_adhoc_W2k2NCxpNjRd__3881(i8* %_impz,i8* %_impenv, i64 %x) {
  ;; ...
  %xPtr = alloca i64
  store i64 %x, i64* %xPtr


  %val3883 = load i64, i64* %xPtr
  %val3884 = add i64 %val3883, 1
  ret i64 %val3884
}

define {i8*, i8*, i64 (i8*, i8*, i64)*}** @add1_adhoc_W2k2NCxpNjRd_maker(i8* %_impz) {
  ;; ...
}

@add1_adhoc_W2k2NCxpNjRd_var = dllexport global [1 x i8*] [ i8* null ]

@add1_adhoc_W2k2NCxpNjRd_var_zone = dllexport global [1 x i8*] [ i8* null ]

define void @add1_adhoc_W2k2NCxpNjRd_setter() {
  ;; ...
}

define i8* @add1_adhoc_W2k2NCxpNjRd_getter() {
  ;; ...
}

define i64 @add1_adhoc_W2k2NCxpNjRd(i64 %arg_0) {
  ;; ...
}

define i64 @add1_adhoc_W2k2NCxpNjRd_native(i64 %arg_0) {
  ;; ...
}

define i8* @add1_adhoc_W2k2NCxpNjRd_scheme(i8* %_sc, i8* %args) {
  ;; ...
}

define void @add1_adhoc_W2k2NCxpNjRd_callback(i8* %dat, %mzone* %inzone) {
  ;; ...
}

And then when I compiled it again:

define i64 @add1_adhoc_W2k2NCxpNjRd__3908(i8* %_impz,i8* %_impenv, i64 %x) {
  ;; ...
  %val3911 = add i64 %val3910, 2
  ret i64 %val3911
}

define {i8*, i8*, i64 (i8*, i8*, i64)*}** @add1_adhoc_W2k2NCxpNjRd_maker(i8* %_impz) {
  ;; ...
}

define void @add1_adhoc_W2k2NCxpNjRd_setter() alwaysinline {
  ;; ...
}

The first thing I see is that the _maker and _setter functions are compiled again. Currently I don’t know how to get ORCv2 to do this so we’ll have to come up with something. Before we think about that though, let’s dig into what each one of these things does.

<2020-06-06 Sat>

define i64 @add1_adhoc_W2k2NCxpNjRd__3881(i8* %_impz,i8* %_impenv, i64 %x) {
  ;; ...

  %val3883 = load i64, i64* %xPtr
  %val3884 = add i64 %val3883, 1
  ret i64 %val3884
}

This looks like the body of the function – we can see add i64 %val3883, 1 which presumably is part of (+ x 1). When we redefine the function with (+ x 2) we get:

define i64 @add1_adhoc_W2k2NCxpNjRd__3908(i8* %_impz,i8* %_impenv, i64 %x) {
  ;; ...
  %val3911 = add i64 %val3910, 2
  ret i64 %val3911
}

You can see the IR is adding two, like we would expect. Notice the name of the function is unique: it ends with 3908 where the previous one ended in 3881. By reading Andrew Sorensen’s thesis on Extempore we can get more info! Chapter 4, “The implementation of a live language”, has lots of useful info.

<2020-06-07 Sun>

In fact the compiler builds, not only one target, but several. The target presented here times2_adhoc_W2k2NCxpNjRd__264, is the function worker, the worker target in XTLang parlance, the primary call destination of an XTLang call site, such as (times2 3). However, XTLang functions are closures, and extra machinery is required.

Okay so we can call this the worker target.

In addition to generating a worker for times2 the XTLang compiler also generates a maker target, for constructing a times2 closure, a native target which enables C functions to call XTLang worker targets directly (i.e. allows C to call XTLang closures directly), and a scheme target which automatically manages Scheme-C-FFI conversion.

Evidently @add1_adhoc_W2k2NCxpNjRd_maker must be the maker target, one of the two functions (along with _setter) that get redefined.

the maker target is responsible for the construction of closure environments (the environment) which include a reference to the worker target (the function). All functions in XTLang are closures (a function and its environment).

OK so let’s go through the maker function line by line. First, from init.ll we know that

%mzone = type {i8*, i64, i64, i64, i8*, %mzone*}

Which seems to correspond to

struct llvm_zone_t {
  void* memory;
  uint64_t offset;
  uint64_t mark;
  uint64_t size;
  zone_hooks_t* cleanup_hooks;
  llvm_zone_t* memories;
};

From EXTLLVM.h.

Section 4.9:

One solution is to extend the stack discipline to life-cycles beyond a single function call by introducing the concept of a memory zone. Zone’s are stored in a stack of zones where each new Zone is pushed onto the stack as it is created. New memory allocations are always made from the topmost zone of the zone-stack. When the topmost zone is popped from the zone-stack any memory allocated against that zone is automatically freed.

Memory zones in XTLang are auto-expanding. If allocation reaches the end of the available zone memory, a new allocation (doubling the existing allocation) is made, and linked (a simple linked list) to the active zone. In this way a zone can expand indefinitely, however, a zone can be hard limited to restrict the possible maximum size of expansion.
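To make the zone-stack idea concrete, here is a minimal sketch in C++. It is not Extempore’s implementation: ToyZone and ToyZoneStack are names made up for illustration, each zone is a fixed-size bump allocator, and the auto-expansion described above is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy zone: a fixed-size bump allocator (no auto-expansion).
struct ToyZone {
    std::vector<std::uint8_t> memory;
    std::size_t offset = 0;

    explicit ToyZone(std::size_t size) : memory(size) {}

    void* alloc(std::size_t n) {
        assert(offset + n <= memory.size());  // a real zone would expand here
        void* p = memory.data() + offset;
        offset += n;
        return p;
    }
};

// Hypothetical zone stack: allocations always come from the topmost zone.
struct ToyZoneStack {
    std::vector<std::unique_ptr<ToyZone>> zones;

    void push(std::size_t size) {
        zones.push_back(std::make_unique<ToyZone>(size));
    }

    ToyZone& top() {
        assert(!zones.empty());
        return *zones.back();
    }

    void* alloc(std::size_t n) { return top().alloc(n); }

    // Popping destroys the zone, which frees everything allocated from it.
    void pop() {
        assert(!zones.empty());
        zones.pop_back();
    }
};
```

Nothing is freed individually here. Memory goes away only when its whole zone is popped, which is the stack discipline the thesis describes.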

One of the arguments passed in is i8* %_impz which looks like it’s a pointer to some memory zone. The first thing we do is allocate some memory from that zone:

%_impzPtr = alloca i8*
store i8* %_impz, i8** %_impzPtr
%tzone3904 = load i8*, i8** %_impzPtr
%zone3905 = bitcast i8* %tzone3904 to %mzone*

; let assign value to symbol add1_adhoc_W2k2NCxpNjRd
%dat_add1_adhoc_W2k2NCxpNjRd = call i8* @llvm_zone_malloc(%mzone* %zone3905, i64 8)
%add1_adhoc_W2k2NCxpNjRdPtr = bitcast i8* %dat_add1_adhoc_W2k2NCxpNjRd to { i8*, i8*, i64 (i8*, i8*, i64)*}***
%tzone3885 = load i8*, i8** %_impzPtr
%zone3886 = bitcast i8* %tzone3885 to %mzone*
call void @llvm_zone_mark(%mzone* %zone3886)

And then we call llvm_zone_mark. What does that do?

define private void @llvm_zone_mark(%mzone* %zone) nounwind alwaysinline
{
  %offset_ptr = getelementptr inbounds %mzone, %mzone* %zone, i32 0, i32 1
  %offset_val = load i64, i64* %offset_ptr
  %mark_ptr = getelementptr %mzone, %mzone* %zone, i32 0, i32 2
  store i64 %offset_val, i64* %mark_ptr
  ret void
}

Looks like it’s something like:

void llvm_zone_mark(mzone* zone) {
  zone->mark = zone->offset;
}

I’ve got no idea what that’s for; maybe I’ll find out later.

The scheme function responsible for generating the closure is impc:ir:compile:make-closureenv.

The next step is allocating memory for the closure, also from the zone:

; malloc closure structure
%clsptr3887 = call i8* @llvm_zone_malloc(%mzone* %zone3886, i64 24)
%closure3888 = bitcast i8* %clsptr3887 to { i8*, i8*, i64 (i8*, i8*, i64)*}*

The environment:

; malloc environment structure
%envptr3889 = call i8* @llvm_zone_malloc(%mzone* %zone3886, i64 8)
%environment3890 = bitcast i8* %envptr3889 to {{i8*, i8*, i64 (i8*, i8*, i64)*}***}*

The address table:

; malloc closure address table
%addytable3891 = call %clsvar* @new_address_table()
%var3892 = bitcast [24 x i8]* @gs259 to i8*
; @gs259 = "add1_adhoc_W2k2NCxpNjRd\00"
; @gs260 = "{i8*, i8*, i64 (i8*, i8*, i64)*}**\00"
%var3893 = bitcast [35 x i8]* @gs260 to i8*
%addytable3894 = call %clsvar* @add_address_table(%mzone* %zone3886, i8* %var3892, i32 0, i8* %var3893, i32 3, %clsvar* %addytable3891)
%address-table3895 = bitcast %clsvar* %addytable3894 to i8*

new_address_table appears to return a null pointer:

define private %clsvar* @new_address_table() nounwind alwaysinline
{
  ret %clsvar* null
}

For reference, clsvar is defined in init.ll:

%clsvar = type {i8*, i32, i8*, %clsvar*}

Which seems to be related to, but not quite the same as:

// This here for Extempore Compiler Runtime.
// This is temporary and needs to replaced with something sensible!
struct closure_address_table
{
    uint64_t id;
    char* name;
    uint32_t offset;
    char* type;
    struct closure_address_table* next;
};

and add_address_table:

EXPORT closure_address_table* add_address_table(llvm_zone_t* zone, char* name, uint32_t offset, char* type, int alloctype, struct closure_address_table* table)
{
    struct closure_address_table* t = NULL;
    if (alloctype == 1) {
        t = reinterpret_cast<closure_address_table*>(malloc(sizeof(struct closure_address_table)));
    } else {
        t = (struct closure_address_table*) extemp::EXTLLVM::llvm_zone_malloc(zone,sizeof(struct closure_address_table));
    }
    t->id = string_hash(name);
    t->name = name;
    t->offset = offset;
    t->type = type;
    t->next = table;
    return t;
}

So the call in the IR looks like:

add_address_table(_impz, "add1_adhoc_W2k2NCxpNjRd", 0, "{i8*, i8*, i64 (i8*, i8*, i64)*}**", 3, nullptr);

Right, it looks like a symbol table. Maybe that’s why the comment says ; let assign value to symbol add1_adhoc_W2k2NCxpNjRd!
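If the address table really is a symbol table, using it would amount to walking the linked list by name. A rough sketch of that, assuming a simplified node without the hashed id; ToyAddressTable, toy_add and toy_find are hypothetical names and not functions from the Extempore runtime:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same fields as closure_address_table, minus the hashed id.
struct ToyAddressTable {
    const char* name;
    std::uint32_t offset;
    const char* type;
    ToyAddressTable* next;
};

// Prepend a node, the way add_address_table does with t->next = table.
ToyAddressTable* toy_add(ToyAddressTable* node, ToyAddressTable* table) {
    assert(node != nullptr);
    node->next = table;
    return node;
}

// Hypothetical lookup: walk the list until a name matches.
const ToyAddressTable* toy_find(const ToyAddressTable* table, const char* name) {
    for (const ToyAddressTable* t = table; t != nullptr; t = t->next) {
        if (std::strcmp(t->name, name) == 0) {
            return t;
        }
    }
    return nullptr;
}
```

Starting from a null table, as new_address_table does, the first toy_add call produces a one-element list, which matches the single add_address_table call in the maker IR.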

; insert table, function and environment into closure struct
%closure.table3898 = getelementptr { i8*, i8*, i64 (i8*, i8*, i64)*}, { i8*, i8*, i64 (i8*, i8*, i64)*}* %closure3888, i32 0, i32 0
store i8* %address-table3895, i8** %closure.table3898
%closure.env3899 = getelementptr { i8*, i8*, i64 (i8*, i8*, i64)*}, { i8*, i8*, i64 (i8*, i8*, i64)*}* %closure3888, i32 0, i32 1
store i8* %envptr3889, i8** %closure.env3899
%closure.func3900 = getelementptr { i8*, i8*, i64 (i8*, i8*, i64)*}, { i8*, i8*, i64 (i8*, i8*, i64)*}* %closure3888, i32 0, i32 2
store i64 (i8*, i8*, i64)* @add1_adhoc_W2k2NCxpNjRd__3881, i64 (i8*, i8*, i64)** %closure.func3900

OK it looks like a closure is a symbol table, an environment, and a function pointer to the worker that we saw earlier.
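In C++ terms the closure struct might look like the sketch below. ToyClosure, call_closure and redefine are hypothetical names. The redefine function shows the pointer-swap idea from Ben’s email, which is not necessarily what Extempore does, since the IR above shows the maker being recompiled instead.

```cpp
#include <cassert>
#include <cstdint>

// Worker signature seen in the IR: i64 (i8* zone, i8* env, i64 x).
using Worker = std::int64_t (*)(void* zone, void* env, std::int64_t x);

// Mirrors { i8*, i8*, i64 (i8*, i8*, i64)* } from the IR.
struct ToyClosure {
    void* address_table;
    void* environment;
    Worker func;
};

// Stand-ins for the two compiled workers (+ x 1) and (+ x 2).
std::int64_t add1_v1(void*, void*, std::int64_t x) { return x + 1; }
std::int64_t add1_v2(void*, void*, std::int64_t x) { return x + 2; }

// Call through the closure, passing its environment to the worker.
std::int64_t call_closure(const ToyClosure& c, std::int64_t x) {
    assert(c.func != nullptr);
    return c.func(nullptr, c.environment, x);
}

// Redefinition by swapping the function pointer in place.
void redefine(ToyClosure& c, Worker w) { c.func = w; }
```

On a typical 64-bit target the three pointers add up to 24 bytes, which lines up with the llvm_zone_malloc call of 24 bytes for the closure structure.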

%closure_size3901 = call i64 @llvm_zone_mark_size(%mzone* %zone3886)
call void @llvm_zone_ptr_set_size(i8* %clsptr3887, i64 %closure_size3901)
%wrapper_ptr3902 = call i8* @llvm_zone_malloc(%mzone* %zone3886, i64 8)
%closure_wrapper3903 = bitcast i8* %wrapper_ptr3902 to { i8*, i8*, i64 (i8*, i8*, i64)*}**
store { i8*, i8*, i64 (i8*, i8*, i64)*}* %closure3888, { i8*, i8*, i64 (i8*, i8*, i64)*}** %closure_wrapper3903

llvm_zone_mark_size looks to do something like:

int64_t llvm_zone_mark_size(mzone* zone) {
  return zone->offset - zone->mark;
}

So it’s the amount of memory we’ve allocated since we called the mark function earlier.
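That suggests what the mark is for, going by how the maker IR uses it: record the offset, do some allocations, then ask how many bytes were handed out since. A minimal sketch of the pattern, where ToyMarkZone, toy_alloc, toy_mark and toy_mark_size are hypothetical stand-ins that only track offsets rather than real memory:

```cpp
#include <cassert>
#include <cstdint>

// Only the bookkeeping fields of a zone; no backing memory.
struct ToyMarkZone {
    std::uint64_t offset = 0;
    std::uint64_t mark = 0;
    std::uint64_t size = 0;
};

// Bump the offset by n and return where the allocation starts.
std::uint64_t toy_alloc(ToyMarkZone& z, std::uint64_t n) {
    assert(z.offset + n <= z.size);
    const std::uint64_t at = z.offset;
    z.offset += n;
    return at;
}

// Remember the current offset.
void toy_mark(ToyMarkZone& z) { z.mark = z.offset; }

// Bytes allocated since the last mark.
std::uint64_t toy_mark_size(const ToyMarkZone& z) { return z.offset - z.mark; }
```

In the IR the result is passed to llvm_zone_ptr_set_size along with the closure pointer, so the mark appears to be a way of measuring everything allocated while building the closure.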

Then it looks like we allocate memory for a pointer to the closure in %closure_wrapper3903. And then we store that in the environment we allocated earlier? And we return the pointer to the closure:

%val3906 = load {i8*, i8*, i64 (i8*, i8*, i64)*}**, {i8*, i8*, i64 (i8*, i8*, i64)*}*** %add1_adhoc_W2k2NCxpNjRdPtr
ret {i8*, i8*, i64 (i8*, i8*, i64)*}** %val3906

So if we’re thinking about handling redefinition of _maker functions the first question is how does it currently work? I’m not totally sure. I do see llvm:erase-function get called with reference to _maker functions in places so let’s have a look at that.

pointer erase_function(scheme* Scheme, pointer Args)
{
    auto func(EXTLLVM2::FindFunctionNamed(string_value(pair_car(Args))));
    if (!func) {
        return Scheme->F;
    }
    func->dropAllReferences();
    func->removeFromParent();
    //func->deleteBody();
    //func->eraseFromParent();
    return Scheme->T;
}

Documentation here for dropAllReferences and here for removeFromParent. I’m not sure there’s still a way to do it like this.

One thing that seems like it could work is deleting the symbol from the "main" JITDylib in the ExecutionSession:

// define addone
{
  FunctionType *FT =
    FunctionType::get(G.Builder->getDoubleTy(),
                      {G.Builder->getDoubleTy()},
                      false);
  Function* F =
    Function::Create(FT, Function::ExternalLinkage, "addone", G.TheModule.get());
  BasicBlock* BB = BasicBlock::Create(*G.TheContext.get(), "entry", F);
  G.Builder->SetInsertPoint(BB);
  auto x = F->getArg(0);
  auto xplusone =
    G.Builder->CreateFAdd(x,
                          ConstantFP::get(G.Builder->getDoubleTy(), APFloat(1.0)));
  G.Builder->CreateRet(xplusone);
  verifyFunction(*F);
  assert(!G.crystallize());
}

// can we delete the symbol?
{
  auto& myguy = G.TheJIT->getMainJITDylib();
  myguy.dump(errs());
  cantFail(myguy.remove({G.TheJIT->getExecutionSession().intern("addone")}), "remove addone");
  myguy.dump(errs());
}
return 0;

// try to define it again
{
  FunctionType *FT =
    FunctionType::get(G.Builder->getDoubleTy(),
                      {G.Builder->getDoubleTy()},
                      false);
  Function* F =
    Function::Create(FT, Function::ExternalLinkage, "addone", G.TheModule.get());

  BasicBlock* BB = BasicBlock::Create(*G.TheContext.get(), "entry", F);
  G.Builder->SetInsertPoint(BB);
  auto x = F->getArg(0);
  auto xplusone =
    G.Builder->CreateFAdd(x,
                          ConstantFP::get(G.Builder->getDoubleTy(), APFloat(2.0)));
  G.Builder->CreateRet(xplusone);
  verifyFunction(*F);
  assert(!G.crystallize());
}

This seems to work but I don’t know if it’s the right thing to do. I might ask around. That program prints this:

JITDylib "<main>" (ES: 0x0000000002ecedb0):
Search order: [ ("<main>", MatchAllSymbols) ]
Symbol table:
    "addone": <not resolved> Never-Searched (Materializer 0x2edf1a0)
JITDylib "<main>" (ES: 0x0000000002ecedb0):
Search order: [ ("<main>", MatchAllSymbols) ]
Symbol table:

I also tried a version of the above program where I run it in a loop forever and then watch to see if it eats up virtual memory, but it does not!

<2020-06-21 Sun>

One thought I had was: if we don’t force the ExecutionSession to materialize the function we’ve defined, then maybe deleting the symbol is trivial and that’s why it works. But I tried calling the function and then deleting and it still seems to work so I’m going to go with this approach for now.

My goal today is to try and get the IR I produced earlier to compile under ORCv2. Off the top of my head the only problem should be mocking out the functions called in the IR. What are they?

  • void* llvm_zone_malloc(mzone*, int64_t)
  • void llvm_zone_mark(mzone*)
  • clsvar* new_address_table()
  • clsvar* add_address_table(mzone*, void*, int32_t, void*, int32_t, clsvar*)
  • int64_t llvm_zone_mark_size(mzone*)
  • void llvm_zone_ptr_set_size(void*, int64_t)

and quite a few more…

Let’s start with everything up to and including the first function in add1.ll:

@gs259 = hidden constant [24 x i8] c"add1_adhoc_W2k2NCxpNjRd\00"
; -------
@gs260 = hidden constant [35 x i8] c"{i8*, i8*, i64 (i8*, i8*, i64)*}**\00"
; -------
define dllexport fastcc i64 @add1_adhoc_W2k2NCxpNjRd__3881(i8* %_impz,i8* %_impenv, i64 %x) nounwind {
entry:
%_impzPtr = alloca i8*
store i8* %_impz, i8** %_impzPtr
%zone3882 = bitcast i8* %_impz to %mzone*
; setup environment
%impenv = bitcast i8* %_impenv to {{i8*, i8*, i64 (i8*, i8*, i64)*}***}*
%add1_adhoc_W2k2NCxpNjRdPtr_ = getelementptr {{i8*, i8*, i64 (i8*, i8*, i64)*}***}, {{i8*, i8*, i64 (i8*, i8*, i64)*}***}* %impenv, i32 0, i32 0
%add1_adhoc_W2k2NCxpNjRdPtr = load {i8*, i8*, i64 (i8*, i8*, i64)*}***, {i8*, i8*, i64 (i8*, i8*, i64)*}**** %add1_adhoc_W2k2NCxpNjRdPtr_

; setup arguments
%xPtr = alloca i64
store i64 %x, i64* %xPtr


%val3883 = load i64, i64* %xPtr
%val3884 = add i64 %val3883, 1
ret i64 %val3884
}

There’s nothing external in here so it should compile fine after we include the requisite mzone type definition:

%mzone = type {i8*, i64, i64, i64, i8*, %mzone*}

aaaand we’re all good!

JITDylib "<main>" (ES: 0x0000000003ae3db0):
Search order: [ ("<main>", MatchAllSymbols) ]
Symbol table:
    "gs259": <not resolved> Never-Searched (Materializer 0x3aeeac0)
    "add1_adhoc_W2k2NCxpNjRd__3881": <not resolved> Never-Searched (Materializer 0x3affc70)
    "gs260": <not resolved> Never-Searched (Materializer 0x3aefb60)

Next up the _maker function. I will stub out these functions:

  • void* llvm_zone_malloc(mzone*, int64_t)
  • void llvm_zone_mark(mzone*)
  • void* llvm_zone_malloc(mzone*, int64_t)
  • void* llvm_zone_malloc(mzone*, int64_t)
  • clsvar* new_address_table()
  • clsvar* add_address_table(mzone*, void*, int32_t, void*, int32_t, clsvar*)
  • int64_t llvm_zone_mark_size(mzone*)
  • void llvm_zone_ptr_set_size(void*, int64_t)
  • void* llvm_zone_malloc(mzone*, int64_t)

I didn’t get all the way but I made a useful start on this:

struct llvm_zone_t {
  void* memory;
  uint64_t offset;
  uint64_t mark;
  uint64_t size;
  // zone_hooks_t* cleanup_hooks;
  void* cleanup_hooks;
  llvm_zone_t* memories;
};

void* llvm_zone_malloc(llvm_zone_t*, uint64_t) {
  static uint8_t idk[4096];
  std::cout << "llvm_zone_malloc!" << std::endl;
  return idk;
}
// ...
// add llvm_zone_malloc to the main symbol table
{
  const DataLayout& DL = G.TheJIT->getDataLayout();
  MangleAndInterner Mangle(G.TheJIT->getExecutionSession(), DL);

  auto syms =
    absoluteSymbols(
      {{Mangle("llvm_zone_malloc"),
        JITEvaluatedSymbol(pointerToJITTargetAddress(&llvm_zone_malloc), {})}});
  cantFail(G.TheJIT->getMainJITDylib().define(syms), "defining llvm_zone_malloc");
}

{
  SMDiagnostic diag;
  G.TheModule = std::move(parseIRFile("ir/five.ll", diag, *G.TheContext));
  G.TheModule->setDataLayout(G.TheJIT->getDataLayout());
  assert(!G.crystallize());
}

// try run add1_adhoc_W2k2NCxpNjRd_maker(void* something_idk)
{
  auto sym = cantFail(G.TheJIT->lookup("add1_adhoc_W2k2NCxpNjRd_maker"), "lookup add1_adhoc_X_maker");
  void (*f)(llvm_zone_t*) = (void (*)(llvm_zone_t*))sym.getAddress();
  f(static_cast<llvm_zone_t*>(nullptr));
}

Progress!

nic@friendo ~/c/k/build> ./Kaleidoscope
llvm_zone_malloc!
JITDylib "<main>" (ES: 0x0000000002767db0):
Search order: [ ("<main>", MatchAllSymbols) ]
Symbol table:
    "add1_adhoc_W2k2NCxpNjRd_maker": 0x00007feceb126000, [Callable] Ready
    "add1_adhoc_W2k2NCxpNjRd__3881": <not resolved> Never-Searched (Materializer 0x2783c70)
    "gs260": <not resolved> Never-Searched (Materializer 0x2773b60)
    "llvm_zone_malloc": 0x0000000000455020, [Data][Hidden] Ready
    "gs261": <not resolved> Never-Searched (Materializer 0x27757f0)
    "gs259": <not resolved> Never-Searched (Materializer 0x2772ac0)

<2020-06-27 Sat>

Last weekend I started trying to compile Extempore’s LLVM IR. I’ve been thinking about it during the week, and I figure I might as well pull the pieces of the runtime I need out of Extempore into a common runtime and compile against that, rather than stubbing out functions and compiling code just to show that I can compile code. The important thing I learned was how to add symbols from the process to the ExecutionSession / main JITDylib, and I don’t think pushing further ahead with the stubs will be useful.

So it seems I’ve reached a point where I need to get back to the other half of this project – the refactoring. First task for the day is to merge in the latest master.

I merged upstream/master with no problems. Of course it will fail to build on macOS/Windows again and presumably GitHub will say “We are currently unable to download the log. Please try again later.” when I want to know why. I don’t really have enough compute to look into the Windows build myself without buying some hardware (new Ryzen chips are tempting though) or finding some way to get a Windows machine somewhere else I can build on. I’ll try Azure.

Do I want “Windows Server 2019 Datacenter” or “Windows 10 Pro”?? Let’s go Windows 10 Pro.

OK so here I am downloading Chrome in a Windows VM so I can install Visual Studio I guess?

There are instructions for building on Windows:

OK, I’m running the build instructions in a “Powershell for VS 2019” terminal emulator. I guess it’s more of a console than a terminal emulator; we’re not pretending it’s a VT100. Looks like we also need Python. Which version? Who knows? It’s LLVM 3.8 so I’m guessing Python 2 is probably what we want.

Probably useful to remember how to set the LLVM directory so I don’t have to build it over and over:

$env:EXT_LLVM_DIR = 'C:\Users\AzureUser\Desktop\llvm'

<2020-06-28 Sun>

Yesterday I gave up on git bisect and tried to see if I could find the problem myself. Turns out it’s the order of the headers. There’s a comment in the Extempore code which says something like “this must be included before anything which pulls in Windows.h” except it’s not obvious what it’s talking about. I still don’t know, but putting the LLVM includes at the top of a couple of files fixed things up so now the builds are green!

That’s great. Today I can start pulling out a runtime! It seems like most of what I need is in EXTLLVM.h / EXTLLVM.cpp. Forgiving the naming I don’t see any reason to move this to a new location so I’m going to start by reading over these files. OK it seems like we have a few data structures that could live in their own namespaces. Memory zones in particular look like they can and should be pulled out.

OK I’ve pulled zones out and put them in their own namespace for the C++ side of things. Of course many of the functions still need C linkage to be called from JIT code. At some point I’m hoping to audit which functions need to exist in which namespaces and have what kind of linkage so this can be a little cleaner.

I have also pulled out closure_address_table into its own namespace. This means I can try and combine everything now! I have LLVM10 + Extempore IR + Extempore runtime and in theory they can be put together. I’ll make a new repo for this experiment. https://github.com/nic-donaldson/extempore-llvm10-experiments.

After dropping everything I have so far into the new repo it seems to compile just fine! Now I need to add some more symbols to the symbol table and get the IR compiling + running. The first hiccup is llvm_zone_mark, which is not written in C++; it’s written in LLVM IR. I’m going to take the C++ version I translated earlier and put that in my code. The same goes for new_address_table. Thankfully it just returns a nullptr so that’s easy.

All right I got my sample IR compiling and running and it doesn’t segfault so that’s a win for this weekend. I am now running LLVM IR produced by extempore on LLVM10’s LLJIT with a runtime factored out from the extempore codebase. This is what it looks like compiling up until the _maker function:

nic@friendo ~/c/e/build> ./Main
llvm_zone_malloc
llvm_zone_mark
llvm_zone_malloc
llvm_zone_malloc
llvm_zone_malloc
llvm_zone_mark_size
llvm_zone_ptr_set_size
llvm_zone_malloc
JITDylib "<main>" (ES: 0x0000000002040df0):
Search order: [ ("<main>", MatchAllSymbols) ]
Symbol table:
    "gs259": 0x00007ff6a409b000, [Data][Hidden] Ready
    "gs261": <not resolved> Never-Searched (Materializer 0x204e830)
    "add1_adhoc_W2k2NCxpNjRd_maker": 0x00007ff6a45f4000, [Callable] Ready
    "gs260": 0x00007ff6a409a000, [Data][Hidden] Ready
    "llvm_zone_mark_size": 0x00000000004580e0, [Data][Hidden] Ready
    "add_address_table": 0x0000000000458400, [Data][Hidden] Ready
    "llvm_zone_ptr_set_size": 0x0000000000458180, [Data][Hidden] Ready
    "add1_adhoc_W2k2NCxpNjRd__3881": 0x00007ff6a4099000, [Callable] Ready
    "llvm_zone_malloc": 0x0000000000457a40, [Data][Hidden] Ready
    "llvm_zone_mark": 0x0000000000458040, [Data][Hidden] Ready
    "new_address_table": 0x0000000000458270, [Data][Hidden] Ready

That comes from this program.

<2020-07-03 Fri>

The plan this weekend is to finish up last weekend’s project of compiling and running a small snippet of IR and then to start auditing LLVM use.

I finished up getting the two snippets from earlier compiling in my test repo. I didn’t bother with the _scheme function.

<2020-07-04 Sat>

Today is audit LLVM use day. The goal is to separate LLVM 3.8 from Extempore so that I can swap between it and version 10. I don’t know if this is possible, but we’ll find out. I’ve started on this work with the refactoring, but my memory of that is fuzzy. Let’s have a look at which files currently touch LLVM.

EXTLLVM.cpp had a few remaining LLVM includes but it seems they weren’t important because I was able to remove them easily enough. SchemeLLVMFFI.cpp is probably the scariest one at the moment. We give some LLVM data structures to Scheme-land and I don’t really know what happens with them. e.g. jitCompileIRString returns a Scheme pointer cast from an llvm::Module*. OK I grepped through the function calls and it looks like that gets passed back to llvm:export-module, which is pointer export_llvmmodule_bitcode(scheme* Scheme, pointer Args). If I’m right and that is the only use of llvm::Module in this interface I should be able to make it opaque and drop the LLVM headers.

<2020-07-05 Sun>

I made a little more progress yesterday on removing LLVM. I think I’ve taken all the easy headers out of SchemeLLVMFFI.cpp and the next few will require some more work.

One thing I’d like to do is make Modules opaque so we can keep the pointers somewhere in the LLVM code rather than have them living in Scheme-land too. I figure an easy way to do this is to put them in a namespaced array and hand out indexes. That way I can give out the index with jitCompileIRString and accept it in export_llvmmodule_bitcode. This is going to make things more complex, but hopefully it’s worth it for the moment. All these changes to isolate LLVM are making the code harder to read but hopefully by putting it all in one place we might get a sense of a cleaner abstraction.
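A minimal sketch of the index idea, with a stand-in Module struct instead of llvm::Module so it compiles on its own. The toy_modules namespace and its add and get functions are hypothetical, not what ended up in EXTLLVM2:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace toy_modules {

// Stand-in for llvm::Module so the sketch has no LLVM dependency.
struct Module {
    std::string ir;
};

// What Scheme-land would hold instead of a raw pointer.
using Handle = std::size_t;

// All modules live here, on the LLVM side of the fence.
inline std::vector<std::unique_ptr<Module>>& storage() {
    static std::vector<std::unique_ptr<Module>> modules;
    return modules;
}

// Take ownership of a module and hand back its index.
inline Handle add(std::unique_ptr<Module> m) {
    assert(m != nullptr);
    storage().push_back(std::move(m));
    return storage().size() - 1;
}

// Turn an index back into a module, or nullptr for a bad index.
inline Module* get(Handle h) {
    return h < storage().size() ? storage()[h].get() : nullptr;
}

}  // namespace toy_modules
```

With something like this, jitCompileIRString would return the Handle as an integer and export_llvmmodule_bitcode would call get on it, so neither signature needs to mention an LLVM type.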

Not a lot to write about – I’m about 1/3rd of the way through shoving all the exposed LLVM types into their own module. I think when I’m done with this though I can start re-refactoring the LLVM related code and port it to work with LLVM 10!

More generally I should be able to manage all the resources we hand out pretty easily. That’s because they’re pointers and they are opaque in Scheme. The only way anyone uses one of those pointers is by giving it back to code that knows about LLVM and doing a conversion to a known type. This should be fine I hope. What we can do is put a shim in between: the functions that know the real types stay behind it, and the functions exposed to Scheme don’t know them. So from the Scheme FFI’s point of view everything will be a void *, but because the pointers are already passed to the right places things will continue to work. Once we do that we can take types out of the interfaces, drop the headers, and be done with isolating LLVM.

<2020-07-15 Wed>

I have finished pulling LLVM out of SchemeLLVMFFI and moved it all into EXTLLVM2. There are a few other places in the code where LLVM is used and I think I might as well attack those too. At first I was thinking the easiest next step would be to just rip out 3.8 and put 10 in and fix up all the build issues, but thinking about it a little more I think the easiest thing to do would be to rip LLVM 3.8 out, add in a mock, and convince extempore to compile with that. Then I could try to link in LLVM 10 without worrying about fixing lots of different things at once, and slowly replace the mock functions with real ones.

After finishing with SchemeLLVMFFI.cpp I decided to instrument the functions and record a trace of the order they get called in. It looks like this:

initLLVM
addGlobalMapping
...
addGlobalMapping
finalize
jitCompile
float_utohexstr
globalSyms
globalSyms
globalDecls2
globalDecls1
IRToBitcode
doTheThing
...

The nice thing about this is it gives me a list of functions I need to implement and an order to implement them in, which will allow me to make consistent progress. If I do it this way, I’ll have a build that always works and runs up to the point that I’m at.
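A trace like that can be collected with very little machinery. The sketch below is one possible way and not the instrumentation used here (the initLLVM snippet further down uses DTRACE_PROBE). The toy_trace namespace and the TOY_TRACE macro are hypothetical, and the three functions are empty stand-ins named after entries in the trace.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace toy_trace {

// Call log, in the order the instrumented functions were entered.
inline std::vector<std::string>& log() {
    static std::vector<std::string> calls;
    return calls;
}

inline void record(const char* fn) {
    assert(fn != nullptr);
    log().push_back(fn);
}

}  // namespace toy_trace

// Put this on the first line of each function to be traced.
#define TOY_TRACE() toy_trace::record(__func__)

// Empty stand-ins for the real functions.
bool initLLVM() { TOY_TRACE(); return true; }
void addGlobalMapping() { TOY_TRACE(); }
void finalize() { TOY_TRACE(); }
```

Printing the log at exit gives the implementation order directly: the first name that still maps to a stub is the next function to port.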

Anyway so the plan right now is to find the last remnants of LLVM. It looks like the only thing left is in Extempore.cpp:

#ifndef _WIN32
#include <unistd.h>
#include <signal.h>
#else
#undef min
#undef max
#include "llvm/Support/Host.h"
#endif

After removing that it was surprisingly easy to strip out LLVM entirely. Can I now point cmake at my LLVM10 build? Almost! Just need to swap to a newer C++ version, aaaand:

bool initLLVM() {
    DTRACE_PROBE(extempore, initLLVM);

    auto TheJIT = llvm::orc::LLJITBuilder().create();

    std::abort();
}

it compiles!

<2020-07-18 Sat>

Slowly working away at getting those functions running in LLVM 10. Promising so far, but now I’m working on the part that does the compiling. I’ve run into this error:

parseAssemblyInto: <string>:29:48: error: base element of getelementptr must be sized
  %offset_ptr = getelementptr inbounds %mzone, %mzone* %zone, i32 0, i32 1

I don’t remember if this is what the LLVM patch was for, but I suspect it might be. I’m going to try and add the type declarations to inline.ll for the moment but I vaguely remember this coming back to bite me when I tried to upgrade to LLVM 5. It works, for now :|.

<2020-07-19 Sun>

And it no longer works :(. The reason is that when you identify a structure type it is not uniqued, i.e. when you write %mzone = type { ... } that’s different to any %mzone declared earlier. I think the fix is that, when generating the declarations that go at the top of the module, I should strip the .xx suffix in e.g. %mzone.18*. It’s a hack but that whole thing is a hack so it’s what we’ll do! OK that works.
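The suffix stripping could be as small as one regex replace over the declaration text. A sketch, assuming the declarations are available as a string; strip_type_suffix is a hypothetical helper and not the code in the branch. It also rewrites any other %name.N identifier, so it is only safe on text where such names are renamed struct types.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: rewrite "%name.N" to "%name" everywhere in the text.
std::string strip_type_suffix(const std::string& ir) {
    // Group 1 is the %identifier; the trailing ".digits" is dropped.
    static const std::regex suffixed(R"((%[A-Za-z_][A-Za-z0-9_]*)\.[0-9]+)");
    return std::regex_replace(ir, suffixed, "$1");
}
```

For example, strip_type_suffix("%mzone.18* %zone") gives "%mzone* %zone", and text without a numeric suffix comes back unchanged.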

The latest issue I’m running into is failing to redefine a function. I thought we were going to be fine with the functions being erased from Scheme, but it turns out not all of them are. So I’ve added in a terrible hack to delete any functions that are already defined.
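The shape of that hack, as a toy model: assuming a symbol table that rejects duplicate definitions, remove each name a new module defines before defining it again. ToySymbolTable is a hypothetical stand-in that does not use the ORC API, and the int body stands in for compiled code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy symbol table that refuses duplicate definitions.
class ToySymbolTable {
public:
    // Returns false if the name is already defined.
    bool define(const std::string& name, int body) {
        return table_.emplace(name, body).second;
    }

    // Returns true if something was removed.
    bool remove(const std::string& name) { return table_.erase(name) > 0; }

    // The hack: drop any existing definition first, then define the new one.
    void define_replacing(const std::vector<std::pair<std::string, int>>& module) {
        for (const auto& fn : module) {
            remove(fn.first);
            const bool ok = define(fn.first, fn.second);
            assert(ok);
            (void)ok;
        }
    }

    // Look up a definition, or nullptr if the name is unknown.
    const int* find(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, int> table_;
};
```

This is the same remove-then-define sequence as the addone experiment earlier, applied to every function in a module rather than to one known name.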

I wrote a few more awful hacks today but it works! It gets through the startup sequence and I can make some sounds!

<2020-07-25 Sat>

So close to getting it working! Some things to do:

  • clean up hacks
  • get tests passing
  • use LLVM properly ?
  • check performance is reasonable

<2020-07-26 Sun>

Mostly been working on some GitHub Actions to speed up CI and also get things working again. Currently trying to build LLVM 10 with GitHub Actions and hopefully cache it for faster builds. Part of this is minimizing the size of the LLVM install directory. All the gains are in not building LLVM tools like llvm-as, but I know we use some of them in Extempore so I’ll have to selectively build them.

<2020-07-31 Fri>

While waiting for GitHub Actions to try my latest LLVM build ideas, I’m trying to continue with the regular build process locally. The next target I’m interested in is aot_base. Currently something is aborting somewhere, but I don’t know what. I turned tracing back on and it tells me we’re running setOptimize, which currently is just a call to std::abort() so that will certainly do it. Rather than make it do something I’m inclined to just let it do nothing for the moment. It looks like it worked. Trying aot_math now and it’s looking good too. I’m honestly so so surprised. Oh the .so files have not materialised. Why is that?

Maybe I’ve misunderstood how the aot targets work. We’ve produced a base.xtm file and it contains the call (llvm:compile-ir (sys:slurp-file "libs/aot-cache/xtmbase.ll")). xtmbase.ll appears to contain all the IR for base.xtm to be compiled in one go. For some reason I thought it would get compiled to a shared library though. What is llvm-as being used for? Making bitcode? I should compare this against vanilla extempore tomorrow.

<2020-08-01 Sat>

OK I tried vanilla extempore and it appears to have the same behaviour. While I’m here though I might as well run through the rest of the build to see all the steps. But it doesn’t build for me without some modifications :(, maybe I’ll come back to this.

I’m trying to get the default make target to run to completion. We’re currently stuck at:

env EXT_LLVM_DIR=/home/nic/code/llvm-10.0.0.install ./extempore --nobase --eval '(impc:aot:compile-xtm-file "libs/core/audio_dsp.xtm")'

Apparently the IR produced for xtmaudiobuffer does not include a definition of the String type. I think String is defined in base.xtm, which is being loaded from the “compiled” aot_cache/base.xtm + aot_cache/xtmbase.ll pair. Does the process of loading either of those files include defining String? From base.xtm:

(register-lib-type xtmbase String <i64,i8*> "xtlang's string type\n\nTuple contains <string_length,pointer_to_data>\n\nShould be created, modified and destroyed with the String_* library functions.\n\nSecond item in tuple is a char* c-style string ")

Currently I have a hack to remember all types that get defined the normal way and include them at the start of each module, so I can try and do something like that for this case too.

Ok so what is the path that xtmbase.ll takes through the C++ code? It ends up as an argument to jitCompileIRString. I think what I will try is, rather than only extracting types when the entire code parameter is a type declaration, to look for all type declarations. The regex I’m using now is (%.*? = type.*$). I dropped the end-of-line anchor, $, and got xtmaudiobuffer to load :).
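A rough sketch of pulling every type declaration out of a blob of IR, assuming std::regex rather than whatever regex library the real code uses. extractTypeDecls is a hypothetical name. The likely reason dropping $ helps: without a multiline flag, $ only matches at the end of the whole input, while . never crosses a newline, so a bare .* already stops at the end of each line.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the extempore source): collect every line of
// the form "%Name = type ..." from a string of LLVM IR.
std::vector<std::string> extractTypeDecls(const std::string& ir) {
    // Same shape as the regex in the notes, minus the trailing '$'.
    // '.' does not match '\n', so each match stays within one line.
    static const std::regex decl(R"(%.*? = type.*)");
    std::vector<std::string> decls;
    for (std::sregex_iterator it(ir.begin(), ir.end(), decl), end; it != end; ++it) {
        decls.push_back(it->str());
    }
    return decls;
}
```

The returned strings can then be remembered alongside the types defined the normal way and prepended to later modules.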

The latest issue:

[100%] Linking CXX shared library libassimp.so
/nix/store/hrkc2sf2883l16d5yq3zg0y339kfw4xv-binutils-2.31.1/bin/ld: ../contrib/zlib/libzlibstatic.a(compress.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC

Hmmm think I might just provide a system zlib instead of trying to figure this out.

All right! I got the default make target to build. Next up will be getting tests to pass.

<2020-08-02 Sun>

Lots of tests are passing now!

nic@friendo ~/c/e/extempore.build> ctest
Test project /home/nic/code/extempore/extempore.build
      Start  1: tests/core/system.xtm
 1/39 Test  #1: tests/core/system.xtm ....................................   Passed   11.95 sec
      Start  2: tests/core/std.xtm
 2/39 Test  #2: tests/core/std.xtm .......................................   Passed   10.06 sec
      Start  3: tests/core/xtlang.xtm
 3/39 Test  #3: tests/core/xtlang.xtm ....................................   Passed  104.26 sec
      Start  4: tests/core/generics.xtm
 4/39 Test  #4: tests/core/generics.xtm ..................................   Passed   59.40 sec
      Start  5: tests/external/fft.xtm
 5/39 Test  #5: tests/external/fft.xtm ...................................   Passed   22.90 sec
      Start  6: examples/core/audio_101.xtm
 6/39 Test  #6: examples/core/audio_101.xtm ..............................***Timeout 300.09 sec
      Start  7: examples/core/fmsynth.xtm
 7/39 Test  #7: examples/core/fmsynth.xtm ................................***Timeout 300.10 sec
      Start  8: examples/core/mtaudio.xtm

Time to figure out why those tests are timing out though.

I looked into it and they also time out on the main branch soooo, it’s expected? Might look into that another day.

<2020-08-08 Sat>

OK. What now? The first thing that comes to mind is getting this all building on more platforms. I suspect I’ve broken Windows and macOS support in unexpected ways. Let’s start with mac because it’s probably easier to get working and I have an easier time getting access to a Mac currently. I can get it to build on a mac no problem, but it fails on GitHub Actions and I can’t figure out how to get it to give me logs of the failure :(. This issue seems to match what I’m seeing. :0 I clicked “re-run jobs” and now it’s giving me logs from the mac LLVM build.

It seems like it was taking too long and that’s why it failed? nproc is not available on the GitHub mac runner so I replaced -j$(nproc) with -j2 and the LLVM build step started passing.

<2020-08-09 Sun>

Today I’m trying to figure out why the tests are segfaulting on mac. It’s easy enough to reproduce: run (bind-val a_cstring i8* "hello"), and then confirm by writing an XTLang function that a_cstring is 0x0.

I’m writing this after fixing the issue. I noticed while testing out various functions on a mac that binding a function the second time and calling it from scheme seemed to keep calling the original function. So redefinition was broken! I figured this could be causing the above issue because extempore sets global values by compiling and calling an ad-hoc function that sets the value. What if it was continuing to call an older function with the same name? The value wouldn’t get set. I checked the LLVM IR and sure enough a_cstring is initialized to zero which made me think the initializer code wasn’t being run.

About a month ago I wrote some experiments with LLVM10 to see how I would get extempore’s LLVM IR running. One of the things I did was test out removing the symbols of compiled functions from the JITted code’s symbol table. This worked fine on Linux and I assumed it would work on mac but turns out it doesn’t. I compiled that program on a mac today and it complained it couldn’t find the symbols I was asking it to remove. After staring at it for a bit I realized all the symbols on mac are prefixed with an underscore! Add the underscore back in and it works fine. Turns out what I needed to do was use LLVM ORC’s name mangler to get the right symbol. After adding that in it worked on both platforms.
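To make the underscore thing concrete, here is a toy stand-in for what the mangler does. This is not LLVM code: in LLVM the real work is done by llvm::orc::MangleAndInterner using the target’s DataLayout, whose global prefix is '_' for Mach-O and empty for ELF. The function name and signature below are made up for illustration.

```cpp
#include <string>

// Hypothetical illustration (not LLVM's API): prepend the platform's global
// symbol prefix, if it has one, to get the name the linker layer expects.
// Pass '_' for Mach-O (macOS) and '\0' for ELF (Linux).
std::string mangleSymbol(const std::string& name, char globalPrefix) {
    if (globalPrefix == '\0') {
        return name; // no prefix on this platform
    }
    return std::string(1, globalPrefix) + name;
}
```

So a lookup or removal of a_cstring has to ask for _a_cstring on mac and plain a_cstring on Linux, which is why hard-coding either form only works on one platform.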

I’ve updated extempore to use mangling as well so now I’m waiting for a build to see if tests are passing on mac. They are!

On a related note, one thing I’ve been waiting for is removable code which seemed to be coming in LLVM11 but apparently didn’t make it in time for the release. http://lists.llvm.org/pipermail/llvm-dev/2020-July/143532.html :(

Speaking of removable code: Unfortunately I was not able to get support for that ready in time for LLVM 11. I am continuing to work on it though, and I’ll provide more updates on that work as the patch takes shape.

Windows builds next?

<2020-08-15 Sat>

All right let’s try and turn windows builds on and see what happens. I think I’m out of Azure credit, but I can’t really tell with its byzantine interface.

It feels like computing has taken a step back when I’m tweaking a YAML file to try and get something running in powershell on a computer I can’t see or touch. There’s no interactivity. When it fails I have to wait for the entire build again to see if my fixes work. We’ve taken shells and made them harder to use. :(

<2020-09-19 Sat>

I’ve been distracted learning Rust for the last month while not having an easy way to fix up Windows builds. But now I have a Windows 10 VM running and I’m going through the instructions I wrote a few months ago to try a build out and figure out what’s wrong with the GitHub Actions workflow file.

I think I have everything installed now. Instead of figuring out what to do by hand I’ll follow the steps in the workflow file!

Of course there is a bug in LLVM 10.0.1 that means it can’t be compiled by the latest release of VS2019 :(. I guess I’ll just leave this alone until LLVM 11 is ready.

<2020-11-21 Sat>

been a while since looking at this. I should finish cleaning up the code now that it works on Linux + OSX.

What state are the branches in? I have llvm-refactor-cmake. I don’t remember why I made it. I’ll just merge it in? It looks like it was the LLVMification. That’s fine, I’ll merge it in.

LLVM 11 is out and hopefully that fixes the VS compile problem, but for now I’m ignoring that and continuing with LLVM 10. I think the most pressing issue in the code is the proliferation of maps with useless names in the LLVM layer so I should dig into those.

<2020-11-27 Fri>

Bumping to LLVM 11 today, seems to be going pretty smoothly. I’ve also started on the slow process of working on an aarch64 release.

<2021-01-16 Sat>

OK back on it in 2021. Currently trying to merge latest master in. Does it still build? extempore still builds!

Finally got Windows building extempore, but it’s failing on the AOT stuff. I think I might leave that for now though. I rented an AWS instance to figure out what was wrong and I can leave it in a ready state for debugging later.

<2021-01-18 Mon>

<2021-01-23 Sat>

I plugged my laptop in to charge but forgot to check the other end of the cable was plugged into the wall so I forget exactly where I was up to on the second PR. But I have all the commits and I can go back over them.

Looks like I’m moving zone-related code into src/EXTZones.cpp.

<2021-01-31 Sun>

My zones branch seems broken and I have no idea why, so I’m going to do it again but with more commits so I can find where it breaks.

D:\a\extempore\extempore\src\EXTZones.cpp(73): error C2280: 'extemp::EXTMutex::EXTMutex(const extemp::EXTMutex &)': attempting to reference a deleted function [D:\a\extempore\extempore\build\extempore.vcxproj]
  D:\a\extempore\extempore\include\EXTMutex.h(88): note: compiler has generated 'extemp::EXTMutex::EXTMutex' here
  D:\a\extempore\extempore\include\EXTMutex.h(88): note: 'extemp::EXTMutex::EXTMutex(const extemp::EXTMutex &)': function was implicitly deleted because a data member invokes a deleted or inaccessible function 'std::recursive_mutex::recursive_mutex(const std::recursive_mutex &)'
  C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.16.27023\include\mutex(109): note: 'std::recursive_mutex::recursive_mutex(const std::recursive_mutex &)': function was explicitly deleted

I’m not great at the finer points of C++ but let’s try and think about this error. Here’s the code in question:

static extemp::EXTMutex alloc_mutex = []() {
    extemp::EXTMutex m("alloc mutex");
    m.init();
    return m;
}();
extemp::EXTMutex::ScopedLock lock(alloc_mutex);

'extemp::EXTMutex::EXTMutex(const extemp::EXTMutex &)': attempting to reference a deleted function

That looks like a constructor to me. We are trying to construct an EXTMutex so that sounds right. Are we trying to construct it with an EXTMutex &? Is that a special member function or a vanilla bonus constructor?

OK my little bit of internet searching brings me to “copy initialization” which is what I think we’re trying to do here. It says we’re trying to invoke a deleted or inaccessible function std::recursive_mutex::recursive_mutex(const std::recursive_mutex &). So I’m guessing we’re trying to copy-initialize the member variables also but this one doesn’t support that? Going to look up std::recursive_mutex now. Rather than trying to figure this out can I just wrap it in a std::unique_ptr? I’m sure we can copy-initialize that here lol.

What I’m trying:

static std::unique_ptr<extemp::EXTMutex> alloc_mutex = []() {
    auto m = std::make_unique<extemp::EXTMutex>("alloc mutex");
    m->init();
    return m;
}();
extemp::EXTMutex::ScopedLock lock(*alloc_mutex);

Now to commit that and wait a long time to find out if it works :|. Oh, this is C++11, I thought it was a newer version.

I ended up allocating the mutex with new because make_unique isn’t available in C++11. It seems to be working though!
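Roughly what that looks like, as a self-contained sketch. NamedMutex is a made-up stand-in for extemp::EXTMutex, and wrapping the new in a std::unique_ptr is just one way to do it. The original lambda failed because returning a named local by value needs an accessible copy or move constructor, and a class holding a std::recursive_mutex has neither, since std::recursive_mutex can be neither copied nor moved. A unique_ptr to it can be moved fine.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for extemp::EXTMutex: the std::recursive_mutex
// member makes this class non-copyable and non-movable.
class NamedMutex {
public:
    explicit NamedMutex(const char* name) : m_name(name) {}
    void lock() { m_mutex.lock(); }
    void unlock() { m_mutex.unlock(); }
    const char* name() const { return m_name; }

private:
    const char* m_name;
    std::recursive_mutex m_mutex;
};

// C++11 has no std::make_unique, so build the unique_ptr from a plain new.
// The function-local static is initialized once, thread-safely, on first use.
NamedMutex& allocMutex() {
    static std::unique_ptr<NamedMutex> mutex(new NamedMutex("alloc mutex"));
    return *mutex;
}

// Lock the same mutex twice on one thread; fine because it is recursive.
int lockTwice() {
    std::lock_guard<NamedMutex> outer(allocMutex());
    std::lock_guard<NamedMutex> inner(allocMutex());
    return 2; // number of locks held at this point
}
```

The mutex is never copied here: callers only ever see a reference to the single heap-allocated instance.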

<2021-02-07 Sun>

Final commit is to add the EXTZones namespace. PR https://github.com/digego/extempore/pull/398.

<2021-04-10 Sat>

Haven’t touched this code in some time! But now I want to merge in the changes which build LLVM separately from extempore to speed up CI.

I’ve written this for LLVM 11 in my branch, but I need to get it to work for LLVM 3.8 in the main branch. The first question is, how do you build LLVM 3.8 as used for Extempore? I guess I’ll check my shell history:

Maybe one of these?

tar -xJf ~/Downloads/llvm-3.8.0.src.tar.xz
cd llvm-3.8.0.src/
patch -p0 < ../extempore/extras/extempore-llvm-3.8.0.patch
env CXX=g++-5 CC=gcc-5 cmake -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_TERMINFO=OFF -DLLVM_ENABLE_ZLIB=OFF -DLLVM_INCLUDE_UTILS=OFF -DLLVM_BUILD_RUNTIME=OFF -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_INCLUDE_TESTS=OFF -DLLVM_INCLUDE_GO_TESTS=OFF -DLLVM_INCLUDE_DOCS=OFF -DCMAKE_INSTALL_PREFIX=../../installs/Release3.8/ ../../llvm-3.8.0.src/
mkdir llvm-3.8.0.build
mkdir llvm-3.8.0.install
cd ../llvm-3.8.0.build/
env CXX=g++-5 CC=gcc-5 cmake -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_TERMINFO=OFF -DLLVM_ENABLE_ZLIB=OFF -DLLVM_INCLUDE_UTILS=OFF -DLLVM_BUILD_RUNTIME=OFF -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_INCLUDE_TESTS=OFF -DLLVM_INCLUDE_GO_TESTS=OFF -DLLVM_INCLUDE_DOCS=OFF -DCMAKE_INSTALL_PREFIX=../llvm-3.8.0.install ../llvm-3.8.0.src
env CXX=/nix/store/q12jswrg89qf6x0pf2sxfqd0gq8i6xsj-gcc-wrapper-5.5.0/bin/g++ CC=/nix/store/q12jswrg89qf6x0pf2sxfqd0gq8i6xsj-gcc-wrapper-5.5.0/bin/cc cmake -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_TERMINFO=OFF -DLLVM_ENABLE_ZLIB=OFF -DLLVM_INCLUDE_UTILS=OFF -DLLVM_BUILD_RUNTIME=OFF -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_INCLUDE_TESTS=OFF -DLLVM_INCLUDE_GO_TESTS=OFF -DLLVM_INCLUDE_DOCS=OFF -DCMAKE_INSTALL_PREFIX=../llvm-3.8.0.install ../llvm-3.8.0.src
env CXX=/nix/store/q12jswrg89qf6x0pf2sxfqd0gq8i6xsj-gcc-wrapper-5.5.0/bin/g++ CC=/nix/store/q12jswrg89qf6x0pf2sxfqd0gq8i6xsj-gcc-wrapper-5.5.0/bin/cc EXT_LLVM_DIR=/home/nic/code/llvm-3.8.0.install cmake -DCMAKE_BUILD_TYPE=ReleaseWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DCMAKE_INSTALL_PREFIX=../extempore.install ../extempore

Certainly the patch step and one of those build commands might be helpful. I’m wondering if we could use the existing cmakelists file to build LLVM and cache that result? … let’s not.

hmm the only thing I think will be a problem here is running patch on the windows machine. I don’t have a way to test that out :(. Looks like the path might be C:\msys64\usr\bin\patch.exe.

<2021-04-11 Sun>

Still working on building and caching LLVM 3.8 for CI and I think it’s going surprisingly well. Most of the issues are the same ones I had the first time so thankfully I think I’ll get through this without needing a windows machine to test on.

Looks like the extempore builds aren’t seeing the LLVM headers? The relevant section in CMakeLists.txt:

target_include_directories(extempore
  PRIVATE
  src/pcre
  ${CMAKE_BINARY_DIR}/portaudio/include # installed by ExternalProject
  ${EXT_LLVM_DIR}/include)

I feel like EXT_LLVM_DIR should exist… let’s invalidate the LLVM cache and see if that works.

That fixed it, but patched LLVM 3.8 is also not building on Windows 2019 even though it does on the main repo? Maybe the prepatched version has some extra changes? Maybe there are some compiler settings from the parent CMakeLists.txt?

ok the prepatched version seems to be building!! very exciting.

<2021-04-12 Mon>

Looks like we’ve got the basic build working, but now I want to improve things a little more. Right now we’re building extempore and the aot-compile targets in one step, let’s break that into two steps so we know which one failed. Then I need to update the release-binary.yml file, because I’ve probably broken it.

Got it all merged! The windows-2016 build turned out to be flaky, so Ben turned that back down to -j1.

<2021-04-15 Thu>

Looking for some more work to pull out of llvm-refactor. The diff on GitHub doesn’t seem accurate :( so I guess I’ll use emacs or something. I guess the next thing to work on is the LLVM isolation? Maybe I should read the commit log!

I want the commits from roughly e702a280c5392a4c7cb45bf3ea1f8c2f32b0412f - "cut llvm includes from EXTLLVM" to the step before 4b6d292721dd7c8e99d83ba785124e26b0bca779 - "extempore but without llvm". I think that’s a good goal.

$ git rev-list e702a280c5392a4c7cb45bf3ea1f8c2f32b0412f..4b6d292721dd7c8e99d83ba785124e26b0bca779
4b6d292721dd7c8e99d83ba785124e26b0bca779
bbb8a499139740965177f9336f59f700b9812510
e53ff15ab5301bf0c0d0f01ca8ca5cbb387d237b
51fbbbc44e6463a7435233a4021435f9f004cfaa
5524a6d990d27bd023f33fac3302d7052d7c6bf0
1ad38429891ef1ba7e56752e787e801aed5f0147
f1bd6ab6b00488567cc6ce8a71178ae1f5007f70
2645290f8214c05c0eac61f3dd24c821cde81b22
871be6105804fa1bd0e6db712b5f34e883f60b41
f1232e153aa280eb76ee25cd174fa81e77543519
b55d1c27e33054f1df405563dad74bf1ab797f98
e05ea88b27358627ff87f8c5bf84d49f982c0b02
57eaf9c8621d76ada77f46fa0b7283599639047f
05815761e427584f29eac32f501c26e642dafaad
3d7b6fbf119a7120021c23b647d39b00db616258
f8bea16d8c2b2d84cb5d534d1846df1373e9153b
5b13bb7730bcdd4c1800d91c142e6203e69f593e
b86a775f1d039de65092bad878024731bd1051c4
e4af034f64d3a11dca4d0adf01a3a25bc9d1e5f4
5f29b7a236a53823c2c3ab09bf9ba8212f1a334a
57e5568bbf4d56e4ab9d3319dd27dd2de5fcb0ac
871416488a50abc2afcd1c0ca54f32ab435522e3
fd604966fe8aa8589aa5cac0bd5e5a31e7ee7d11
5c2fbd66607be9daf87c72b4df8bc8a448cdf93a
0efd951dd08dbe4c2d78be42a49d2c1507bbcc13
ec4ee7c89f1ad98d1fd3f45fa71255fed5f13bc0
bb802808de78eb81c70b1d065df87d56bade9499
b4cbb7e75728ee27ab37f8d98e51a2dc23f67b9a
7c4bf7d39fe7c57b6eabf0ecb66f60ea315411a2
f5a137a35e39881db8acab2eec3583bc791c3363
b49b59917c6a0eebe46ab63f505153d8b70703f0
c33c8366ccfa1af4f6a1878a5b6862a488ecb32d
241d35ccc4ece150346f05aafb8eb15b78719215
846879ab23deedade796158582506e18978e3d03
0fe14784d8bc2a963d219bd9668808c025a3edbe

I guess that’s the list and I should go through it one commit at a time! We’ll call it… isolate-llvm. The changes actually start way earlier but it’s the end goal that matters!

I think the other thing I need to think about is the end goal for each of these files so I know I’m working in the right direction.

  • EXTLLVM – yeah idk, misc dumping ground
  • SchemeLLVMFFI – should contain no LLVM code, only pointer someFunction(scheme *Scheme, pointer Args) type functions.
  • EXTLLVM2 – should contain LLVM code.

I could also use git log -S'thing' -p to help me find the changes I want.