This is the third part of my brain dump about my work on Second Life. One of the main projects that I worked on during my time at Linden Lab was implementing User Scripts on Mono.
The LSL2 VM is Second Life’s original VM for executing user code. Internally, Scripts were actor-like; they had a queue for messages and could communicate with other Scripts via messages. However, they could also generate side effects on the world via library calls.
Each Sim was associated with a particular region and the objects that contained the Scripts could roam from one region to another. The VM provided transparent migration of Scripts as they moved between Sims.
Migration is described as transparent if the program can be written in the same way as non-mobile code. When the code is migrated it should resume in exactly the same state as before. LSL Scripts did not require any knowledge of where they were being executed.
To achieve this, the full state of the Script is stored. This includes the program code, heap objects, the message queue, the stack and the program counter. As I said in my previous post, the LSL2 VM achieved this by putting all this state into a 16K block of memory. Storing everything in this block made migration simple because there was no serialisation step. When a Script is suspended, its block can just be shipped around and interpreted by any Sim.
Under Mono we wanted to provide the same quality of transparent migration. CLR (.NET) code is portable (write once, run anywhere) and objects can be serialised, but the stack and program counter cannot be accessed from user code.
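To make the "no serialisation step" point concrete, here is a toy sketch in Python. It is not the LSL2 layout: the header format, the opcodes (PUSH, ADD, HALT) and the helper names are all made up for illustration. The only idea it borrows from the text is that the code, the stack and the program counter all live in one 16K block, so migrating means copying bytes.

```python
import struct

BLOCK_SIZE = 16 * 1024            # the 16K block mentioned above
HEADER = struct.Struct("<HH")     # hypothetical header: program counter, stack pointer
CODE_START = HEADER.size          # code is stored right after the header
PUSH, ADD, HALT = 1, 2, 3         # hypothetical opcodes


def new_block(code: bytes) -> bytearray:
    """Build a block whose stack grows down from the end of the block."""
    block = bytearray(BLOCK_SIZE)
    block[CODE_START:CODE_START + len(code)] = code
    HEADER.pack_into(block, 0, CODE_START, BLOCK_SIZE)
    return block


def step(block: bytearray) -> bool:
    """Execute one instruction; all state is read from and written to the block."""
    pc, sp = HEADER.unpack_from(block, 0)
    op = block[pc]
    if op == HALT:
        return False
    if op == PUSH:
        sp -= 1
        block[sp] = block[pc + 1]
        pc += 2
    elif op == ADD:
        block[sp + 1] = (block[sp] + block[sp + 1]) & 0xFF
        sp += 1
        pc += 1
    else:
        raise ValueError(f"unknown opcode {op}")
    HEADER.pack_into(block, 0, pc, sp)
    return True


def top_of_stack(block: bytearray) -> int:
    _, sp = HEADER.unpack_from(block, 0)
    return block[sp]


def demo_migration() -> int:
    program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, ADD, HALT])
    on_sim_a = new_block(program)
    step(on_sim_a)                    # run part of the program on one "Sim"
    step(on_sim_a)
    shipped = bytes(on_sim_a)         # migration is just a copy of the block
    on_sim_b = bytearray(shipped)     # the other "Sim" picks up where it stopped
    while step(on_sim_b):
        pass
    return top_of_stack(on_sim_b)


print(demo_migration())  # prints 9
```

Because the interpreter keeps nothing outside the block, the receiving side needs no knowledge of how far the program had got; the program counter and stack pointer travel with the data.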
The CLR stack is pretty conventional; it is composed of frames. A new frame is created each time a method is called and destroyed when a method exits. Each frame holds local variables and a stack for operands, arguments and return values.
To be able to migrate a Script, we need to suspend it and capture its state. Scripts are written by users and are not trusted. They cannot be relied on to yield cooperatively. You can suspend CLR threads, but you have no way of knowing exactly what a thread is doing when you suspend it, and doing so can cause deadlocks.
The Script’s code is stored in an immutable asset. This is loaded by the Sim by fetching the asset via HTTP on demand. Multiple copies of the same Script will reuse the same program code.
Classes cannot be unloaded in the CLR; however, AppDomains can be destroyed. A Script was defined as live if it is inside an object rooted in the region or on an Avatar. When a new Script is instantiated, it is placed in the nursery AppDomain. If more than half the Scripts in the nursery AppDomain are dead, the live Scripts are moved to the long-lived domain. The long-lived AppDomain has a similar scheme, except that its live Scripts are moved to a new domain that replaces it. Migration was used to move Scripts between AppDomains.
The Script’s heap state is either referred to by a frame or the Script object itself contains a reference. The queue of messages is just an object on the Script base class. These objects are serialised and then restored at the destination.
The stack serialisation was achieved by modifying the program’s byte code to insert blocks that capture and restore the current thread’s state. Doing it at the code level meant that we did not have to modify Mono and the code could also be ported to a standard Windows .NET VM.
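The nursery policy can be sketched in a few lines of Python. This is only a model of the bookkeeping, not of real AppDomains: the `Script`, `Domain` and `ScriptHost` classes and the `collect` method are hypothetical names, and replacing the nursery with a fresh domain after its survivors leave is my assumption about how the old domain gets destroyed.

```python
class Script:
    def __init__(self, name):
        self.name = name
        self.live = True   # live: in an object rooted in the region or on an Avatar


class Domain:
    """Stand-in for an AppDomain: just a labelled bag of Scripts."""

    def __init__(self, label, scripts=None):
        self.label = label
        self.scripts = list(scripts or [])

    def mostly_dead(self):
        dead = sum(1 for s in self.scripts if not s.live)
        return dead * 2 > len(self.scripts)   # strictly more than half are dead


class ScriptHost:
    def __init__(self):
        self.nursery = Domain("nursery")
        self.long_lived = Domain("long-lived")

    def instantiate(self, name):
        script = Script(name)
        self.nursery.scripts.append(script)   # new Scripts always start in the nursery
        return script

    def collect(self):
        if self.nursery.mostly_dead():
            survivors = [s for s in self.nursery.scripts if s.live]
            self.long_lived.scripts.extend(survivors)   # stands in for migration
            self.nursery = Domain("nursery")            # old domain can now be destroyed
        if self.long_lived.mostly_dead():
            survivors = [s for s in self.long_lived.scripts if s.live]
            self.long_lived = Domain("long-lived", survivors)   # replaced by a new domain
```

In the real system the "move" is a full Script migration, which is why the same save and restore machinery serves both region crossings and reclaiming memory.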
The assembly rewriter was implemented using RAIL, which is similar to Mono Cecil (which was released part way through the project).
Each method on the Script object is modified. At the start of each method, a block of code is injected to do the restore. It gets the saved frame and restores the local variables and the state of the stack. It then jumps to the part of the method that was previously executing. At the end of the method, the rewriter inserts code to do the save, which populates a stack frame. At various points within the method, it injects yield points where the program can potentially be suspended.
Here is a pseudo-code example of the kind of thing the assembly re-writing did.
Original:
void method(int a)
{
    var result = 2;
    while (result < a)
    {
        result <<= 1;
    }
}
Re-written:
void method(int a)
{
    if (IsRestoring)
    {
        var frame = PopFrame();
        switch (frame.pc)
        {
            case 0:
                // instructions to restore locals and stack frame
                goto PC0;
            case 1:
                // instructions to restore locals and stack frame
                goto PC1;
            // ...
        }
    }
    // snip
    var result = 2;
    while (result < a)
    {
        result <<= 1;
        // Backwards jump is one of the places to insert a yield
        if (IsYieldDue)
        {
            frame.pc = 1;
            frame.locals = ...
            frame.stack = ...
            goto Yield;
        }
    PC1:
    }
Yield:
    if (IsSaving)
    {
        PushFrame(frame);
    }
    // return normally
}
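For anyone who wants to play with the idea, here is a small runnable approximation in Python. It is a hand-written analogue, not output from the rewriter: Python has no goto, so the single yield point at the end of the loop body is resumed by re-entering the loop at its condition check, which is where execution would continue anyway. `ScriptState`, `yield_due`, the `budget` counter and `run_with_migration` are hypothetical names, and the method returns `result` (the original is void) purely so the outcome can be observed.

```python
import pickle


class ScriptState:
    """Hypothetical per-Script state: saved frames plus the yield flags."""

    def __init__(self, budget):
        self.frames = []         # saved frames, innermost last
        self.restoring = False
        self.budget = budget     # backward jumps allowed before a yield is due

    def yield_due(self):
        if self.budget == 0:
            return True
        self.budget -= 1
        return False


def method(state, a=None):
    if state.restoring:
        # Injected restore block: pop the saved frame and reload the locals.
        frame = state.frames.pop()
        a = frame["locals"]["a"]
        result = frame["locals"]["result"]
        state.restoring = False
    else:
        result = 2
    while result < a:
        result <<= 1
        # The backward jump is a yield point.
        if state.yield_due():
            # Injected save block: record where we were and what we held.
            state.frames.append({"pc": 1, "locals": {"a": a, "result": result}})
            return None          # suspended; the caller decides what happens next
    return result


def run_with_migration(a, budget):
    state = ScriptState(budget)
    out = method(state, a)
    hops = 0
    while out is None:
        wire = pickle.dumps(state.frames)     # serialise only the saved frames
        state = ScriptState(budget)           # fresh state at the "destination"
        state.frames = pickle.loads(wire)
        state.restoring = True
        out = method(state)                   # arguments come from the saved frame
        hops += 1
    return out, hops


print(run_with_migration(100, 2))  # prints (128, 2)
```

The saved frame is plain data, so it can be serialised with ordinary object serialisation even though the runtime never exposes its real stack.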
Despite all the overhead injected into the methods, Mono still performed several orders of magnitude faster than the original VM.
We would have liked to allow other languages, like C#; however, supporting all of the instructions in the CIL was a challenge; there are hundreds. We controlled the LSL compiler and knew it only used a subset of the CIL instruction set.
The techniques I’ve described for code re-writing were influenced by some Java implementations of mobile agents. See Brakes and JavaGoX (I’m unable to find the original material, but I think this is the original paper).
For those who are really keen, you can also find the actual test suite I wrote for the LSL language here: LSL Language Tests (I was Scouse Linden, for those who look at the page history). We had to break it up into two parts because of the arbitrary 16K limit on program size of LSL; the basic tests were too big! If you look at the tests, you can see some of the quirks of the LSL language, from the assignment of variables in while loops, to rendering of negative zero floating point numbers.
Also, here is the source to the LSL compiler, but sadly it looks like they spelled my name wrong when they migrated to Git.
I’m not intending to write any more on the topic of Scripting in Second Life unless there is any specific interest in other areas. I hope you’ve found it interesting or useful.