Building the Ultimate Mobile Game Engine: Starlite

TL;DR: The making of our game engine, Starlite, why we did it, and the impressive results.

I’m Črt Kristl and I’m a Principal Software Engineer at Outfit7. I’ve been with the company since 2016. Since then I’ve implemented several systems to facilitate the development of games and art.

In this blog post I’ll dive into Starlite, our proprietary game engine and editor. I’ll tell you why we made it and how it affected game development, as well as other areas of game production. This engine was written in C++ using many open source libraries, while the editor was written in C#.

Example of the editor and engine running: In the center we’re capturing the engine’s output to a 2D texture within the editor with zero delay.

A Brief History

It all started a few years ago, when we at Outfit7 were developing My Talking Hank. We weren’t satisfied with the solutions Unity offered at the time (version 5), since many things didn’t work and required workarounds, leading to increased development time and frustration among developers.

We decided to tackle the problem with a solution of our own. Within a few months, we created a prototype of the engine. (We already had experience working on engines, so we knew what we were getting into.)

Using our prototype, we made a My Talking Hank proof of concept, demonstrating that our engine was significantly faster than other commercially-available engines. And so, Starlite was born.

Let’s take a look at how we made the engine, the challenges we encountered and the results in comparison with other game engines.

The Engine

The Starlite engine is written in C++ and can be developed on Windows, macOS, and Linux. Its runtime supports iOS, Android, and WebGL, as well as those three desktop platforms.

Architecture

Starlite uses two concepts to handle all the high-level logic: managers and singletons. Managers, such as ResourceManager, are determined during the initialization process and are ordered (the order can be changed according to the project). They can also be overridden to implement custom logic. Meanwhile, singletons, such as Log, are static. They have no inheritance and have predefined initialization points. Singletons simplify access points and make code very readable. This really sped up the onboarding of new developers and shortened feature implementation times.

This architecture proved to be easy to understand and implement, so it was something we used for projects built with Starlite as well.
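To make the split between managers and singletons more concrete, here is a minimal sketch. It is not Starlite's actual API: the Order() method, the CachingResourceManager and SceneManager classes, and the InitializeManagers function are invented for illustration, and the Log singleton here initializes lazily rather than at a predefined point.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Singleton-style service: static access, no inheritance.
class Log {
public:
    static void Info(const std::string& msg) { Lines().push_back(msg); }
    static std::vector<std::string>& Lines() {
        static std::vector<std::string> lines;  // created on first use (simplification)
        return lines;
    }
};

// Base class for managers; a project may subclass one to add custom logic.
class Manager {
public:
    virtual ~Manager() = default;
    virtual int Order() const = 0;  // lower values are initialized first
    virtual void Initialize() = 0;
};

class ResourceManager : public Manager {
public:
    int Order() const override { return 10; }
    void Initialize() override { Log::Info("ResourceManager ready"); }
};

// Hypothetical project-side override of an engine manager.
class CachingResourceManager : public ResourceManager {
public:
    void Initialize() override { Log::Info("CachingResourceManager ready"); }
};

class SceneManager : public Manager {
public:
    int Order() const override { return 20; }
    void Initialize() override { Log::Info("SceneManager ready"); }
};

// Sorts the registered managers by their order, then initializes each one.
inline void InitializeManagers(std::vector<std::unique_ptr<Manager>>& managers) {
    for (const auto& m : managers) assert(m != nullptr);
    std::stable_sort(managers.begin(), managers.end(),
                     [](const std::unique_ptr<Manager>& a,
                        const std::unique_ptr<Manager>& b) {
                         return a->Order() < b->Order();
                     });
    for (auto& m : managers) m->Initialize();
}
```

In this sketch a project would register a CachingResourceManager instead of the stock ResourceManager, and the initialization order stays the same regardless of the order of registration.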

Starlite uses a number of open-source libraries, including BGFX for rendering. This reduced development time significantly, since the libraries provide a lot of functionality that would normally take a long time to get working correctly (e.g. interfacing with different rendering APIs such as OpenGL and DirectX).

Our goal was to build a minimal API that would expose the functionality of these libraries, as well as our own, and to teach game developers how to use it. We didn’t want to bloat the API to the point where developers could easily misuse it, since a bloated API often promotes bad coding practices and is harder to maintain. We had some pushback on that front, but in the end it proved to be a wise decision: developers produced better code and came to understand why the API was designed the way it was.

Starlite also has a prefab system (including prefabs within prefabs), like many commercial engines. It also uses a component-based architecture similar to Unity’s.

One of the best parts of Starlite is its asset and resource system. Compared to other publicly-available engines, our solution loads and instantiates assets ten or more times faster. The main component is our in-place build system.

In-place Build

Most engines choose to implement their build system through some sort of reflection. Then, they transfer the variables and data through the reflection during runtime, since this makes it easier to maintain and edit on the fly. But this process takes time. Instead, we opted for an in-place system, which meant that we would build the data once for each platform and then just memory copy during runtime, fixing the pointers within that memory. This is an extremely fast operation, but it requires careful engine design.

We had to create our own data structure containers, such as List, Array and HashMap, among others, to support the tracking of data allocation and reallocation during the build. This was necessary in order to keep track of every pointer created (the new operator was overloaded as well) so we could identify what needed fixing during runtime. We also had to make sure 32-bit and 64-bit platforms could read the same data layout, so everything had to be wrapped within our structures.

This system had a few advantages:

Object instantiation was virtually free, requiring just a memory copy and pointer fixing.
There was no need for pools of objects, since taking objects from pools took about the same amount of time as instantiating a new object.
Switching scenes was nearly instantaneous.

However, there were also a few downsides:

Data for a scene/prefab had to be rebuilt every time something changed.
It was hard to debug when something went wrong (e.g. data not rebuilt properly).
Whenever we wanted to support a new data structure container, we had to make our own implementation for it.
It required a different system for transferring data from/to the engine when working with the editor, more on that later.

Overall, the in-place build was one of the main advantages of Starlite and it showed in the performance of the games we built with it.
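The "memory copy, then fix the pointers" idea from this section can be sketched as follows. This is only an illustration under simplifying assumptions, not Starlite's implementation: BuiltPtr, BuiltBlob, BuildList, and Instantiate are invented names, pointers are stored as offsets from the start of the blob at build time, and every pointer slot is 8 bytes wide so that 32-bit and 64-bit platforms would see the same layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint64_t kNullOffset = ~std::uint64_t{0};  // "null" in built data

// Pointer slot with a fixed 8-byte width on every platform.
template <typename T>
struct BuiltPtr {
    std::uint64_t raw;  // blob offset when built; absolute address after fix-up
    T* Get() const { return reinterpret_cast<T*>(static_cast<std::uintptr_t>(raw)); }
};

struct Node {
    std::int32_t value;
    std::int32_t padding;  // keeps the layout explicit
    BuiltPtr<Node> next;
};

struct BuiltBlob {
    std::vector<unsigned char> bytes;         // object data, pointers stored as offsets
    std::vector<std::uint32_t> pointerSlots;  // byte offset of every pointer slot
};

// Build step (offline): lays out a linked list and records each pointer slot.
inline BuiltBlob BuildList(const std::vector<std::int32_t>& values) {
    BuiltBlob blob;
    blob.bytes.resize(values.size() * sizeof(Node));
    for (std::size_t i = 0; i < values.size(); ++i) {
        Node node{};
        node.value = values[i];
        node.next.raw = (i + 1 < values.size())
                            ? static_cast<std::uint64_t>((i + 1) * sizeof(Node))
                            : kNullOffset;
        std::memcpy(blob.bytes.data() + i * sizeof(Node), &node, sizeof(Node));
        blob.pointerSlots.push_back(
            static_cast<std::uint32_t>(i * sizeof(Node) + offsetof(Node, next)));
    }
    return blob;
}

// Runtime step: one memory copy, then turn each recorded offset into an
// absolute address inside the destination buffer.
inline Node* Instantiate(const BuiltBlob& blob, std::vector<unsigned char>& dest) {
    dest = blob.bytes;  // the memory copy
    const auto base = reinterpret_cast<std::uintptr_t>(dest.data());
    for (std::uint32_t slot : blob.pointerSlots) {
        assert(slot + sizeof(std::uint64_t) <= dest.size());
        std::uint64_t raw;
        std::memcpy(&raw, dest.data() + slot, sizeof(raw));
        raw = (raw == kNullOffset) ? 0 : static_cast<std::uint64_t>(base) + raw;
        std::memcpy(dest.data() + slot, &raw, sizeof(raw));
    }
    // Simplification: the byte buffer is treated as an array of Node.
    return dest.empty() ? nullptr : reinterpret_cast<Node*>(dest.data());
}
```

Instantiating a second copy of the same list is just another call to Instantiate with a different destination buffer, which is why no reflection or per-field deserialization is needed at runtime.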

The Asset System

Each asset in Starlite has a GUID, a 16-byte integer that’s unique within the project. We have several types of assets, with the most common ones being:

Scene/Prefab
Mesh
Animation
Texture
Font
Shader
Data

Assets can be linked by their GUID within the components. A field that links to an asset can be annotated in code so that the asset isn’t preloaded when the prefab/scene is loaded. This greatly simplifies memory management and asset organization. Asset dependencies are resolved at build time, so assets not used in the game are never included. The engine also takes care of loading such assets, along with all their required dependencies. Our game developers never need to worry about whether something is included in the build. If something is taking up too many resources, they can simply mark it as non-preload and load it manually later.

In comparison, engines like Unity do not support this — the closest approximation would be Unity’s Resources folder, which you can dynamically load through string-based addressing. But everything in the Resources folders is included in the build, so you’re forced to use asset bundles, which sometimes work in mysterious ways.
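The build-time dependency resolution and the non-preload annotation described above could look roughly like this. The sketch rests on assumptions: the AssetLink and AssetGraph types and the CollectDependencies function are made up, and GUIDs are shortened to 64 bits to keep the example small.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Guid = std::uint64_t;  // shortened for the sketch; real GUIDs are 16 bytes

struct AssetLink {
    Guid target;
    bool preload;  // false models a field annotated as non-preload
};

// For every asset, the links found in its components.
using AssetGraph = std::map<Guid, std::vector<AssetLink>>;

// Collects every asset reachable from `root`. With preloadOnly set, links
// marked as non-preload are not followed.
inline std::set<Guid> CollectDependencies(const AssetGraph& graph, Guid root,
                                          bool preloadOnly) {
    std::set<Guid> visited;
    std::vector<Guid> stack{root};
    while (!stack.empty()) {
        const Guid current = stack.back();
        stack.pop_back();
        if (!visited.insert(current).second) continue;  // already handled
        auto it = graph.find(current);
        assert(it != graph.end() && "link points at an asset missing from the project");
        if (it == graph.end()) continue;
        for (const AssetLink& link : it->second) {
            if (preloadOnly && !link.preload) continue;
            stack.push_back(link.target);
        }
    }
    return visited;
}
```

Calling the function with preloadOnly set to false yields everything that has to ship in the build, while calling it with true yields only what must be in memory when the scene loads. An asset that nothing links to appears in neither set.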

Component System

Starlite uses a component-based system to drive game logic. It has components like MeshRenderer, Animator, PhysicsBody, etc. This has been proven to work in other engines as well, including Unity.

Rendering

Starlite uses the BGFX library under the hood to render everything. We had to implement our own layer that converts our data into a form BGFX accepts and manages the order of operations and renderers. By default we now use multithreaded rendering, since modern mobile devices have multiple CPU cores, meaning there are no real drawbacks anymore. Starlite supports particle systems, skinned mesh rendering and blendshapes, font rendering, and UI rendering, among other things.

Animation

Starlite uses the ozz library for animation workloads. It has proven to be very good, as the animations are heavily compressed, and it’s efficiently parallelized. We also ended up porting this library to Unity, which helped reduce animation sizes. However, we will need a separate blog for that topic.

Sound

For sound we’ve used several different implementations. Currently, we’re using Soloud, an open source library. We also added support for audio mixers, with an interface similar to Unity’s audio mixers. In addition, projects can plug in custom audio implementations such as FMOD or Wwise, since our audio artists wanted more control over playback and a better user experience while making the sounds.

Performance

Both the engine and the game logic are written in C++, which lets us customize data structures, control where memory allocations occur, and avoid inefficient access patterns. As a result, performance is generally very good. We use a reference counting system to track pointers and objects so that game developers don’t have to manage memory allocations manually. The resulting syntax is very similar to C#.
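As an illustration of what a reference counting system of this kind can look like, here is a small intrusive sketch. It is an assumption about the general technique, not Starlite's code: RefCounted, Ref, and Texture are hypothetical names, and the counter is not thread-safe.

```cpp
#include <cassert>
#include <utility>

// Base class that stores the reference count inside the object itself.
class RefCounted {
public:
    void AddRef() { ++refCount_; }
    void Release() {
        assert(refCount_ > 0);
        if (--refCount_ == 0) delete this;  // last owner is gone
    }
    int RefCount() const { return refCount_; }

protected:
    virtual ~RefCounted() = default;  // destroyed only through Release()

private:
    int refCount_ = 0;
};

// Smart handle that adds a reference on copy and drops it on destruction.
template <typename T>
class Ref {
public:
    Ref() = default;
    explicit Ref(T* p) : ptr_(p) { if (ptr_) ptr_->AddRef(); }
    Ref(const Ref& other) : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    Ref(Ref&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    ~Ref() { if (ptr_) ptr_->Release(); }
    Ref& operator=(Ref other) noexcept {  // handles both copy and move
        std::swap(ptr_, other.ptr_);
        return *this;
    }
    T* operator->() const { return ptr_; }
    T* Get() const { return ptr_; }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    T* ptr_ = nullptr;
};

// Example object; the external counter only exists to observe its lifetime.
struct Texture : RefCounted {
    explicit Texture(int* liveCounter) : live(liveCounter) { ++*live; }
    ~Texture() override { --*live; }
    int* live;
};
```

Game code holds Ref<Texture> values and never calls delete. The texture is destroyed when the last handle goes out of scope.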

Starlite outperforms other commercially available engines by 10–20% in most areas. This is because it’s built primarily for mobile devices and, as such, the focus has always been on performance and non-generic solutions.

Application Size

Application size is also something we wanted to optimize as much as we could, so our games are smaller than they would be if built with other engines. The extent of the reduction depends on the game’s content, but it generally hovers around 25%.

The base APK size (i.e. without user-added assets) on Android is 4MB in Starlite but 13MB in Unity. This was achieved by using better compression tools/settings and also partly because of the in-place build system. Application size is really important on mobile platforms because each megabyte you save increases the likelihood of the application being downloaded and also makes it less likely that the application will be removed from the device in future.

The Editor

Starlite editor is written in C# and supports Windows, Linux, and MacOS platforms. It’s run as a separate process from the engine so that if the engine crashes, the editor still survives.

Asset Management

Each asset in the project has a unique GUID assigned to it. As mentioned before, this is a 16-byte integer that’s unique within a particular project. Assets can also contain other objects, such as scene objects, which then require another layer of IDs. These are called file IDs. With this system in place, each object within the project can be found with a combination of GUID and file ID.
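A minimal sketch of addressing objects by GUID plus file ID is shown below. ObjectId and ObjectIndex are invented names, and the 64-bit width of the file ID is an assumption, since only the GUID size is stated above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

using Guid = std::array<std::uint8_t, 16>;  // 16-byte asset identifier

struct ObjectId {
    Guid guid;             // which asset
    std::uint64_t fileId;  // which object inside that asset (width assumed)
};

// Orders by GUID first, then by file ID, so ObjectId can be used as a map key.
inline bool operator<(const ObjectId& a, const ObjectId& b) {
    return std::tie(a.guid, a.fileId) < std::tie(b.guid, b.fileId);
}

class ObjectIndex {
public:
    void Register(const ObjectId& id, std::string name) {
        const bool inserted = objects_.emplace(id, std::move(name)).second;
        assert(inserted && "GUID + file ID must be unique within the project");
        (void)inserted;
    }
    // Returns nullptr when no object with this ID is known.
    const std::string* Find(const ObjectId& id) const {
        auto it = objects_.find(id);
        return it == objects_.end() ? nullptr : &it->second;
    }

private:
    std::map<ObjectId, std::string> objects_;
};
```

Two objects in different assets may share a file ID, because the GUID disambiguates them.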

Starlite editor supports many types of assets, including (but not limited to):

Images (2D textures, cubemaps)
Audio
FBX files
Skeletons
Meshes
Animations
Data files (materials, physics materials, custom game data objects)
Scenes, prefabs

All the information about asset locations, types, last modification dates, and other properties is stored in a sqlite3 database that’s heavily optimized for fetching data. This duplicates the data on the hard drive (i.e. file contents) but on platforms that have slow file systems (e.g. Windows), fetching data from the database is much faster than opening and reading files. Editor users can also opt out of storing duplicate data, but, due to space constraints, only one-time build systems have used this option up to this point.

This database also stores outputs of asset processing, such as built meshes or textures for a specific platform. Whenever a user switches the target platform, the operation to reimport assets is extremely fast and can even be instant if there are no changes.

Importing

When opening a project in Starlite editor, we first identify which files are new or changed, and which files were deleted. This is done first through file timestamps, and if they are different from last seen timestamps stored in the database, we re-import them. At this point, we could check for hash changes as well, but this step wasn’t necessary in our use cases and would only have slowed down the import process.
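The timestamp comparison can be sketched as a pure function over two maps. The names TimestampMap, ImportPlan, and PlanImport are hypothetical, and in a real editor the two maps would come from a directory scan and from the database.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using TimestampMap = std::map<std::string, std::int64_t>;  // path -> last write time

struct ImportPlan {
    std::vector<std::string> toImport;  // new or changed files
    std::vector<std::string> deleted;   // known to the database, gone from disk
};

// Compares what is on disk with what the database saw last time.
inline ImportPlan PlanImport(const TimestampMap& onDisk, const TimestampMap& inDatabase) {
    ImportPlan plan;
    for (const auto& entry : onDisk) {
        auto it = inDatabase.find(entry.first);
        // Any difference in the timestamp counts as a change; no hashing is done.
        if (it == inDatabase.end() || it->second != entry.second) {
            plan.toImport.push_back(entry.first);
        }
    }
    for (const auto& entry : inDatabase) {
        if (onDisk.find(entry.first) == onDisk.end()) {
            plan.deleted.push_back(entry.first);
        }
    }
    assert(plan.toImport.size() <= onDisk.size());
    assert(plan.deleted.size() <= inDatabase.size());
    return plan;
}
```

When nothing has changed, both lists come back empty and no import work is scheduled.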

Once we have a list of assets to import, we run jobs that scale according to the computer’s CPU core count. Since each asset is imported on its own thread and the data is stored in the database, the import process itself is extremely fast compared to other publicly available engine editors. At this point, we compile the scripts so that the project code is loaded into the assembly. We compile C# and C++ simultaneously to simplify the process.

Each type of asset gets its own asset processor, allowing project developers to customize the import pipeline however they wish. We have a lot of post processors for projects so that assets are imported in a standardized way, preventing human error.

Once an asset is processed, it’s put in a queue to write to the database. This is a very memory-intensive operation, since we store our results in memory. Permanent storage is often the bottleneck in this situation. Even the fastest SSDs can’t handle the throughput of high core count CPUs.

Once all the data is written in the database, we also scan the scene and prefab files, index them and update their dependencies.

This whole process takes about five minutes on a 16-core CPU for a project with more than 21,000 top-level assets (e.g. scenes, meshes) and 12GB of data on the first run. Each subsequent opening of the project usually takes just a few seconds, unless there are many changes.

Engine Interoperability

As mentioned above, the engine and the editor live in their own processes. We developed a system allowing the two to communicate, using shared memory or TCP sockets. Locally, the engine and editor use the shared memory system, since it’s much faster than the TCP sockets. Shared memory uses locks to communicate when data is available for reading or writing. The interface is very similar to writing normal code with DLL calls (e.g. int a = GetIntA();). Under the hood we take care of telling the engine what object and offset to read from or write to through the custom reflection system we built for the engine and editor.

Since we also support TCP sockets, we implemented a method of connecting to the engine on a mobile device through the editor. All the hierarchy and properties were synced this way, and you could even rebuild and replace shaders in runtime. This was really useful, since many Android phones have particular rendering quirks.

One of the major challenges here was how to optimize the data and delta checking in such a way that we wouldn’t bottleneck the CPU or memory operations (lock contention). We implemented several systems to minimize the amount of data the engine sent to the editor, by hashing objects that had already been sent and then rechecking the hash on each subsequent send.
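The hash-and-recheck idea can be illustrated with a small sketch. It assumes objects are already serialized into a byte string and uses FNV-1a as a stand-in hash. DeltaSender and ShouldSend are invented names, and the hash function Starlite actually uses is not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// FNV-1a, 64-bit: a simple non-cryptographic hash, chosen only for illustration.
inline std::uint64_t Fnv1a(const std::string& bytes) {
    std::uint64_t hash = 0xcbf29ce484222325ull;  // offset basis
    for (unsigned char c : bytes) {
        hash ^= c;
        hash *= 0x100000001b3ull;  // FNV prime
    }
    return hash;
}

class DeltaSender {
public:
    // Returns true when the object is new or has changed since it was last
    // sent, and remembers the new hash for the next check.
    bool ShouldSend(std::uint64_t objectId, const std::string& serialized) {
        assert(objectId != 0 && "0 is reserved for 'no object' in this sketch");
        const std::uint64_t hash = Fnv1a(serialized);
        auto it = lastSent_.find(objectId);
        if (it != lastSent_.end() && it->second == hash) return false;
        lastSent_[objectId] = hash;
        return true;
    }

private:
    std::unordered_map<std::uint64_t, std::uint64_t> lastSent_;  // id -> hash
};
```

Only the hash of the last sent state is kept per object, so the memory cost on the sending side stays small even for large hierarchies.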

We also implemented an event system in the engine so that it would know when to send new objects, meaning we didn’t have to recurse through the hierarchy with every frame. On the editor side, we also had to perform similar optimizations because each property set in the editor would trigger various callbacks that would consume unnecessary CPU cycles.

Play mode

Users could also enter play mode within the editor, which would be exactly the same as building the game and running it, logic-wise. The only difference was how assets were built. In this mode they were built on-demand rather than all at once. This sped up the launch process significantly.

We also cached all the built assets for this mode in the database, so if there were no asset changes, nothing had to be rebuilt before entering play mode. After pressing play, it would often take no more than three seconds to enter the game, even though the process entailed restarting the engine process and loading all the assets again. This would not be possible without the in-place build system and various optimization tricks.

All the other editor functionality would work the same as in edit mode, with the exception of prefabs, which would no longer be linked to prefab assets in play mode.

My Talking Angela 2 running in play mode in the editor.

UI

The UI for Starlite editor is built using the IMGUI open source library. We decided to use immediate mode GUI because we had to transfer some projects built in other engines to Starlite and we wanted all the custom editor scripts to be simple to reimplement. There were many challenges with this approach, since the immediate mode GUI is not suited for building complex UI.

Starlite editor supports custom layouts, drag-and-drop of files, assets, and other objects, dockable windows, separable windows, and utilizes BGFX to render everything, just like the engine.

C# Bindings

We built a custom reflection system for the Starlite engine and a custom tool, called Headertool, for parsing custom attributes in C++ header files. These attributes are applicable on fields, functions, classes, and structures. They define how the annotated members are stored in reflection (if at all) and how they are exported to C# as bindings. These bindings are required for the editor to work as all the data is stored in C#, not C++. They also take care of syncing the data from the editor to the engine.

Headertool uses the LLVM toolchain to parse C++ code and build syntax trees of the classes, and then uses this information to deduce the types of members. Using a custom remapping file (e.g. how to remap a link to an asset in C++ to a link to an asset in C#), we then generate the necessary C# files.

Example of a C#-generated property from C++ definition.

Headertool also generates hashes of each class so that we know if both the engine and the editor can communicate with each other. This is done by generating a hash of the whole type tree and then sending this hash right after the connection between the engine and editor has been established. Everything is automated to prevent as much user error as possible.
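A compatibility check of this kind could be sketched as follows. The TypeInfo and FieldInfo structures, the HashTypeTree function, and the choice of FNV-1a are all assumptions made for the example. The sketch only shows the principle of hashing every type name, field name, and field type into one value that both sides compare after connecting.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct FieldInfo {
    std::string name;
    std::string type;
};

struct TypeInfo {
    std::string name;
    std::vector<FieldInfo> fields;
};

// Mixes one string into the running FNV-1a style hash.
inline void HashString(std::uint64_t& hash, const std::string& s) {
    for (unsigned char c : s) {
        hash ^= c;
        hash *= 0x100000001b3ull;
    }
    hash ^= 0xff;  // separator, so "ab" + "c" and "a" + "bc" hash differently
    hash *= 0x100000001b3ull;
}

// Hashes the whole type tree in declaration order.
inline std::uint64_t HashTypeTree(const std::vector<TypeInfo>& types) {
    std::uint64_t hash = 0xcbf29ce484222325ull;
    for (const TypeInfo& type : types) {
        assert(!type.name.empty());
        HashString(hash, type.name);
        for (const FieldInfo& field : type.fields) {
            HashString(hash, field.name);
            HashString(hash, field.type);
        }
    }
    return hash;
}

// Sent right after connecting: both sides must agree before syncing data.
inline bool HandshakeOk(std::uint64_t engineHash, std::uint64_t editorHash) {
    return engineHash == editorHash;
}
```

If someone changes a field's type in a C++ header without regenerating the C# bindings, the two hashes no longer match and the connection can be refused before any data is misread.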

User Code Assembly

The Starlite editor supports the latest Microsoft Visual Studio (Windows), Visual Studio Code (all platforms), and Xcode (MacOS) for code editing. All user code in C# lives in two editor assemblies. The first assembly is a so-called “startup assembly.” Code here is compiled and run before any code is visible from the second assembly. This is mostly used for asset preprocessing or code generation during project startup.

The second assembly is the project’s main assembly. It’s dynamically compiled and loaded whenever a code change is detected in C# or C++. This posed a few challenges, such as how to unregister user callbacks in the editor’s assembly, since keeping references to the user assembly would prevent it from being unloaded.

We implemented a bunch of custom delegate wrappers that could handle this, as we wanted our game developers to be able to code without the burden of unassigning from delegates and other menial tasks. Of course, we then also had to take extra care in storing user assembly’s types anywhere in the editor’s assembly, which proved to be quite annoying. For this purpose, we built a custom Roslyn analyzer to detect obvious errors. We also used WinDbg extensively for dumping the C# heap and detecting static references or reference chains to the project’s assembly.

Issues

The first game we built on Starlite was a challenge. There were a lot of bugs that we didn’t catch during the development of the engine and editor. And since bugs in the editor prevent game developers and artists from doing their work, it was our top priority to fix them. This process did lead to some frustration, but at this point most of the bugs have been fixed.

Currently, one of the major issues with Starlite is the user experience of the editor. Since its UI is built with IMGUI, layout work is a big problem. Building a complex UI is difficult and maintaining it is time-consuming. Some of our artists are also a bit unhappy with the look and feel of the editor.

Overall, maintaining such a big project and integrating custom solutions is quite time-consuming (e.g. tools like Bugsnag provide support for Unity, but Starlite needed custom implementations). Since we can use other engines that are proven to work on all platforms, diverting that manpower to game development might be a better option. On the other hand, those engines come with their own problems, which may be unfixable, especially if there’s no source code (Unity now shares source code with Enterprise users, which enables us to understand it better and carry out optimizations we couldn’t do before). It’s also harder to find developers with adequate knowledge of C++ than developers who can use C#.

We’ve learned a lot from this project. If we could change anything, it would likely be to remake the editor UX and choose a UI framework other than IMGUI. We’d also create C# bindings so that game developers could write runtime code in C# as well. This would greatly reduce compile times and allow for faster development. Apart from these elements, everything else was pretty much spot-on, based on our experience thus far.

Conclusion

Starlite has proven to be a very good engine for mobile platforms. The games we’ve built on Starlite are some of our best performers, including My Talking Tom 2, My Talking Tom Friends and My Talking Angela 2. Game developers enjoy using it because it allows them full control of the code and they’re able to check source code or build it for themselves whenever something goes wrong. It’s also written in a way that every developer can pick up quickly, with a UI similar to a lot of other commercially available engines/editors.