What's this all about?
This page contains the many ramblings I've spread across various internet sites over the last 5 years or so (with corrections and extensions). I've been building Reflection APIs since 2003 when I saw a need for one on the ill-fated Advent Shadow game for PSP, and would really love to condense everything I've learned into a coherent set of documents that others can learn from. Unfortunately the subject is huge when applied to games and I'm just too busy with work and other stuff to organise it all.
I've implemented the "simple as pie" method documented below and you can find it here: http://bitbucket.org/dwilliamson/reflectabit/
This contains everything up to the binary serialisation with versioning but leaves out a few things such as endian processing (necessary for network communication) and the object database (pointer serialisation). The work required to get this functional enough to ship a game is minimal (I used a similar solution when working on the Splinter Cell Conviction
The PDB method for generating reflection information is quite literally broken and doesn't compile after a mammoth refactoring session that was never finished. Luckily, thanks to the magic of source control, I've zipped up a version of it all from before the craziness took over: http://donw.org/Rfl.zip
You'll have to excuse the name of the hosting "game"! It was initially meant to be a conversion of PDB files to an SQLite DB but quickly diverged (hence the app name).
Reflection
A reflection API is a very basic, powerful and important tool that every game studio should have at their disposal. This is what I'd consider a reasonably featured-up C++ Reflection implementation capable of:
- Holding a database of types and their inheritance relationship toward each other.
- Storing a list of raw data members per type, each with name/type/offset tuples.
- Storing enumerations with their parent type and name/value pairs.
- Storing functions/methods and their parameter lists, giving you the ability to request them by name and call them using a generalised parameter passing mechanism (for piping to scripts, RPC, etc).
- Storing properties, in the sense of Get/Set function/method pairs that look like average data members in your editor.
- Storing "attributes" next to any of the above. I've used name/string pairs in the past but there's no reason you can't go as far as C# does.
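To make the list above a bit more concrete, here is a minimal sketch of the kind of structures such a database might hold. All of the names (`Type`, `Field`, `EnumConstant`, `FindField`, `FieldRef`) are made up for illustration, and it only covers types, single inheritance, data members and enum constants; functions, properties and attributes are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names throughout; this is an illustration, not a real library.
struct Type;

// One raw data member: a name/type/offset tuple.
struct Field {
    std::string name;
    const Type* type;    // reflected type of the member
    std::size_t offset;  // byte offset from the start of the owning object
};

// One name/value pair of an enumeration.
struct EnumConstant {
    std::string name;
    int value;
};

// One entry in the type database.
struct Type {
    std::string name;
    std::size_t size = 0;
    const Type* base = nullptr;           // single inheritance only in this sketch
    std::vector<Field> fields;
    std::vector<EnumConstant> constants;  // non-empty only for enum types
};

// Look a field up by name, walking up the inheritance chain.
inline const Field* FindField(const Type& type, const std::string& name) {
    for (const Type* t = &type; t != nullptr; t = t->base)
        for (const Field& f : t->fields)
            if (f.name == name) return &f;
    return nullptr;
}

// Access a member through its reflected offset.
// The caller is responsible for asking for the correct T.
template <typename T>
T& FieldRef(void* object, const Field& field) {
    assert(object != nullptr);
    return *reinterpret_cast<T*>(static_cast<char*>(object) + field.offset);
}
```

With a `Type` filled in for some struct (offsets taken from `offsetof`), `FieldRef<float>(&object, *FindField(type, "y"))` reads or writes the member without the calling code knowing the struct's definition, which is the basis for serialisation and property editors.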
In the past I've also reflected events and introduced the notion of "type extension" to remotely extend existing types in a similar spirit to a few AOP techniques. They were pretty complex problems and tbh, the type extension was over-engineered.
These are some of the options open to you if you have a Reflection API:
- Serialisation of any game type.
- Versionable serialisation of data.
- Network communication through serialisation and RPC.
- Binding to arbitrary scripting languages (including C#) through minimal translation layers.
- Inspect game state of any object at runtime.
- Memory mapping of data formats with post-load pointer patching.
- Automatically populate user interfaces for editing tools.
- Dynamic reloading of C++ source code.
There are too many ways to do reflection with C++:
1. Using macros inline with your source.
2. Using templates inline with your source.
3. Doing either of the above non-intrusively.
4. Using an IDL/DDL to generate cpp/h files.
5. Merging the lot into a scripting solution and generating cpp/h files from there.
6. Performing a post/pre-process of your cpp code.
7. Extending 6 to a link-time post-process to catch function/method addresses.
8. Mixing 6 with some cpp generation to catch function/method addresses.
Some of them are just plain nasty, and I'm continually surprised that Boost hasn't got its own wildly over-engineered version of 3, as they already have all the code necessary in Boost.Python. Each will always be limited in some way. Even though some don't have an on-disk database (it's all generated at runtime), this won't be a problem for your tools, as you can either link to the needed modules or just send your type database over the network.
In the past I've done this two ways in production code and another, more transparent/powerful way at home...
The Uber Solution
I can't remember any of the source code so this will be an approximation. It all starts with an IDL file, for example:
Code:
import "SomeFile";

A, B, C, D = 0x54 * 7

interface iStuff : iBase
    int data_member [transient];
    property int GeneratesGetSetPair;
    property char GenerateGet [readonly];
    method void Hello(Blah x) [call_on_postload];
    event Borked(int x, string blah);
This would generate:
- An XML file for the editors to read/write level data (the only client was the Lightwave-based level editor).
- A header file which was just a translation of the interface into C++.
- A source file that would register all this data with the runtime database on startup.
There were extensions that allowed you to drag in types from 3rd party libraries and specify their included headers (you can't get away without this really) and you could also use a C++ API to do your own registration. You would then inherit from the interface with a specific implementation and implement the methods. There were many reasons for this design which I could go on about, but it had shortcomings: it was a bit confusing and the overuse of virtual methods should have been avoided. A more concrete DDL approach would have solved this (à la UnrealScript but without the script bit and horrible interdependencies and changing one file rebuilds all and yadda yadda - owww it still hurts).
The system was used for:
- XML-based serialisation of all world data (ended up a huge loadtime bottleneck).
- Auto-reload of any externally modified world data (e.g. change layout of current level while running the game).
- Binding C++ to Lua script.
- Binding C++ to C# (this was experimental and was never used to write any tools, though it worked well).
- Editing of objects in the "level editor."
In short, I believe the solution was over-engineered in a lot of ways (it was incredibly template-heavy and I also reflected the reflection API, *shudder*) but it worked pretty well and showed me that when done right, a DDL approach could be simple and good enough for any studio.
The Simple-as-pie Solution
The next one was very simple. It only reflected data types, enumerations, data members and a pre-defined set of fixed attributes in place of a generalised attribute system. It was purely code-based and non-intrusive, with all registration performed outside a type's implementation. Again, it was template based (no macros anywhere) and looked something like this:
Code: Property props =
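As a rough illustration (a sketch with hypothetical names such as `TypeBuilder`, `TypeInfo` and `Property::address`, not the original code), non-intrusive template registration can look something like this. It assumes plain data members only and stores a type-erased accessor instead of a raw offset to stay within well-defined C++.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical reflection records for this sketch.
struct Property {
    std::string name;
    std::size_t size;                     // sizeof the member
    std::function<void*(void*)> address;  // object pointer -> member pointer
};

struct TypeInfo {
    std::string name;
    std::vector<Property> properties;
};

// Fills in a TypeInfo from pointers-to-members. No macros, and the
// reflected type itself is never modified.
template <typename T>
class TypeBuilder {
public:
    TypeBuilder(TypeInfo& info, const char* name) : info_(info) { info_.name = name; }

    template <typename M>
    TypeBuilder& Member(const char* name, M T::*member) {
        info_.properties.push_back(Property{
            name, sizeof(M),
            [member](void* object) -> void* {
                assert(object != nullptr);
                return &(static_cast<T*>(object)->*member);
            }});
        return *this;
    }

private:
    TypeInfo& info_;
};

// A game type that knows nothing about reflection...
struct Player {
    int health;
    float speed;
};

// ...and its registration, kept entirely outside the type's implementation.
inline TypeInfo DescribePlayer() {
    TypeInfo info;
    TypeBuilder<Player>(info, "Player")
        .Member("health", &Player::health)
        .Member("speed", &Player::speed);
    return info;
}
```

The template deduces each member's type from the pointer-to-member, so the registration code only has to spell out names.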
When you start dealing with get/set properties and function/method reflection your template-fu requires an order-of-magnitude increase in complexity if you want to do it cross-platform (you can use your platform's ABI to remove most of it if you're feeling heroic - Scott Bilas covers some interesting ideas on his homepage). In this case it was more than manageable.
It was used for:
- Binary serialisation of all data.
- Network communication for editor/debugging/profiling tools (auto property edit, stats display/graphing, etc). All I actually did was reuse the binary serialisation code but dump it into a network byte stream. Property editing was done by sending the type database over the network each time a new client connected (a few KB).
- Ability to edit your C++ code while the game is running, having it auto-reload on successful compile without losing the current game state (far more powerful than edit-and-continue).
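The binary serialisation and its reuse for network traffic can be sketched roughly as below. This is a minimal example under stated assumptions: `FieldDesc`, `ByteStream`, `WriteObject` and `ReadObject` are hypothetical names, fields are plain-old-data only, and no endian conversion or versioning is attempted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-member description, as a reflection database might provide.
struct FieldDesc {
    const char* name;
    std::size_t offset;  // byte offset of the member inside the object
    std::size_t size;    // size of the member in bytes
};

// The same buffer can be written to a file or pushed down a socket.
using ByteStream = std::vector<std::uint8_t>;

// Append every reflected field of the object to the stream, in field order.
inline void WriteObject(const void* object, const std::vector<FieldDesc>& fields,
                        ByteStream& out) {
    const auto* bytes = static_cast<const std::uint8_t*>(object);
    for (const FieldDesc& f : fields)
        out.insert(out.end(), bytes + f.offset, bytes + f.offset + f.size);
}

// Read the fields back in the same order, starting at pos.
// Returns the position just past the data that was consumed.
inline std::size_t ReadObject(void* object, const std::vector<FieldDesc>& fields,
                              const ByteStream& in, std::size_t pos) {
    auto* bytes = static_cast<std::uint8_t*>(object);
    for (const FieldDesc& f : fields) {
        assert(pos + f.size <= in.size());
        std::memcpy(bytes + f.offset, in.data() + pos, f.size);
        pos += f.size;
    }
    return pos;
}
```

Because the reader and writer are driven by the same field table, neither needs any per-type code; sending the table itself to a tool is what lets it edit properties it has never been compiled against.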
I really, really like this solution. It took 2 days to implement, is only a few hundred lines of code and was very stable, requiring very minimal updates to its implementation as more code started using it. And it was very fast - it's absolutely perfect for small teams on a fixed budget that are prepared to just shut up about coding miracle cures and get on with their work.
PDB Method
The idea here is that all you need to do is specify what types and functions you want reflecting and the information required to do that is automatically deduced from the PDB file generated after compilation. It can be very powerful but I've not put any of this through production code; there may be big problems. Everything you need is reflected with minimal C++ code intrusion and automatically: data types, members, enums, functions, parameters, templates, etc. Some quick points to note about this approach:
- Through load-time pointer patching, you can employ this technique to get a constant-time typeof operator that returns your fully reflected type pointer upon request.
- You can add attributes to the C++ language. VC already supports attributes but the API for using them is undocumented and the implementation is too statically typed. I managed to settle on an interesting solution that used the __annotation intrinsic (also undocumented) and the Spart C# parser.
- You can use your platform's ABI to construct language bindings without adding large amounts of generated template code to your final build (e.g. see Boost.Bind/Function or LuaBind). When I wrote the Tahiti game language and VM, I demonstrated to myself that it was viable so I feel confident this approach can be applied to the PDB reflection scheme.
- Parsing the PDB file quickly is hard and required many attempts to get right. There is no guarantee, however, that the implementation I've written will perform consistently between versions of Dia2Lib. A test of the current implementation on an actual 25MB+ game executable returned results in 5-10 seconds.
- Between build configurations the functions that are exported from a class can change if the compiler decides to optimise them out.
- In general you have to watch what the optimiser does and actively try to stop it getting rid of functions that you have manually specified for reflection.
- Your collection serialisation becomes intrusive. While this drastically reduces the size of the code that gets generated during reflection, it does require your serialisation to depend on the collection implementation. Check out the source for the std::vector implementation to see how this is achieved.
- You have to edit your header/source files in order to change the definition of your type. For some this may not be desirable.
- Reflecting templates is a Very Tough problem and is probably best avoided, except for collections or smart pointers (if you use them).
When I started this system I was convinced it was simple. However, the PDB application is getting big and contains many nuances that rival the initial implementation of my old IDL compiler.
What would be nice is some help from the compiler authors at Microsoft and Sony. They have full access to this information and could output it in such a way that we could rely upon it to generate this data. This kind of support would yield quite measurable productivity gains, especially when applied to the next topic...
Dynamic C++ Code Reloading
This is not just edit-and-continue, but the ability to edit large sections of your C++ code base without having to shut down the game and reload the compiled executable. The repetitive cycle of compiling, linking, loading the executable, loading the level and reproducing the steps that put you in a position to test your feature can account for many hours of lost time, frustrated developers and gameplay that's worse than it should have been. Games and their engines have been putting in "hot-syncing" features for their assets or scripts for many years now and no production cycle is complete without them. Some games switch to scripting languages purely for the fast iteration and script reloading functionality - why can't we do this with the rest of our C++ code?
Assuming you have a good Reflection API, you can:
- Wrap all your classes in interfaces (no need for your very base libraries to have this).
- Embed your code in reasonably partitioned DLLs.
- Store object names as 32-bit CRCs.
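For the object names, a minimal sketch of a 32-bit CRC is shown below. It uses the common reflected polynomial 0xEDB88320 (the variant used by zlib and PNG); the text doesn't say which CRC was actually used, so treat that choice as an assumption. `NameTable` is a hypothetical helper added to catch the one real hazard of hashed names: two different names sharing a CRC.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
// Slow but tiny; a table-driven version is the usual optimisation.
inline std::uint32_t Crc32(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char c : text) {
        crc ^= c;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: remembers which name produced each CRC so that
// collisions are caught in debug builds instead of aliasing two objects.
class NameTable {
public:
    std::uint32_t Register(const std::string& name) {
        const std::uint32_t id = Crc32(name);
        const auto result = names_.emplace(id, name);
        assert(result.first->second == name && "CRC collision between two object names");
        (void)result;  // silence the unused warning when asserts are compiled out
        return id;
    }

    // Returns the original name for an id, or nullptr if it was never registered.
    const std::string* Find(std::uint32_t id) const {
        const auto it = names_.find(id);
        return it == names_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::uint32_t, std::string> names_;
};
```

Keeping the id-to-name table around in development builds also means tools and logs can print readable names instead of raw hashes.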
When your game detects a DLL change, do the following:
- Run through all properties of objects that point to objects of types in the changed DLL. Replace the pointers with the CRCs of those objects.
- Run through all objects of types that belong to the changed DLL and backup all values (enumerate properties and store property/value pairs in a temp buffer).
- Release all objects of those types.
- Reload the DLL.
- Recreate all objects and restore their properties.
- Replace all pointer CRCs with pointers to the new objects.
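The steps above can be sketched as a single function. This is a simulation under loud assumptions: `Object`, `World`, `Factory` and `ReloadModule` are hypothetical names, properties are plain integers in a map, each object has just one pointer property, an id of 0 means "no reference", and the actual DLL unload/load is stood in for by a callback that swaps the factories.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

struct Object {
    std::uint32_t id = 0;                   // e.g. CRC of the object name
    std::string type;                       // reflected type name
    std::map<std::string, int> properties;  // stand-in for reflected values
    Object* target = nullptr;               // a single reflected pointer property
    std::uint32_t targetId = 0;             // holds the id while the pointer is unpatched
};

using Factory = std::function<std::unique_ptr<Object>()>;

struct World {
    std::vector<std::unique_ptr<Object>> objects;
    std::map<std::string, Factory> factories;  // type name -> constructor from the module
};

inline void ReloadModule(World& world, const std::set<std::string>& changedTypes,
                         const std::function<void(World&)>& reloadDll) {
    auto isChanged = [&](const Object* o) {
        return o != nullptr && changedTypes.count(o->type) != 0;
    };

    // 1. Replace pointers into the changed module with the target's id.
    for (auto& o : world.objects)
        if (isChanged(o->target)) { o->targetId = o->target->id; o->target = nullptr; }

    // 2. Back up every object whose type lives in the changed module.
    std::vector<Object> backups;
    for (auto& o : world.objects)
        if (isChanged(o.get())) backups.push_back(*o);

    // 3. Release those objects.
    world.objects.erase(
        std::remove_if(world.objects.begin(), world.objects.end(),
                       [&](const std::unique_ptr<Object>& o) { return isChanged(o.get()); }),
        world.objects.end());

    // 4. Reload the DLL (simulated: the callback installs new factories).
    reloadDll(world);

    // 5. Recreate the objects and restore the properties that still exist.
    for (const Object& saved : backups) {
        std::unique_ptr<Object> fresh = world.factories.at(saved.type)();
        fresh->id = saved.id;
        fresh->type = saved.type;
        fresh->target = saved.target;
        fresh->targetId = saved.targetId;
        for (const auto& kv : saved.properties) {
            auto it = fresh->properties.find(kv.first);
            if (it != fresh->properties.end()) it->second = kv.second;
        }
        world.objects.push_back(std::move(fresh));
    }

    // 6. Turn the stored ids back into pointers to the new objects.
    std::map<std::uint32_t, Object*> byId;
    for (auto& o : world.objects) byId[o->id] = o.get();
    for (auto& o : world.objects) {
        if (o->target != nullptr || o->targetId == 0) continue;
        auto found = byId.find(o->targetId);
        assert(found != byId.end() && "referenced object was not recreated");
        o->target = found->second;
        o->targetId = 0;
    }
}
```

Restoring by property name is what lets the new code add or remove members: new properties keep the defaults set by the fresh constructor, and values for removed ones are simply dropped.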
The serialisation can be done to RAM on PC, letting the OS take care of the paging. On consoles you can't really do that (with the exception of some debug kits) so you may have to implement a slower path. Also, from what I can gather of my limited exposure to SPUs, offsets beat pointers and reloadable SPU code is achievable without the above.
If you have a sensible include strategy that keeps build times low then you're looking at less than a second turnaround on any compile change. This gives you an incredible amount of flexibility to change most aspects of your DLL code at runtime. You need to be careful with the DLL. Make sure you load a dynamically named copy of it, rather than the output from the compiler, as the compiler can't write to the DLL while it's in use. Debugging is achieved simply by attaching/detaching to your process whenever necessary.
It can be demoralising, however, if you have implemented this with limited scope for one module only. As soon as you switch to working with other modules, you start to realise what you're missing.