So I found a little memory hog in the scripting engine that affects pretty much every single property instance. Internally, arguments to script functions are passed around using a structure called ScriptVariant. A pointer is allocated, and then one of the members is populated with data depending on the value type in question.
As an example, this is openborvariant returning whether or not the title screen is active:
Code:
case _sv_in_titlescreen:
    ScriptVariant_ChangeType(var, VT_INTEGER);
    var->lVal = titleScreen;
    break;
The issue is with this line:
Code:
var->lVal = titleScreen;
It's a total waste, because var->lVal is a LONG, whereas titleScreen is a plain int: 4-8 bytes of storage for a value that only needs 2-4, up to twice the space required. This is how the entire script engine operates, so that paltry little 4 byte waste adds up real quick. Not enough to mean anything on a PC, but consoles are another story. Even on more powerful ones like the Wii, it can make a big difference.
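To put rough numbers on it, here's a quick standalone check. The typedef is a stand-in; the engine's actual LONG depends on the target, which is exactly where the 4-8 byte range above comes from:
Code:
/* Hypothetical standalone check - LONG here is a stand-in typedef, assumed
   to map to the platform's native long as in my tree. */
#include <stdio.h>

typedef long LONG;

int main(void)
{
    printf("sizeof(LONG) = %zu\n", sizeof(LONG)); /* 4 or 8 depending on the target */
    printf("sizeof(int)  = %zu\n", sizeof(int));  /* 4 on everything we build for   */
    return 0;
}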
But it's actually worse than that. This is the type enumerator and structure for ScriptVariant:
Code:
typedef enum VariantType
{
    VT_EMPTY   = 0, //not initialized
    VT_INTEGER = 1, //int/long
    VT_DECIMAL = 2, //double
    VT_PTR     = 5, //void*
    VT_STR     = 6, //char*
} VARTYPE;

typedef struct ScriptVariant
{
    union //value
    {
        LONG lVal;
        VOID *ptrVal;
        DOUBLE dblVal;
        int strVal;
    };
    VARTYPE vt; //variant type
} ScriptVariant;
IOW, regardless of which data type you use, the union is always sized for its largest member (dblVal, 8 bytes), plus another 2-4 for vt, plus whatever padding the compiler adds for alignment. That's 12-16 bytes per variant when, in the vast majority of cases, all we need is 4-8. And that's before you account for most modern CPUs being optimized for integer work, so the other types cost extra cycles on top of the extra bytes.
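For reference, this is what it works out to when I measure a stand-in copy of the struct. Exact numbers will vary with the compiler's enum width and alignment rules, so treat this as an approximation:
Code:
/* Standalone approximation of the engine struct, just to measure it.
   The typedefs are stand-ins for the engine's own. */
#include <stdio.h>

typedef long   LONG;
typedef double DOUBLE;

typedef enum VariantType
{
    VT_EMPTY = 0, VT_INTEGER = 1, VT_DECIMAL = 2, VT_PTR = 5, VT_STR = 6
} VARTYPE;

typedef struct ScriptVariant
{
    union //value
    {
        LONG lVal;
        void *ptrVal;
        DOUBLE dblVal;
        int strVal;
    };
    VARTYPE vt;
} ScriptVariant;

int main(void)
{
    /* Comes out to roughly 12-16 depending on the target: 8 for the union
       (sized by dblVal), 4 for vt, plus padding to the union's alignment. */
    printf("sizeof(ScriptVariant) = %zu\n", sizeof(ScriptVariant));
    return 0;
}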
The quandary is this - how exactly do we optimize? The following obvious solutions won't work:
- Adding an integer member to the union, like intVal or some such, won't affect allocation, because the union is sized by its largest member (dblVal).
- We can't remove or downgrade the larger members, because while they aren't used nearly as often, they are occasionally necessary.
- Changing the union to a sub-structure would be worse, because then we're allocating enough memory for ALL members, not just the largest (see the quick sizeof comparison below).
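To make the union-vs-struct point concrete, here are the same members once as a union and once as a struct, using plain stand-in types:
Code:
/* Plain stand-in types, just to show the sizing behaviour. */
#include <stdio.h>

union  AsUnion  { long lVal; void *ptrVal; double dblVal; int strVal; };
struct AsStruct { long lVal; void *ptrVal; double dblVal; int strVal; };

int main(void)
{
    /* The union only ever occupies its largest member (the double), so an
       extra int member costs nothing; the struct pays for every member at
       once, roughly 20-32 bytes depending on the target. */
    printf("union:  %zu\n", sizeof(union AsUnion));
    printf("struct: %zu\n", sizeof(struct AsStruct));
    return 0;
}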
Perhaps we could break the structure down and allocate individual vars as needed, but that strikes me as one big mess waiting to happen, and might not save anything.
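If anyone wants something concrete to poke holes in, this is roughly what I have in mind - purely hypothetical names, untested, and the constant check-and-cast is exactly the mess I'm worried about:
Code:
/* Speculative sketch only. Each value type gets its own struct, all starting
   with vt, and script functions receive a pointer they cast to the concrete
   layout after checking vt. Small values then only pay for what they use. */
typedef struct IntVariant { VARTYPE vt; int     iVal;   } IntVariant;
typedef struct DblVariant { VARTYPE vt; DOUBLE  dblVal; } DblVariant;
typedef struct PtrVariant { VARTYPE vt; void   *ptrVal; } PtrVariant;

/* The titlescreen case above would become something like:
       IntVariant *iv = (IntVariant *)var;  // only valid once vt says VT_INTEGER
       iv->iVal = titleScreen;
   Every access site needs that check-and-cast, and anything that changes a
   variant's type now has to reallocate - which might eat the savings. */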
Any thoughts?