Sunday, May 14, 2006

Published fields

In our little series about reverse engineering the undocumented fields of the Delphi VMT, we have come to the FieldTable field. This field points to structures that describe the published fields of a class. In Delphi, published fields must be object references and are mainly used by forms and datamodules to store component references in logically named and easy to use fields (the alternative would be to use the Components property array with specific index values and casting).

The Delphi RTL only contains a single exposed method that accesses the field table, TObject.FieldAddress. This method returns the address of a published field given the field name.

function TObject.FieldAddress(const Name: ShortString): Pointer;
asm
        // ...
        MOV     ESI,[EAX].vmtFieldTable
        // ...
end;
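
As a minimal sketch of calling FieldAddress from your own code, consider the console program below. The TDemo class and its Helper field are made up for this illustration; the only RTL call that matters is TObject.FieldAddress.

```pascal
program FieldAddressDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical class for illustration. TPersistent is compiled with $M+,
  // so its descendants may declare published fields.
  TDemo = class(TPersistent)
  published
    Helper: TStringList; // published fields must be of a class type
  end;

var
  Demo: TDemo;
begin
  Demo := TDemo.Create;
  try
    // A known published field name yields the address of that field
    // inside this particular instance.
    Writeln('Helper found:  ', Demo.FieldAddress('Helper') = @Demo.Helper);
    // A name that is not a published field yields nil.
    Writeln('Missing found: ', Demo.FieldAddress('Missing') <> nil);
  finally
    Demo.Free;
  end;
end.
```

The first line should report a match and the second should report that nothing was found.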

This method is used by the component system, in the implementation of the private TComponent.SetReference method, to search the component’s owner for a field that matches the name of the component. If the owned component finds a correctly named published field in its owner, it assigns either the component’s reference or nil to that field, depending on whether the component has just been added to or removed from the owner.

procedure TComponent.SetReference(Enable: Boolean);
var
  Field: ^TComponent;
begin
  if FOwner <> nil then
  begin
    Field := FOwner.FieldAddress(FName);
    if Field <> nil then
      if Enable then Field^ := Self else Field^ := nil;
  end;
end;

procedure TComponent.InsertComponent(AComponent: TComponent);
begin
  // …
  AComponent.SetReference(True);
  // …
end;

procedure TComponent.RemoveComponent(AComponent: TComponent);
begin
  // …
  AComponent.SetReference(False);
  // …
end;

procedure TComponent.SetName(const NewName: TComponentName);
begin
  // …
  SetReference(False);
  ChangeName(NewName);
  SetReference(True);
  // …
end;

This is how the component fields in your form class automagically get their values. And if you free a component at runtime, the corresponding field is automagically cleared to nil. Pretty neat, huh?! :-).
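
You can watch this happen outside the form designer, too. The sketch below uses a made-up owner class, TMyOwner, with a single published field called Worker, and relies only on the TComponent behaviour described above.

```pascal
program AutoWireDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical owner class with one published component field.
  TMyOwner = class(TComponent)
  published
    Worker: TComponent;
  end;

var
  Root: TMyOwner;
  Comp: TComponent;
begin
  Root := TMyOwner.Create(nil);
  try
    // The new component is owned by Root but still unnamed,
    // so no published field matches it yet.
    Comp := TComponent.Create(Root);
    Writeln('Before naming: ', Assigned(Root.Worker));

    // SetName ends with SetReference(True), which looks up a
    // published field called 'Worker' in the owner and fills it in.
    Comp.Name := 'Worker';
    Writeln('After naming:  ', Root.Worker = Comp);

    // Freeing the component removes it from its owner, which
    // calls SetReference(False) and clears the field again.
    Comp.Free;
    Writeln('After freeing: ', Assigned(Root.Worker));
  finally
    Root.Free;
  end;
end.
```

If the mechanism works as described, the field should be unassigned at first, point to the component once it is named, and be nil again after the component is freed.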

We’ll look at the exact layout of the field table shortly, but to make FieldAddress work, the field table must contain the name of each field and the offset within the object instance at which the field resides. Note that the field table is part of the VMT and thus part of the class, not of a specific object instance. This is why the field table cannot contain the actual address of the field, only the offset. The offset must be added to the object instance address to get the true address of the field at runtime.
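
To make the offset-versus-address distinction concrete, here is a small sketch. FieldOffset is a hypothetical helper written for this example, not an RTL routine; it simply subtracts the instance address from the address that FieldAddress returns.

```pascal
program FieldOffsetDemo;
{$APPTYPE CONSOLE}

uses
  Classes;

type
  // Hypothetical class with two published fields.
  TDemo = class(TPersistent)
  published
    First: TStringList;
    Second: TStringList;
  end;

// Hypothetical helper: distance in bytes from the start of the
// instance to the named published field.
function FieldOffset(Instance: TObject; const FieldName: string): Integer;
begin
  Result := PAnsiChar(Instance.FieldAddress(FieldName)) - PAnsiChar(Instance);
end;

var
  A, B: TDemo;
begin
  B := nil;
  A := TDemo.Create;
  try
    B := TDemo.Create;
    // The offset comes from the class, so it is the same for every instance...
    Writeln('Same offset:         ',
      FieldOffset(A, 'Second') = FieldOffset(B, 'Second'));
    // ...while the final address depends on the instance it is added to.
    Writeln('Different addresses: ',
      A.FieldAddress('Second') <> B.FieldAddress('Second'));
  finally
    B.Free;
    A.Free;
  end;
end.
```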

There is another, private, routine that also accesses the field table. The Classes unit contains a BASM routine in the implementation section called GetFieldClassTable. This routine accesses a different part of the field table – one that contains class references.

function GetFieldClassTable(AClass: TClass): PFieldClassTable;
asm
        MOV     EAX,[EAX].vmtFieldTable
        // …
end;

This routine is part of some of the innermost private implementation details of the TReader streaming logic. The nested calls that end up in GetFieldClassTable start with the public TReader.ReadComponent method and look like this:

TReader.ReadComponent
     CreateComponent (nested routine)
          FindComponentClass
               GetFieldClass
                    GetFieldClassTable
     FindExistingComponent (nested routine)
          FindComponentClass
               GetFieldClass
                    GetFieldClassTable

The FindExistingComponent logic handles visually inherited forms, datamodules and frames. CreateComponent creates a new component read from a DFM stream, given a string with the component class name. FindComponentClass operates like a local mapping of class name strings to runtime TClass references. Instead of using the heavy-duty global routine GetClass that is used by Delphi at design-time, FindComponentClass first limits its search to the list of unique class types of declared published fields. You see, in addition to the name-offset association, the field table also contains a list of all the unique class types used for published fields in the owner class.
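
The following sketch illustrates the effect of that class list. TChildComp and TRootComp are hypothetical classes, and the DFM text is hand-written for the example. TChildComp is never registered; its only link to the reader is the published Child1 field in the root class, which should be enough for the stream to load.

```pascal
program FieldClassTableDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical component class that is never passed to RegisterClass.
  TChildComp = class(TComponent);

  // Hypothetical root class. The published field puts TChildComp
  // into the field class table of TRootComp.
  TRootComp = class(TComponent)
  published
    Child1: TChildComp;
  end;

const
  // Hand-written DFM text describing a root with one child.
  DfmText =
    'object Root1: TRootComp'#13#10 +
    '  object Child1: TChildComp'#13#10 +
    '  end'#13#10 +
    'end'#13#10;

var
  TextStream: TStringStream;
  BinStream: TMemoryStream;
  Root: TRootComp;
begin
  TextStream := TStringStream.Create(DfmText);
  BinStream := TMemoryStream.Create;
  Root := TRootComp.Create(nil);
  try
    // Convert the text form to the binary form that TReader expects.
    ObjectTextToBinary(TextStream, BinStream);
    BinStream.Position := 0;
    // Stream into the existing root; the child class name is resolved
    // through the root's published field types.
    BinStream.ReadComponent(Root);
    Writeln('Child1 assigned:      ', Assigned(Root.Child1));
    Writeln('Child1 is TChildComp: ', Root.Child1 is TChildComp);
  finally
    Root.Free;
    BinStream.Free;
    TextStream.Free;
  end;
end.
```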

When you design a form, there is a little-known trick: you can manually delete the field declarations of components you never reference from code. Alternatively, you can simply clear the Name property of the component. This makes the component unnamed, and the IDE removes the field declaration for you. Either trick slightly reduces the size of the DFM and slightly improves form load performance at runtime.

You have to be careful when performing this trick, however. You must keep at least one published field of each component type on the form, otherwise it will not stream in from the DFM properly – giving you an error message like this:
---------------------------
Debugger Exception Notification
---------------------------
Project richedit.exe raised exception class EClassNotFound with message 'Class TLabel not found'. Process stopped. Use Step or Run to continue.
---------------------------
OK   Help  
---------------------------

Or outside the debugger:
---------------------------
Rich Edit Control Demo
---------------------------
Class TLabel not found.
---------------------------
OK  
---------------------------

You should now see the reason why you get this error. The TReader class uses the list of published field types to convert from a class name string to a proper TComponent class reference. If the class reference is not present in the form class’ field table RTTI, TReader falls back to the (potentially) slower GetClass mechanism, which only knows about classes that have been registered with RegisterClass. If that lookup fails too, TReader is unable to create the component, and it resorts to raising the EClassNotFound exception you saw above. This means that as an alternative to keeping one published field of each component class, you can call RegisterClass on the component class in an initialization section.

//…
initialization
  RegisterClass(TLabel);
end.

Then you don’t need any TLabel fields in the form class.
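
Here is a small sketch of what RegisterClass changes for the GetClass lookup. It uses a made-up TMyWidget component class instead of TLabel so that it compiles as a console program without the VCL units.

```pascal
program RegisterClassDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical component class standing in for TLabel.
  TMyWidget = class(TComponent);

begin
  // Nothing has registered the class yet, so the name lookup fails.
  Writeln('Before RegisterClass: ', GetClass('TMyWidget') <> nil);

  RegisterClass(TMyWidget);

  // Now the global class list can map the name to the class reference.
  Writeln('After RegisterClass:  ', GetClass('TMyWidget') = TMyWidget);
end.
```

GetClass returns nil for an unknown name, so the first line should report that the class was not found and the second that it was.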

Ok, that should give you some background on why Delphi supports published fields, what kind of RTTI is stored about them, and how the VCL exploits them to perform its design-time and DFM streaming magic. The assembly code in TObject.FieldAddress and GetFieldClassTable, along with some type declarations inside the Classes unit implementation section, gives us useful clues about how the field table RTTI structures are laid out in memory.

In the next blog post we’ll dive deeper down and write some Pascal data structures and utility methods to find and iterate the published fields and their types. Stay tuned!

Copyright © 2004-2007 by Hallvard Vassbotn