: ------------------------------------------------------
: Project: Component Based Object Oriented Programming
: (CBOOP)
:
: SubProject: Component Executable (CXE) File Format
:
: Author: Sam Caldwell
: (c) March 2005.
: ------------------------------------------------------
:
***Problem Statement:
The new programming paradigm requires a format for the component image, both in memory and on disk.
***Solution:
Discussion is limited to the component image's disk state; its memory state will be taken up later. We have two priorities in defining this format: (1) maintaining a small disk footprint and (2) establishing a high degree of scalability. CXE files must be able to contain multiple interdependent components efficiently.
We illustrate the CXE file below, and note that it consists of a 160 byte header and a variable-length body of up to 4GB.
4.1: CXE File format
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name      Description
-----------------------------------------------------------------------
000 - 007   08    FileVersion     CXE file format version code.
008 - 015   08    FileAttributes  (Reserved)
016 - 031   16    Count           Number of components in CXE file.
032 - 063   32    TimeStamp       File Creation Timestamp
064 - 095   32    CDTAddress      Component Descriptor Table Address
096 - 127   32    Dptr            First Data Page Address
128 - 159   32    Cptr            First Code Page Address
160 - nnn   xx    CodePage        Block of executable code.
aaa - bbb   xx    DataPage        Block of persistent/volatile data.
ccc - ddd   xx    CDT             Component Descriptor Table.
-----------------------------------------------------------------------
The 160-byte header begins with the FileVersion field, which specifies the exact format that follows. Reading the file accurately depends on the value of this field. It is followed by the FileAttributes field, which at present is reserved for use under the AMI-OS project.
The header also specifies the number of components contained in the file. This saves time and adds redundancy for integrity verification. Remember: a single CXE can contain multiple independent or interdependent components. (This allows enormous next-gen op/sys capabilities, as described in the AMI-OS design notes.)
A creation timestamp is also included as a security feature and a part of the AMI-OS design to be gradually introduced over the coming years.
Finally, the header references the major sections of the file body. These include the addresses of the first code page (Cptr), the first data page (Dptr) and the Component Descriptor Table (CDTAddress). Each address (like all addresses in the CXE file) is measured relative to byte 160 of the CXE file.
Code pages, data pages and CDTs are all connected into separate linked lists. The header reference to the first node in each of these lists maintains the integrity of the file. These lists are doubly linked to minimize the chance of fragmentation. While the CXE file maintains these pages and tables as linked lists to track allocated memory, they are also linked into parallel lists that build the component assemblies defined by the Component Descriptor Table (CDT).
The CDT references data and code pages. These pages are blocks of disk memory linked together into scalable lists. Each data and code page is defined as either persistent or volatile. In the disk state, a persistent page is one that cannot be altered during run-time (i.e. read-only), whereas a volatile page is one that may be altered during run-time (i.e. read-write).
The CDT acts as a blueprint for the component. It references the code and data pages needed by that blueprint to assemble the required component. A single CDT defines one component, and it may use any number of code and data pages to do so. The same code and data pages may be cross-linked by several CDTs in separate linked lists defining separate components, where the contents of the code or data page are identical. Yet this is only to compress the size of the CXE file and does not imply that the actual data and code are shared, or that the boundaries of a component are in any way compromised.
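As a rough illustration of the header layout, the sketch below decodes the seven fixed fields with Python's struct module. It rests on assumptions that the text does not settle: the position column in table 4.1 is read as bit offsets (the Bits widths sum to 8+8+16+32+32+32+32 = 160 bits, i.e. 20 bytes), integers are taken as big-endian, and relative addresses are counted from the first byte after the header. The names CxeHeader, parse_header and absolute_offset are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumption: big-endian integers; the source does not name a byte order.
HEADER_FORMAT = ">BBHIIII"                    # 8 + 8 + 16 + 4 * 32 bits
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 20 bytes under that reading


class CxeHeader(NamedTuple):
    file_version: int
    file_attributes: int
    count: int
    timestamp: int
    cdt_address: int
    dptr: int
    cptr: int


def parse_header(blob: bytes) -> CxeHeader:
    """Decode the fixed-size header at the start of a CXE image."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("CXE image is shorter than its header")
    return CxeHeader(*struct.unpack_from(HEADER_FORMAT, blob, 0))


def absolute_offset(relative_address: int) -> int:
    """Convert a header-relative address into an offset from byte 0."""
    return HEADER_SIZE + relative_address


# Made-up sample: version 1, two components, code pages starting right
# after the header (relative address 0).
sample = struct.pack(HEADER_FORMAT, 1, 0, 2, 1112345678, 512, 256, 0)
header = parse_header(sample)
print(header.count, absolute_offset(header.cptr))
```

If the positions are in fact byte offsets, only HEADER_FORMAT and HEADER_SIZE would need to change; the relative-address conversion stays the same.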
Below, we define the Component Descriptor Table (CDT). In doing so we reference byte positions from the start of the CDT block, rather than from the 160th byte of the file. This is done since the actual position would depend on the placement of the CDT in the file, a matter known at this time to be variable.
4.2: Component Descriptor Table (CDT).
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name   Description
-----------------------------------------------------------------------
000 - 007   008   CDTVersion   CDT Format
008 - 263   256   UID          Universal Identifier (MD5 hash)
264 - 271   008   CVSVersion   User-Defined Component Version
272 - 275   004   ThreadModel  Processing Model
276 - 279   004   Attributes   Loading/Execution Attributes
280 - 311   032   CTime        Creation Timestamp
312 - 343   032   UTime        Update Timestamp
344 - 359   016   Language     Instruction Set Code
360 - 391   032   Depptr       Dependency Table Address
392 - 423   032   BCTptr       Base Component Table Address
424 - 455   032   Instanceptr  Instance vTable Address
456 - 487   032   Nextptr      Next CDT Address
488 - xxx   xxx   LogicalName  Component Logical Name (String)
-----------------------------------------------------------------------
The CDT is a table of tables. We start with a CDTVersion field, which denotes the version of the Component Descriptor Table we are about to read. This will allow newer CDT formats to be stored in older CXE formats, and vice versa. CDTVersion is followed by a UID, or universal identifier. The UID is an MD5 checksum and is core to CBOOP security. Each component and component instance (object) is uniquely identified by its UID.
The UID is followed by the CVSVersion field, which holds the programmer's version. A UID cannot accurately serve this function due to its higher sensitivity to a component's data. Rather, CVSVersion allows the programmer to track the component by the source code from which it was produced.
ThreadModel and Language are fields used by OS loaders to determine how a component will run. Will the component run within the same process space as the owner component or separately? Is the binary language native to the CPU, or is an interpreter necessary? Likewise, component Attributes are a matter for OS loaders and are left for discussion elsewhere.
CTime and UTime maintain the creation and update timestamps for a component. These timestamps affect the UID and increase security around the system.
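To make the CDT layout concrete, here is a minimal sketch that decodes the fixed part of a descriptor. It assumes the fields are packed back to back using the widths in the Bits column (which puts LogicalName at byte 61), big-endian integers, ThreadModel in the high nibble of the byte it shares with Attributes, and a NUL-terminated ASCII LogicalName. None of these points are stated in the table, and parse_cdt is a hypothetical name.

```python
import struct

# CDTVersion, UID, CVSVersion, ThreadModel+Attributes, CTime, UTime,
# Language, Depptr, BCTptr, Instanceptr, Nextptr
CDT_FIXED = ">B32sBBIIHIIII"
CDT_FIXED_SIZE = struct.calcsize(CDT_FIXED)  # 61 bytes = 488 bits


def parse_cdt(blob: bytes, offset: int = 0) -> dict:
    """Decode one Component Descriptor Table starting at `offset`."""
    (version, uid, cvs_version, packed, ctime, utime, language,
     depptr, bctptr, instanceptr, nextptr) = struct.unpack_from(
        CDT_FIXED, blob, offset)
    name_start = offset + CDT_FIXED_SIZE
    name_end = blob.index(b"\x00", name_start)  # assumed NUL terminator
    return {
        "cdt_version": version,
        "uid": uid,
        "cvs_version": cvs_version,
        "thread_model": packed >> 4,    # assumed high nibble
        "attributes": packed & 0x0F,    # assumed low nibble
        "ctime": ctime,
        "utime": utime,
        "language": language,
        "depptr": depptr,
        "bctptr": bctptr,
        "instanceptr": instanceptr,
        "nextptr": nextptr,
        "logical_name": blob[name_start:name_end].decode("ascii"),
    }


# Made-up descriptor for a component named "foo".
sample_cdt = struct.pack(CDT_FIXED, 1, b"0" * 32, 3, (2 << 4) | 5,
                         1112000000, 1112345678, 1,
                         0, 64, 128, 0xFFFFFFFF) + b"foo\x00"
cdt = parse_cdt(sample_cdt)
print(cdt["logical_name"], cdt["thread_model"], cdt["attributes"])
```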
Thus far, the CDT has given general parameter information. What follows this is a set of tables which define the actual component. The first is the dependency table, which defines the components upon which the defined component depends. That is, if component "foo" uses the services of component "moo," then moo appears in the dependency table for foo.
4.3: CDT Dependencies Table
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   032   TableSize   Size (in Bytes).
032 - 287   256   UID         Universal Identifier
288 - xxx   xxx   FileName    CXE Filename
-----------------------------------------------------------------------
Dependency Tables define the filename and UID for components used by a given component. However, CBOOP does allow for component inheritance. Inherited components are defined in the Base Component Tables (BCT), referenced by the BCTptr field in the CDT. This table is similar in format to the CDT Dependency Table, as illustrated below:
4.4: Base Component Table (BCT)
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   032   TableSize   Size of the BCT (in bytes).
032 - 287   256   UID         Universal Identifier
288 - 289   002   Scope       Inheritance Scope (enumerated)
290 - xxx   xxx   Filename    Component CXE filename (string)
-----------------------------------------------------------------------
The BCT contains basic information about the component's storage location (filename), identity (UID) and inheritance scope (public, private, friend or protected). This table is of variable length, as determined by TableSize. It may range up to 4GB in size.
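Because the two-bit Scope field leaves the filename starting off a byte boundary, a BCT reader has to work at the bit level. The sketch below shows one way to do that. It assumes the positions in table 4.4 are bit offsets counted from the most significant bit of the first byte, that a table holds a single record, that TableSize covers the whole table, and that any bits left over after the filename are padding. The numeric scope codes are hypothetical, since the text lists the names but not their values.

```python
from enum import IntEnum


class Scope(IntEnum):
    # Hypothetical numbering of the four scopes named in the text.
    PUBLIC = 0
    PRIVATE = 1
    FRIEND = 2
    PROTECTED = 3


def read_bits(blob: bytes, bit_pos: int, width: int) -> int:
    """Return `width` bits starting at `bit_pos` (MSB-first, assumed)."""
    total_bits = len(blob) * 8
    if bit_pos + width > total_bits:
        raise ValueError("field extends past the end of the table")
    shift = total_bits - bit_pos - width
    return (int.from_bytes(blob, "big") >> shift) & ((1 << width) - 1)


def parse_bct(blob: bytes) -> dict:
    table_size = read_bits(blob, 0, 32)               # size in bytes
    uid = read_bits(blob, 32, 256).to_bytes(32, "big")
    scope = Scope(read_bits(blob, 288, 2))
    name_len = (table_size * 8 - 290) // 8            # drop pad bits
    raw = read_bits(blob, 290, name_len * 8).to_bytes(name_len, "big")
    return {"table_size": table_size, "uid": uid, "scope": scope,
            "filename": raw.decode("ascii")}


def build_bct(uid: bytes, scope: Scope, filename: bytes) -> bytes:
    """Pack a sample BCT using the same assumed layout."""
    used_bits = 290 + len(filename) * 8
    total_bytes = (used_bits + 7) // 8
    value = (total_bytes << 256) | int.from_bytes(uid, "big")
    value = (value << 2) | scope
    value = (value << (len(filename) * 8)) | int.from_bytes(filename, "big")
    return (value << (total_bytes * 8 - used_bits)).to_bytes(total_bytes, "big")


bct = parse_bct(build_bct(b"u" * 32, Scope.PROTECTED, b"base.cxe"))
print(bct["scope"].name, bct["filename"])
```

The same read_bits helper would work for any of the tables in this document, since it takes a bit position and a width straight from the table columns.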
The BCT and the Dependency Table allow the CDT to describe the foundation upon which a component will be assembled. However, the "meat and potatoes" is the Instance Vector Table (IVT). The IVT serves two functions: (1) it defines the prototype component object and (2) it stores the persistent definition of all objects of a given component type.
4.5: Instance Vector Table.
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 255   256   UID         Universal Identifier
256 - 287   032   Nextptr     Next IVT Record Pointer
288 - 319   032   Dataptr     Data Area Pointer
320 - 351   032   Iptr        Interface Vtable Pointer
352 - 383   032   Mptr        Method Vtable Pointer
-----------------------------------------------------------------------
IVTs add another level of detail for the component. Here the component data area is referenced, the component-object is identified by a unique hash, and the interface and method vector tables are defined. Further, the IVT is revealed to be part of a linked list of records, so the table is scalable. As new instances are created, the file can grow. Likewise, instances can be deleted from the table, and the table can contract without much effort.
UIDs are digital signatures identifying each component-object. The component-prototype from which all other instances are created has the same UID as that stored elsewhere in the CDT. This is always the first record in the IVT, and though it is possible to change this prototype image under CBOOP, doing so changes the MD5 checksum for the component, and thus the component identity itself. This is core to the security of the CBOOP architecture.
Component objects are copy-created from the component-prototype. That is, the prototype IVT is copied, as are the related tables and data area. The data area, with its default data image, is copied first, followed by the method vtable and finally the interface vtable. The interface vtable contains the component constructor, which is executed immediately after it has been copied to the new IVT.
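The linked-list nature of the IVT can be sketched as a simple traversal. This example assumes 48-byte records laid out as in table 4.5 with the positions read as bit offsets, big-endian integers, addresses that index into the file body, and a hypothetical end-of-list marker of 0xFFFFFFFF (the text does not say how a list is terminated, and address 0 appears to be a valid location).

```python
import struct

IVT_RECORD = ">32sIIII"                  # UID, Nextptr, Dataptr, Iptr, Mptr
IVT_SIZE = struct.calcsize(IVT_RECORD)   # 48 bytes
END_OF_LIST = 0xFFFFFFFF                 # hypothetical terminator


def walk_ivt(body: bytes, first: int):
    """Yield each IVT record, starting with the prototype at `first`."""
    seen = set()
    addr = first
    while addr != END_OF_LIST:
        if addr in seen:
            raise ValueError("IVT list loops back on itself")
        seen.add(addr)
        uid, nextptr, dataptr, iptr, mptr = struct.unpack_from(
            IVT_RECORD, body, addr)
        yield {"address": addr, "uid": uid, "dataptr": dataptr,
               "iptr": iptr, "mptr": mptr}
        addr = nextptr


# Made-up body holding a prototype record followed by one instance.
ivt_body = (struct.pack(IVT_RECORD, b"P" * 32, IVT_SIZE, 100, 200, 300)
            + struct.pack(IVT_RECORD, b"I" * 32, END_OF_LIST, 400, 500, 600))
records = list(walk_ivt(ivt_body, 0))
print(len(records), records[0]["uid"][:1])
```

Deleting an instance then amounts to pointing the previous record's Nextptr at the following record, which is why the table can contract cheaply.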
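The effect of changing the prototype image on the UID can be shown with Python's hashlib. The exact input to the checksum is not specified here, so this sketch makes its own assumption: the hash covers the prototype image followed by the two timestamps. It also assumes the MD5 digest is stored as 32 hexadecimal characters, which happens to fill the 256-bit UID field exactly (a raw MD5 digest is only 128 bits). The name component_uid is hypothetical.

```python
import hashlib
import struct


def component_uid(image: bytes, ctime: int, utime: int) -> bytes:
    """Hypothetical UID: MD5 over the image plus both timestamps."""
    digest = hashlib.md5()
    digest.update(image)
    digest.update(struct.pack(">II", ctime, utime))  # assumed encoding
    return digest.hexdigest().encode("ascii")        # 32 bytes = 256 bits


original = component_uid(b"prototype image", 1112000000, 1112345678)
altered = component_uid(b"prototype imagE", 1112000000, 1112345678)
touched = component_uid(b"prototype image", 1112000000, 1112345679)
print(len(original), original != altered, original != touched)
```

Under this scheme any change to the prototype data or to either timestamp yields a different UID, which matches the behaviour described above.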
4.6: IVT Data Area Table.
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   032   TableSize   Size of Table (in Bytes).
032 - 063   032   PageNumber  CXE Page Number (Code/Data).
064 - 095   032   Offset      Address relative to top of data page.
096 - 127   032   Size        Size of the data area.
-----------------------------------------------------------------------
The IVT-DAT format is straightforward. A CXE page number identifies each data and code page in the CXE, and this number is used to address the proper data/code page in the file. Within that page, the data area is located by its offset and its size.
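A minimal sketch of that lookup follows. It assumes a 16-byte big-endian record (the positions in table 4.6 read as bit offsets), and it takes the pages as a hypothetical mapping from page number to page contents, with Offset counted from the start of those contents. How a loader builds that mapping is outside this example.

```python
import struct
from typing import Mapping

DAT_RECORD = ">IIII"   # TableSize, PageNumber, Offset, Size


def resolve_data_area(dat: bytes, pages: Mapping[int, bytes]) -> bytes:
    """Return the bytes of the data area that a DAT record points at."""
    _table_size, page_number, offset, size = struct.unpack_from(
        DAT_RECORD, dat, 0)
    if page_number not in pages:
        raise KeyError(f"no page numbered {page_number} in this CXE")
    page = pages[page_number]
    if offset + size > len(page):
        raise ValueError("data area overruns its page")
    return page[offset:offset + size]


# Made-up page table and a record selecting 5 bytes at offset 6 of page 3.
data_pages = {3: b"xxxxxxhello-tail"}
dat_record = struct.pack(DAT_RECORD, 16, 3, 6, 5)
area = resolve_data_area(dat_record, data_pages)
print(area)
```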
4.7: IVT Interface vTable.
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name   Description
-----------------------------------------------------------------------
000 - 031   032   Nextptr      Reference to next vTable record.
032 - 063   032   Chainptr     Reference to the code chain.
064 - 319   256   UID          Universal Identifier
320 - 335   016   Plen         Argument Length
336 - xxx   xxx   Arglist      Parameters
xxx - xxx   xxx   LogicalName  Logical Interface name
-----------------------------------------------------------------------
The IVT Interface vTable is a linked list of descriptors which cover each interface exposed by a component. Each interface has its own UID, strongly typed parameter list and logical name. The interface itself is a chain of executable binary code represented by a linked list of code pages referenced by the Chainptr field.
4.8: IVT Method vTable.
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   032   TableSize   TableSize (in bytes)
032 - 063   032   Chainptr    Reference to the code chain.
-----------------------------------------------------------------------
Because methods are internal structures, they may be more rigidly defined by the compiler and require less information at run-time. Thus the method vtable only references the code chain where a component lies.
4.9: IVT Code Chain Table
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   032   FilePage    Code Page Reference
032 - 063   032   Offset      Offset within code page
064 - 095   032   Size        Size of code chain node
096 - 127   032   Nextptr     Reference to next code chain node.
-----------------------------------------------------------------------
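One way to read table 4.9 is that a routine's code is the concatenation of the slices named by each node in its chain. The sketch below follows a chain and joins those slices. It assumes 16-byte big-endian nodes (positions read as bit offsets), a hypothetical terminator of 0xFFFFFFFF in Nextptr, node addresses that index into the file body, and a hypothetical mapping from page number to code page contents.

```python
import struct
from typing import Mapping

CHAIN_NODE = ">IIII"         # FilePage, Offset, Size, Nextptr
END_OF_CHAIN = 0xFFFFFFFF    # hypothetical terminator


def assemble_code(body: bytes, first_node: int,
                  pages: Mapping[int, bytes]) -> bytes:
    """Follow a code chain and join the code slices it references."""
    chunks = []
    visited = set()
    addr = first_node
    while addr != END_OF_CHAIN:
        if addr in visited:
            raise ValueError("code chain loops back on itself")
        visited.add(addr)
        file_page, offset, size, nextptr = struct.unpack_from(
            CHAIN_NODE, body, addr)
        page = pages[file_page]
        if offset + size > len(page):
            raise ValueError("chain node overruns its code page")
        chunks.append(page[offset:offset + size])
        addr = nextptr
    return b"".join(chunks)


# Made-up chain: 4 bytes from page 1, then 4 bytes from page 2.
code_pages = {1: b"AAAABBBB", 2: b"CCCC"}
chain_body = (struct.pack(CHAIN_NODE, 1, 4, 4, 16)
              + struct.pack(CHAIN_NODE, 2, 0, 4, END_OF_CHAIN))
code = assemble_code(chain_body, 0, code_pages)
print(code)
```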
5.0: Code Pages
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   32    FilePage    Page ID (Internal Address)
032 - 063   32    Attributes  Page Attributes (such as read-only)
064 - 095   32    Page Size
096 - 127   32    Nextptr     Address of next code page
128 - xxx   xx    Undefined code area
-----------------------------------------------------------------------
5.1: Data Pages
-----------------------------------------------------------------------
Byte Pos.   Bits  Field Name  Description
-----------------------------------------------------------------------
000 - 031   32    FilePage    Page ID (Internal Address)
032 - 063   32    Attributes  Page Attributes (such as read-only)
064 - 095   32    Page Size
096 - 127   32    Nextptr     Address of next data page
128 - xxx   xx    Undefined data area
-----------------------------------------------------------------------
Code and data pages use the same format. They differ only in the type of information they contain. Even their attributes should be compatible. Primarily the attributes reflect persistence/volatility. Otherwise they are reserved for later specification under project AMI-OS.
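Since both page types share one layout, a single routine can walk either list. The sketch below starts from a first-page address (such as Cptr or Dptr) and collects each page's contents. It assumes a 16-byte big-endian page header (positions read as bit offsets), that Page Size counts only the bytes after the header, a hypothetical terminator of 0xFFFFFFFF in Nextptr, and a hypothetical attribute bit 0 meaning persistent (read-only). The actual attribute bits are reserved for later specification.

```python
import struct

PAGE_HEADER = ">IIII"        # FilePage, Attributes, Page Size, Nextptr
PAGE_HEADER_SIZE = struct.calcsize(PAGE_HEADER)   # 16 bytes
END_OF_LIST = 0xFFFFFFFF     # hypothetical terminator
READ_ONLY = 0x1              # hypothetical attribute bit


def load_pages(body: bytes, first: int) -> dict:
    """Map each page ID to a (read_only, contents) pair."""
    pages = {}
    addr = first
    while addr != END_OF_LIST:
        page_id, attributes, size, nextptr = struct.unpack_from(
            PAGE_HEADER, body, addr)
        if page_id in pages:
            raise ValueError("duplicate page ID or cyclic page list")
        start = addr + PAGE_HEADER_SIZE
        pages[page_id] = (bool(attributes & READ_ONLY),
                          body[start:start + size])
        addr = nextptr
    return pages


# Made-up body: persistent page 7 (2 bytes), then volatile page 9 (3 bytes).
page_body = (struct.pack(PAGE_HEADER, 7, READ_ONLY, 2, 18) + b"\x90\x90"
             + struct.pack(PAGE_HEADER, 9, 0, 3, END_OF_LIST) + b"abc")
loaded = load_pages(page_body, 0)
print(sorted(loaded))
```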