XDB database format specification

 

Version: 0.7
Restrictions: Only the storage of tables and of unused-block information is supported
Objective: To store large volumes of data with the highest possible fill factor
Author: Tomáš Koutný

 

Conventions

    Unless stated otherwise, all declarations and data types are the same as those of Object Pascal as implemented in Borland Delphi 5.0. All names of tables, indexes, queries and other objects are case sensitive. An empty string is not a valid name; it indicates that no name has been entered.

 

Position and size of items

    Positions in the file and sizes of records are stored as 64-bit integers (Int64). Zero means that the referenced item is not present in the file. The most significant bit of the Int64 (i.e. negative values) is reserved for future use.
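
    The rule above can be sketched as a small reader helper. This is only an illustration: the specification does not state the byte order, so little-endian (the native order of Delphi 5 on x86) is assumed, and the name read_position is hypothetical.

```python
import struct

def read_position(buf, offset):
    """Decode the Int64 position or size stored at `offset`.

    Returns None when the stored value is zero ("item not present").
    """
    # "<q" = little-endian signed 64-bit integer (byte order is an assumption)
    (value,) = struct.unpack_from("<q", buf, offset)
    if value < 0:
        # The most significant bit is reserved for future use, so a reader
        # written against version 0.7 cannot interpret such a value.
        raise ValueError("reserved bit set in position/size field")
    return value or None
```

    Returning None for zero keeps "not present" distinct from a real file position at the call site.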

 

File header

TXDBHeader = packed record
               Signature: array of char = 'XDTKDataBase';
               LoVersion,
               HiVersion: word;
               Tables,
               Indexes,
               Queries,
               Spaces,
               DTypes,
               Specs,
               Info:Int64;
             End;

Signature             File type identification
LoVersion, HiVersion  Version of the database file structure, written as HiVersion.LoVersion
Tables                Position where the tables are defined
Indexes               Position where the indexes are defined
Queries               Position where the queries are defined
Spaces                Position where the spaces are defined
DTypes                Position where the non-standard data types are defined
Specs                 Position where the special data are stored
Info                  Position where additional information about the database is stored
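
    As an illustration, the header could be read as sketched below. Several points are assumptions rather than statements of this specification: the signature is taken to occupy exactly 12 single-byte characters with no length prefix or terminator, word is taken to be an unsigned 16-bit integer, and all values are taken to be little-endian. The names HEADER, XDBHeader and parse_header are hypothetical.

```python
import struct
from collections import namedtuple

SIGNATURE = b"XDTKDataBase"

# Assumed layout: 12 signature bytes, LoVersion and HiVersion as 16-bit words,
# then seven Int64 positions; "<" selects little-endian with no padding,
# which matches a packed record.
HEADER = struct.Struct("<12sHH7q")

XDBHeader = namedtuple(
    "XDBHeader",
    "lo_version hi_version tables indexes queries spaces dtypes specs info",
)

def parse_header(buf):
    """Parse the file header from the start of `buf` and check the signature."""
    signature, *fields = HEADER.unpack_from(buf, 0)
    if signature != SIGNATURE:
        raise ValueError("not an XDB file")
    return XDBHeader(*fields)
```

    For a version 0.7 file, hi_version would be 0 and lo_version 7; a position field equal to zero means the corresponding section is absent.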

 

Data types

Name      Size in bytes  Description                   ID
Int64     8              Integer number                0
Double    8              Real number                   1
Boolean   1              Yes/No; value <> 0 means Yes  2
Currency  8              Currency                      3
DateTime  8              Time in TDateTime format      4
WideText  8+?            Text in Unicode format        5
RawData   8+?            Byte sequence                 6

    When values of the variable-length data types (WideText and RawData) are stored, they begin with an integer (Int64) that holds their size in bytes; the data follow immediately after this integer.
    You can use RawData to define your own data types.
    ID identifies the data type in subsequent definitions; for example, it is used in the definition of a table's column. ID is always stored as Int64. Negative values represent definitions of non-standard data types.
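
    The length-prefixed layout of WideText and RawData can be sketched as follows. The specification only says "UNICODE"; UTF-16 little-endian without a byte order mark (the encoding of Delphi's WideString) is assumed here, as is little-endian byte order for the size. All function names are hypothetical.

```python
import struct

def encode_raw(data):
    """RawData: Int64 size in bytes, followed by the bytes themselves."""
    return struct.pack("<q", len(data)) + data

def decode_raw(buf, offset=0):
    """Return (bytes, offset just past the value)."""
    (size,) = struct.unpack_from("<q", buf, offset)
    start = offset + 8
    if size < 0 or start + size > len(buf):
        raise ValueError("invalid RawData size")
    return bytes(buf[start:start + size]), start + size

def encode_wide(text):
    """WideText: same layout, with the text encoded as UTF-16LE (assumed)."""
    return encode_raw(text.encode("utf-16-le"))

def decode_wide(buf, offset=0):
    """Return (str, offset just past the value)."""
    raw, end = decode_raw(buf, offset)
    return raw.decode("utf-16-le"), end
```

    Note that the stored size counts bytes, not characters, so a two-character WideText value carries a size of 4 under the assumed encoding.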

 

Storing tables

    The Tables item of the database file header refers to the position of an Int64 that holds the number of defined tables. It is followed by a packed array of that many Int64 values. Each of these values refers to the location of one table definition, which is a TTable record.
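
    A minimal sketch of reading this list, assuming little-endian Int64 values; the function name read_position_list is hypothetical.

```python
import struct

def read_position_list(buf, offset):
    """Read an Int64 count followed by that many Int64 file positions."""
    (count,) = struct.unpack_from("<q", buf, offset)
    if count < 0:
        raise ValueError("negative table count")
    # The positions follow the count directly, with no padding (packed array).
    return list(struct.unpack_from("<%dq" % count, buf, offset + 8))
```

    Called with the value of the header's Tables item as offset, it would return one file position per TTable record.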

Table definition - TTable record

TTable = packed record
           ItemsCount: Int64;
           FirstItem:Int64;
           Name: WideText;
           PrimaryKey,
           SortedBy:WideText;
           ColCount: Int64;
           Cols: array[0..ColCount-1] of TColumn;
         End;

ItemsCount  Number of records in the table
FirstItem   Position where the list of record positions is stored
Name        Name of the table; must be unique
PrimaryKey  Name of the index that determines the primary key
SortedBy    Name of the index that determines how the records are sorted
ColCount    Number of columns in the table
Cols        Sequence of column definitions (TColumn records)

    PrimaryKey & SortedBy may be empty strings.

 

Table's column definition - TColumn record

TColumn = packed record
            Kind:Int64;
            ID:Int64;
            Flags:Int64;
            Name,
            Comment:WideText;
            InputMask,
            OutputMask: WideText;
            InitialValue,
            MinValue,
            MaxValue:DataType;
            MinLength,
            MaxLength:Int64;
          End;

Kind          Data type identification; see Data types; required
ID            Unique identification of the column; used in record definitions; required
Flags         Reserved for future use
Name          Name of the column; must be unique within its table; required
Comment       Comment on the column
InputMask     Input mask, used for formatting input text
OutputMask    Output mask, used for formatting output text
InitialValue  Initial value of the record's item
MinValue      Minimum value of the record's item
MaxValue      Maximum value of the record's item
MinLength     Minimum length of the record's item
MaxLength     Maximum length of the record's item

    Some items are meaningless for some data types, e.g. the minimum and maximum lengths for Boolean. Only Kind, Name and ID are required in every declaration. The remaining items are available for database applications to use as they need.
    If MinLength and MaxLength are both zero, the length of the record's item is not limited.
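
    The sketch below shows one way a reader might walk a TColumn record. It rests on assumptions that this specification does not confirm: values are little-endian, WideText is UTF-16LE, and InitialValue, MinValue and MaxValue are each encoded according to the column's own Kind. Non-standard (negative) kinds are rejected because their encoding is not defined here. All function names are hypothetical.

```python
import struct

def _int64(buf, pos):
    return struct.unpack_from("<q", buf, pos)[0], pos + 8

def _blob(buf, pos):
    # Variable-length value: Int64 size in bytes, then the data.
    size, pos = _int64(buf, pos)
    return bytes(buf[pos:pos + size]), pos + size

def _wide(buf, pos):
    raw, pos = _blob(buf, pos)
    return raw.decode("utf-16-le"), pos  # assumed text encoding

def _value(buf, pos, kind):
    """Read one value whose layout is selected by the data type ID `kind`."""
    if kind in (0, 3):   # Int64; Currency is returned as its raw 64-bit integer
        return _int64(buf, pos)
    if kind in (1, 4):   # Double; DateTime (TDateTime is a Double in Delphi)
        return struct.unpack_from("<d", buf, pos)[0], pos + 8
    if kind == 2:        # Boolean: one byte, non-zero means Yes
        return buf[pos] != 0, pos + 1
    if kind == 5:
        return _wide(buf, pos)
    if kind == 6:
        return _blob(buf, pos)
    raise ValueError("non-standard data type: encoding not defined here")

def parse_column(buf, pos):
    """Return (dict of TColumn fields, position just past the record)."""
    col = {}
    for name in ("Kind", "ID", "Flags"):
        col[name], pos = _int64(buf, pos)
    for name in ("Name", "Comment", "InputMask", "OutputMask"):
        col[name], pos = _wide(buf, pos)
    for name in ("InitialValue", "MinValue", "MaxValue"):
        col[name], pos = _value(buf, pos, col["Kind"])
    for name in ("MinLength", "MaxLength"):
        col[name], pos = _int64(buf, pos)
    if not col["Name"]:
        raise ValueError("column name is required")
    return col, pos
```

    Because the record has variable length, the returned end position is what lets a caller step through the Cols sequence of a TTable one column at a time.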

 

Storing table records

    The TTable.FirstItem item stores the file position of a sequence of Int64 integers, each of which holds the position where one record is defined. The number of records is stored in TTable.ItemsCount.

 

Table's record definition

TRecordInfo = packed record
                Parts:Int64;
                Placements:array[0..Parts-1] of packed record
                                                  ID:Int64;
                                                  Part:Int64;
                                                End;
              End;

TRecordPart = packed record
                Fragments:Int64;
                Placements:array[0..Fragments-1] of packed record
                                                      Position,
                                                      Size:Int64;
                                                    End;
              End;

TFragment = packed array[0..Size-1] of byte;

    Every table record is composed of items (one item per column), and their values together represent the whole record. Every column can have only one value per record. A table can have empty records, i.e. records without any items.

    The record information, TRecordInfo, stores in Parts the number of items stored in the record, and in Placements their placement in the database file. Each entry of TRecordInfo.Placements identifies the column that owns the record's item (the ID item) and the place where the record's item is stored in the file (the Part item).

    To achieve the highest possible fill factor, a record's item for one column can be stored in several places in the file, in the same way as the whole record: it is stored in fragments. TRecordInfo.Placements[].Part stores the position where the TRecordPart record is stored. TRecordPart.Fragments is the number of fragments of the record's item. If it is zero, the record's item has the NULL value. TRecordPart.Placements stores information about these fragments: Position is the location of one fragment and Size is its size. Both values must be greater than zero. A fragment is a sequence of bytes.

    To allow optimization techniques to be used, every table record's item should have only one fragment, located immediately after its TRecordPart. TRecordPart records should be placed immediately after TRecordInfo.
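
    The lookup chain TRecordInfo, TRecordPart, fragments can be sketched as follows. TRecordInfo and TRecordPart share the same shape (an Int64 count followed by that many pairs of Int64), so one helper reads both. Little-endian byte order is assumed, the function names are hypothetical, and the result is the raw bytes of each item; interpreting them according to the column's data type is left out.

```python
import struct

def _pairs(buf, pos):
    """Read an Int64 count followed by that many (Int64, Int64) pairs."""
    (count,) = struct.unpack_from("<q", buf, pos)
    flat = struct.unpack_from("<%dq" % (2 * count), buf, pos + 8)
    return list(zip(flat[0::2], flat[1::2]))

def read_item(buf, part_pos):
    """Reassemble one record item from its TRecordPart; None means NULL."""
    fragments = _pairs(buf, part_pos)          # (Position, Size) pairs
    if not fragments:
        return None                            # zero fragments: NULL value
    for position, size in fragments:
        if position <= 0 or size <= 0:
            raise ValueError("fragment position and size must be > 0")
    # Fragments are concatenated in the order in which they are listed.
    return b"".join(bytes(buf[p:p + s]) for p, s in fragments)

def read_record(buf, info_pos):
    """Return {column ID: item bytes or None} for the TRecordInfo at info_pos."""
    return {col_id: read_item(buf, part_pos)
            for col_id, part_pos in _pairs(buf, info_pos)}
```

    When the recommended layout is followed (a single fragment placed directly after its TRecordPart), the join touches one contiguous region, which is what makes a single sequential read of a record possible.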

 

Storing unused file space information

TSpaces = packed record
            Spaces:Int64;
            Placements:array[0..Spaces-1] of packed record
                                               Position,
                                               Size:Int64;
                                             End;
          End;

    The Spaces item gives the number of unused blocks in the database file. The blocks are located at the positions Placements[].Position and their sizes are Placements[].Size.
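
    A minimal sketch of reading TSpaces, assuming little-endian Int64 values. The first_fit helper is purely illustrative: this specification does not prescribe how a writer chooses an unused block, and both function names are hypothetical.

```python
import struct

def parse_spaces(buf, pos):
    """Return the unused blocks as a list of (Position, Size) tuples."""
    (count,) = struct.unpack_from("<q", buf, pos)
    flat = struct.unpack_from("<%dq" % (2 * count), buf, pos + 8)
    return list(zip(flat[0::2], flat[1::2]))

def first_fit(spaces, needed):
    """Return the position of the first block of at least `needed` bytes."""
    for position, size in spaces:
        if size >= needed:
            return position
    return None  # no block is large enough; the caller would append to the file
```

    A writer that reuses a block would also have to update or remove the corresponding Placements entry, which is not shown here.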