XDB database format specification

 

Version:      0.7
Restrictions: only saving of tables and of unused-block information is supported
Objective:    to store large volumes of data with the highest possible fill factor
Author:       Tomáš Koutný

 

Conventions

    Unless stated otherwise, all declarations and data types are the same as those of Object Pascal as implemented in Borland Delphi 5.0. All names of tables, indexes, queries and other objects are case-sensitive. An empty string is not a valid name; it indicates that no name was entered.

 

Position and size of items

    Positions in the file and sizes of records are stored as 64-bit integers (Int64). Zero means that the referenced item is not present in the file. The most significant bit of the Int64 (i.e. negative values) is reserved for future use.

 

File header

TXDBHeader = packed record
               Signature: array of char = 'XDTKDataBase';
               LoVersion,
               HiVersion: word;
               Tables,
               Indexes,
               Queries,
               Spaces,
               DTypes,
               Specs,
               Info:Int64;
             End;

Signature   file type identification
LoVersion,
HiVersion   version of the database file structure, written as HiVersion.LoVersion
Tables      position where the tables are defined
Indexes     position where the indexes are defined
Queries     position where the queries are defined
Spaces      position where the spaces (unused blocks) are defined
DTypes      position where the non-standard data types are defined
Specs       position where the special data are stored
Info        position where additional information about the database is stored
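
    As an illustration of the header layout, the following Python sketch decodes a TXDBHeader from a byte buffer. It rests on assumptions that this specification does not state: the signature is taken to be exactly the 12 single-byte characters of 'XDTKDataBase' with no terminator, and all integers are taken to be little-endian (as Delphi 5 produces on x86). The names parse_header and build_header are hypothetical helpers, not part of the format.

```python
import struct

# Assumed layout (not stated in the spec): 12 one-byte signature chars,
# two 16-bit words, seven Int64 positions, little-endian, no padding.
HEADER = struct.Struct("<12sHH7q")
SIGNATURE = b"XDTKDataBase"
POSITION_FIELDS = ("Tables", "Indexes", "Queries", "Spaces",
                   "DTypes", "Specs", "Info")


def parse_header(data):
    """Decode a TXDBHeader found at the start of a file image."""
    signature, lo, hi, *positions = HEADER.unpack_from(data, 0)
    if signature != SIGNATURE:
        raise ValueError("not an XDB file")
    header = {"Version": f"{hi}.{lo}"}  # HiVersion.LoVersion
    header.update(zip(POSITION_FIELDS, positions))
    return header


def build_header(lo, hi, **positions):
    """Encode a header; positions that are not given stay 0 (not present)."""
    values = [positions.get(name, 0) for name in POSITION_FIELDS]
    return HEADER.pack(SIGNATURE, lo, hi, *values)
```

    Under these assumptions the header occupies 72 bytes, and a position of zero in the returned dictionary means that the corresponding section is absent.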

 

Data types

Name      Size in bytes  Description                    ID
Int64     8              integer number                 0
Double    8              real number                    1
Boolean   1              Yes/No; value <> 0 means Yes   2
Currency  8              Currency                       3
DateTime  8              time in TDateTime format       4
WideText  8+?            text in UNICODE format         5
RawData   8+?            byte sequence                  6

    Variable-length data types (WideText and RawData) are stored as an integer (Int64) that holds the size of the data in bytes, followed immediately by the data itself.
    You can use RawData to define your own data types.
    ID identifies the data type in subsequent definitions; for example, it is used in the definition of a table's column. ID is always stored as an Int64. Negative values represent definitions of non-standard data types.
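
    The length-prefixed storage of WideText and RawData can be sketched in Python as follows. Two details are assumptions rather than statements of this specification: the Int64 size prefix is little-endian, and "UNICODE" text is encoded as UTF-16LE (the encoding of Delphi's WideString). All function names are hypothetical.

```python
import struct


def encode_raw(data):
    """RawData: Int64 size in bytes, then the bytes themselves."""
    return struct.pack("<q", len(data)) + data


def decode_raw(buf, offset=0):
    """Return (data, offset just past the value)."""
    (size,) = struct.unpack_from("<q", buf, offset)
    start = offset + 8
    return bytes(buf[start:start + size]), start + size


def encode_widetext(text):
    """WideText: same framing, payload assumed to be UTF-16LE."""
    return encode_raw(text.encode("utf-16-le"))


def decode_widetext(buf, offset=0):
    raw, end = decode_raw(buf, offset)
    return raw.decode("utf-16-le"), end
```

    Because the size is counted in bytes, a four-character WideText occupies 8 + 8 bytes under the UTF-16LE assumption, and an empty string is just the 8-byte prefix with the value zero.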

 

Tables storing

    The Tables item of the database file header refers to the position of an Int64 that holds the count of defined tables. It is followed by a packed array of that many Int64s. Each of these Int64s refers to the location where the definition of one table, a TTable record, is stored.
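
    A minimal Python sketch of reading this table directory from a file image held in memory, assuming little-endian Int64 values (an assumption; byte order is not stated here). The helper names are hypothetical.

```python
import struct


def read_int64(image, pos):
    """Read one Int64 at the given file position."""
    return struct.unpack_from("<q", image, pos)[0]


def table_positions(image, tables_pos):
    """Return the file positions of all TTable records.

    tables_pos is the Tables item of the file header; zero means
    that no table directory is present in the file.
    """
    if tables_pos == 0:
        return []
    count = read_int64(image, tables_pos)
    # The count is followed directly by `count` packed Int64 positions.
    return list(struct.unpack_from(f"<{count}q", image, tables_pos + 8))
```

    Each returned position can then be handed to a TTable reader.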

Table definition - TTable record

TTable = packed record
           ItemsCount: Int64;
           FirstItem:Int64;
           Name: WideText;
           PrimaryKey,
           SortedBy:WideText;
           ColCount: Int64;
           Cols: array[0..ColCount-1] of TColumn;
         End;

ItemsCount  count of the table's records
FirstItem   position where the table of records is stored
Name        name of the table; must be unique
PrimaryKey  name of the index that determines the primary key
SortedBy    name of the index that determines how the records are sorted
ColCount    count of the table's columns
Cols        sequence of column definitions (TColumn records)

    PrimaryKey & SortedBy may be empty strings.

 

Table's column definition - TColumn

TColumn = packed record
            Kind:Int64;
            ID:Int64;
            Flags:Int64;
            Name,
            Comment:WideText;
            InputMask,
            OutputMask: WideText;
            InitialValue,
            MinValue,
            MaxValue:DataType;
            MinLength,
            MaxLength:Int64;
          End;

Kind          data type identification; see Data types; required
ID            unique identification of the column; used in record definitions; required
Flags         reserved for future use
Name          column's name; must be unique within its table; required
Comment       column's comment
InputMask     input mask, used for formatting input text
OutputMask    output mask, used for formatting output text
InitialValue  initial value of the record's item
MinValue      minimum value of the record's item
MaxValue      maximum value of the record's item
MinLength     minimum length of the record's item
MaxLength     maximum length of the record's item

    Some items may be meaningless for some data types, e.g. minimum and maximum lengths for Boolean. Only Kind, Name and ID are required in every declaration. The remaining items are available for database applications to use as they need.
    If MinLength and MaxLength are both zero, the length of the record's item is not limited.
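
    Because TTable and TColumn contain variable-length fields, they have to be read field by field. The Python sketch below shows one way to do that. It is built on several assumptions that this specification does not confirm: little-endian integers, UTF-16LE WideText, Double and DateTime as 8-byte IEEE doubles, Currency as Delphi's 64-bit integer scaled by 10000, and InitialValue, MinValue and MaxValue each stored as one value of the type named by Kind. Non-standard (negative) type IDs are not handled. Reader, read_column and read_table are hypothetical names.

```python
import struct

# Assumed encodings of the fixed-size standard types, keyed by data type ID:
# Int64, Double, Boolean, Currency (raw scaled integer), DateTime.
FIXED_KINDS = {0: "<q", 1: "<d", 2: "<?", 3: "<q", 4: "<d"}


class Reader:
    """Sequential reader over a file image, starting at a given position."""

    def __init__(self, image, pos=0):
        self.image, self.pos = image, pos

    def unpack(self, fmt):
        (value,) = struct.unpack_from(fmt, self.image, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    def int64(self):
        return self.unpack("<q")

    def raw(self):
        size = self.int64()  # Int64 size prefix, then the bytes
        data = bytes(self.image[self.pos:self.pos + size])
        self.pos += size
        return data

    def widetext(self):
        return self.raw().decode("utf-16-le")

    def value(self, kind):
        if kind in FIXED_KINDS:
            return self.unpack(FIXED_KINDS[kind])
        if kind == 5:
            return self.widetext()
        if kind == 6:
            return self.raw()
        raise ValueError(f"unsupported data type ID {kind}")


def read_column(r):
    col = {"Kind": r.int64(), "ID": r.int64(), "Flags": r.int64()}
    for name in ("Name", "Comment", "InputMask", "OutputMask"):
        col[name] = r.widetext()
    for name in ("InitialValue", "MinValue", "MaxValue"):
        col[name] = r.value(col["Kind"])
    col["MinLength"], col["MaxLength"] = r.int64(), r.int64()
    return col


def read_table(r):
    table = {"ItemsCount": r.int64(), "FirstItem": r.int64()}
    for name in ("Name", "PrimaryKey", "SortedBy"):
        table[name] = r.widetext()
    table["Cols"] = [read_column(r) for _ in range(r.int64())]  # ColCount
    return table
```

    A table definition at file position p would be read with read_table(Reader(image, p)); empty strings in PrimaryKey, SortedBy or the masks indicate that no value was entered.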

 

Table's records storing

    The TTable.FirstItem item stores the file position of a sequence of Int64 integers, each of which holds the position where one record is defined. The number of records is stored in TTable.ItemsCount.

 

Table's record definition

TRecordInfo = packed record
                Parts:Int64;
                Placements:array[0..Parts-1] of packed record
                                                  ID:Int64;
                                                  Part:Int64;
                                                End;
              End;

TRecordPart = packed record
                Fragments:Int64;
                Placements:array[0..Fragments-1] of packed record
                                                      Position,
                                                      Size:Int64;
                                                    End;
              End;

TFragment = packed array[0..Size-1] of byte;

    Every record of a table is composed of items (one item per column) whose values together represent the whole record. Every column can have only one value per record. A table can have empty records, i.e. records without any items.

    The record information, TRecordInfo, stores in Parts the count of items stored in the record, and in Placements their placement in the database file. Each entry of TRecordInfo.Placements identifies the column that owns the record's item (ID) and the place where the record's item is stored in the file (Part).

    To maximize the fill factor, a record's item for one column can be stored at several places in the file, in fragments, in the same way as the whole record. TRecordInfo.Placements[].Part stores the position where the TRecordPart record is stored. TRecordPart.Fragments is the count of fragments of the record's item. If it is zero, the record's item has the NULL value. TRecordPart.Placements stores information about these fragments: Position is the location of one fragment and Size is its size. Both values must be greater than zero. A fragment is a sequence of bytes.

    To allow optimization techniques, every record's item should have only one fragment, located immediately after its TRecordPart. TRecordPart records should be placed immediately after TRecordInfo.
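
    The path from TRecordInfo through TRecordPart to the fragment bytes can be sketched in Python as below. The sketch assumes little-endian Int64 values and assumes that an item's fragments are concatenated in the order in which they are listed in TRecordPart.Placements; neither point is stated explicitly in this specification. The function names are hypothetical.

```python
import struct


def read_int64(image, pos):
    return struct.unpack_from("<q", image, pos)[0]


def read_pairs(image, pos):
    """Read an Int64 count followed by that many (Int64, Int64) pairs.

    TRecordInfo and TRecordPart share this shape.
    """
    count = read_int64(image, pos)
    flat = struct.unpack_from(f"<{2 * count}q", image, pos + 8)
    return list(zip(flat[0::2], flat[1::2]))


def read_item(image, part_pos):
    """Return the raw bytes of one record item, or None for NULL."""
    fragments = read_pairs(image, part_pos)  # (Position, Size) pairs
    if not fragments:
        return None  # Fragments = 0 means the item is NULL
    return b"".join(bytes(image[p:p + size]) for p, size in fragments)


def read_record(image, info_pos):
    """Map column ID to raw item bytes for the record at info_pos."""
    return {col_id: read_item(image, part_pos)
            for col_id, part_pos in read_pairs(image, info_pos)}
```

    The returned bytes still have to be interpreted according to the Kind of the column with the matching ID. A column that has no entry in the record is simply missing from the dictionary, which is distinct from an entry whose value is NULL.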

 

Unused file space information storing

TSpaces = packed record
            Spaces:Int64;
            Placements:array[0..Spaces-1] of packed record
                                               Position,
                                               Size:Int64;
                                             End;
          End;

    The TSpaces.Spaces item determines the count of unused blocks in the database file. The blocks are located at positions Placements[].Position and their sizes are Placements[].Size.
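
    As a short illustration, the following Python sketch reads and writes a TSpaces record, assuming little-endian Int64 values (byte order is an assumption, not something this specification states). parse_spaces and build_spaces are hypothetical helper names.

```python
import struct


def parse_spaces(image, spaces_pos):
    """Return the unused blocks as a list of (Position, Size) pairs.

    spaces_pos is the Spaces item of the file header; zero means that
    no unused-block information is present.
    """
    if spaces_pos == 0:
        return []
    (count,) = struct.unpack_from("<q", image, spaces_pos)
    flat = struct.unpack_from(f"<{2 * count}q", image, spaces_pos + 8)
    return list(zip(flat[0::2], flat[1::2]))


def build_spaces(blocks):
    """Encode a list of (Position, Size) pairs as a TSpaces record."""
    out = struct.pack("<q", len(blocks))
    for position, size in blocks:
        out += struct.pack("<2q", position, size)
    return out
```

    A writer could consult this list before appending new data to the end of the file, so that freed blocks are reused.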