XDB database format specification
Version: 0.7
Restrictions: only saving of tables and unused-block information is supported
Objective: ability to store large volumes of data with the maximum coefficient of filling
Author: Tomáš Koutný
Conventions
Unless stated otherwise, all declarations and data types are the same as those of Object Pascal as implemented in Borland Delphi 5.0. All names of tables, indexes, queries and other items are case sensitive. An empty string is not a valid name; it indicates that no name has been entered.
Position and size of items
Positions in the file and sizes of records are stored as 64-bit integers (Int64). Zero means that the referenced item is not present in the file. The most significant bit of the Int64 (i.e. negative values) is reserved for future use.
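The rule above can be sketched in a few lines of Python. This is only an illustration: the function name decode_position is hypothetical, and little-endian byte order is an assumption (the specification does not state it, but it matches Delphi 5 on x86).

```python
import struct

def decode_position(raw: bytes):
    """Decode one 8-byte position or size field.

    Returns None when the field is zero (the item is not present).
    Assumption: the Int64 is stored in little-endian byte order.
    """
    (value,) = struct.unpack("<q", raw)
    if value < 0:
        # The most significant bit is reserved by the specification.
        raise ValueError("negative position or size is reserved")
    return None if value == 0 else value
```

A reader built this way treats a zero field as "absent" instead of seeking to offset 0, and rejects files that use the reserved bit.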
File header
TXDBHeader = packed record
  Signature: array of char = 'XDTKDataBase';
  LoVersion,
  HiVersion: word;
  Tables,
  Indexes,
  Queries,
  Spaces,
  DTypes,
  Specs,
  Info: Int64;
End;
Signature | file type identification |
LoVersion, HiVersion | version of the database file structure, written as HiVersion.LoVersion |
Tables | position where the tables are defined |
Indexes | position where the indexes are defined |
Queries | position where the queries are defined |
Spaces | position where the spaces are defined |
DTypes | position where the non-standard data types are defined |
Specs | position where the special data are stored |
Info | position where additional information about the database is stored |
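As an illustration of the header layout, the following Python sketch parses TXDBHeader from the start of a file. It rests on assumptions that the specification does not confirm: the signature is stored as exactly 12 single-byte characters with no length prefix, word is an unsigned 16-bit integer, and all fields are little-endian with no padding (packed record). The names HEADER, FIELDS and parse_header are hypothetical.

```python
import struct

SIGNATURE = b"XDTKDataBase"

# Assumed on-disk layout: 12 signature bytes, LoVersion and HiVersion as
# 16-bit words, then seven Int64 positions; little-endian, no padding.
HEADER = struct.Struct("<12sHH7q")
FIELDS = ("Tables", "Indexes", "Queries", "Spaces", "DTypes", "Specs", "Info")

def parse_header(data: bytes) -> dict:
    """Parse TXDBHeader from the first bytes of a database file."""
    signature, lo_version, hi_version, *positions = HEADER.unpack_from(data, 0)
    if signature != SIGNATURE:
        raise ValueError("signature mismatch: not an XDB file")
    header = {"Version": f"{hi_version}.{lo_version}"}
    # A position of zero means that the section is not present in the file.
    header.update(zip(FIELDS, positions))
    return header
```

Under these assumptions the header occupies 72 bytes, and a file of format version 0.7 would carry HiVersion = 0 and LoVersion = 7.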
Data types
Name | Size in bytes | Description | ID |
Int64 | 8 | integer number | 0 |
Double | 8 | real number | 1 |
Boolean | 1 | Yes/No; value <> 0 means Yes | 2 |
Currency | 8 | Currency | 3 |
DateTime | 8 | time in TDateTime format | 4 |
WideText | 8+? | text in UNICODE format | 5 |
RawData | 8+? | byte sequence | 6 |
When variable-length data types (WideText and RawData) are stored, they begin with an integer (Int64) that stores their size in bytes, and the data follow this integer.
You can use RawData to define your own data types.
ID identifies the data type in subsequent definitions; for example, it is used in the definition of a table's column. ID is always stored as Int64. Negative values represent definitions of non-standard data types.
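The size-prefixed encoding of WideText and RawData can be sketched as follows. The helper names are hypothetical. Two points are assumptions rather than statements of the specification: the byte order is little-endian, and "UNICODE" is taken to mean UTF-16LE (the encoding of Delphi's WideString).

```python
import io
import struct

def write_raw(stream, payload: bytes) -> None:
    """Write RawData: an Int64 size in bytes, then the bytes themselves."""
    stream.write(struct.pack("<q", len(payload)))
    stream.write(payload)

def read_raw(stream) -> bytes:
    """Read RawData written by write_raw."""
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise EOFError("truncated size prefix")
    (size,) = struct.unpack("<q", prefix)
    if size < 0:
        raise ValueError("negative size is reserved")
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("truncated payload")
    return payload

def write_widetext(stream, text: str) -> None:
    # Assumption: WideText is UTF-16LE; the size prefix counts bytes.
    write_raw(stream, text.encode("utf-16-le"))

def read_widetext(stream) -> str:
    return read_raw(stream).decode("utf-16-le")
```

With this encoding an empty string occupies exactly 8 bytes (a zero size and no data), which is how an empty name would appear on disk.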
The Tables item of the database file header refers to the position of an Int64 that holds the count of defined tables. It is followed by a packed array of that many Int64 values. Each of these values refers to the location where the definition of one table, a TTable record, is stored.
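A minimal sketch of reading this list, assuming little-endian Int64 values and a seekable binary stream; read_int64 and read_table_positions are hypothetical names.

```python
import io
import struct

def read_int64(stream) -> int:
    raw = stream.read(8)
    if len(raw) != 8:
        raise EOFError("truncated Int64")
    return struct.unpack("<q", raw)[0]

def read_table_positions(stream, tables_pos: int) -> list:
    """Return the file positions of all TTable records.

    tables_pos is the Tables field of the file header.
    """
    if tables_pos == 0:
        return []  # zero means the table list is not present
    stream.seek(tables_pos)
    count = read_int64(stream)
    return [read_int64(stream) for _ in range(count)]
```

Each returned position can then be used to seek to one TTable record.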
Table definition - TTable record
TTable = packed record
  ItemsCount: Int64;
  FirstItem: Int64;
  Name: WideText;
  PrimaryKey,
  SortedBy: WideText;
  ColCount: Int64;
  Cols: array[0..ColCount-1] of TColumn;
End;
ItemsCount | table's records count |
FirstItem | position where the table with records is stored |
Name | name of the table, must be unique |
PrimaryKey | name of index that determines primary key |
SortedBy | name of index that determines how the records are sorted |
ColCount | table's columns count |
Cols | sequence of column definitions TColumn records |
PrimaryKey & SortedBy may be empty strings.
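Because TTable embeds WideText fields, its size on disk varies and it has to be read field by field. The sketch below reads everything up to ColCount and stops before the Cols array. It assumes little-endian Int64 values and UTF-16LE text; the function names are hypothetical.

```python
import io
import struct

def read_int64(stream) -> int:
    raw = stream.read(8)
    if len(raw) != 8:
        raise EOFError("truncated Int64")
    return struct.unpack("<q", raw)[0]

def read_widetext(stream) -> str:
    # WideText: Int64 size in bytes, then the text (assumed UTF-16LE).
    size = read_int64(stream)
    raw = stream.read(size)
    if len(raw) != size:
        raise EOFError("truncated WideText")
    return raw.decode("utf-16-le")

def read_table_prefix(stream, position: int) -> dict:
    """Read the TTable fields that precede the Cols array."""
    stream.seek(position)
    table = {}
    table["ItemsCount"] = read_int64(stream)
    table["FirstItem"] = read_int64(stream)
    table["Name"] = read_widetext(stream)
    table["PrimaryKey"] = read_widetext(stream)  # may be an empty string
    table["SortedBy"] = read_widetext(stream)    # may be an empty string
    table["ColCount"] = read_int64(stream)
    return table
```

After this call the stream is positioned at the first TColumn record.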
Table's column definition – TColumn
TColumn = packed record
  Kind: Int64;
  ID: Int64;
  Flags: Int64;
  Name,
  Comment: WideText;
  InputMask,
  OutputMask: WideText;
  InitialValue,
  MinValue,
  MaxValue: DataType;
  MinLength,
  MaxLength: Int64;
End;
Kind | data type identification; see Data types; required |
ID | unique column's identification; utilized at records' definitions; required |
Flags | reserved for future utilization |
Name | column's name; must be unique in its table; required |
Comment | column's comment |
InputMask | input mask, used for input text formatting |
OutputMask | output mask, used for output text formatting |
InitialValue | record's item initial value |
MinValue | record's item minimum value |
MaxValue | record's item maximum value |
MinLength | record's item minimum length |
MaxLength | record's item maximum length |
Some items may be meaningless for some data types, e.g. the minimum and maximum lengths for Boolean. Only Kind, Name and ID are required in every declaration. The remaining items are available for database applications to use as they need.
If MinLength and MaxLength are both zero, the length of the record's item is not limited.
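The length rule can be expressed as a small check. The function name is hypothetical, and the specification does not say whether length is counted in bytes or in characters, so the caller has to choose the unit. The sketch follows the text literally: only the combination of two zeros disables the limit.

```python
def length_allowed(length: int, min_length: int, max_length: int) -> bool:
    """Check an item's length against a column's MinLength and MaxLength."""
    if min_length == 0 and max_length == 0:
        return True  # both zero: the length is not limited
    return min_length <= length <= max_length
```

For example, a column with MinLength = 1 and MaxLength = 10 rejects an empty item, while a column with both values at zero accepts any length.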
Storing table records
TTable.FirstItem stores the file position of a sequence of Int64 integers, each of which stores the position where one record is defined. The number of records is stored in TTable.ItemsCount.
Table's record definition
TRecordInfo = packed record
  Parts: Int64;
  Placements: array[0..Parts-1] of packed record
    ID: Int64;
    Part: Int64;
  End;
End;
TRecordPart = packed record
  Fragments: Int64;
  Placements: array[0..Fragments-1] of packed record
    Position,
    Size: Int64;
  End;
End;
TFragment = packed array[0..Size-1] of byte;
Every table record can be composed of items (one item per column) whose values together represent the whole record. Every column can have only one value in a given record. A table can contain empty records, i.e. records without any items.
The record information, TRecordInfo, stores in Parts the number of items stored in the record, and in Placements their placement in the database file. For each item, TRecordInfo.Placements[].ID identifies the column that owns the item, and TRecordInfo.Placements[].Part gives the position where the item is stored in the file.
To achieve the maximum coefficient of filling, a record's item for one column can be stored at several places in the file, in the same way as the whole record: it is stored in fragments. TRecordInfo.Placements[].Part stores the position of a TRecordPart record. TRecordPart.Fragments is the number of fragments of the record's item. If it is zero, the record's item has the NULL value. TRecordPart.Placements stores information about these fragments: Position is the location of a fragment and Size is its size. Both values must be greater than zero. A fragment is a sequence of bytes.
To make optimization techniques possible, every record's item should have only one fragment, located immediately after its TRecordPart. TRecordPart records should be placed immediately after TRecordInfo.
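The lookup chain TRecordInfo, TRecordPart, fragments can be sketched in Python as follows. The function names are hypothetical. The code assumes little-endian Int64 values and assumes that an item's bytes are obtained by concatenating its fragments in the order in which TRecordPart.Placements lists them; the specification does not state the order explicitly.

```python
import io
import struct

def read_int64(stream) -> int:
    raw = stream.read(8)
    if len(raw) != 8:
        raise EOFError("truncated Int64")
    return struct.unpack("<q", raw)[0]

def read_record_info(stream, position: int) -> dict:
    """Parse TRecordInfo; return {column ID: position of its TRecordPart}."""
    stream.seek(position)
    parts = read_int64(stream)
    placements = {}
    for _ in range(parts):
        column_id = read_int64(stream)
        placements[column_id] = read_int64(stream)
    return placements

def read_item(stream, part_position: int):
    """Return an item's bytes, or None when the item is NULL."""
    stream.seek(part_position)
    fragments = read_int64(stream)
    if fragments == 0:
        return None  # zero fragments means the NULL value
    # Read the whole (Position, Size) list first, then visit the fragments.
    table = []
    for _ in range(fragments):
        position = read_int64(stream)
        size = read_int64(stream)
        table.append((position, size))
    chunks = []
    for position, size in table:
        stream.seek(position)
        chunks.append(stream.read(size))
    return b"".join(chunks)
```

When a file follows the recommended layout (one fragment placed directly after its TRecordPart), the seeks in read_item become sequential reads, which is presumably the kind of optimization the recommendation has in mind.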
Storing information about unused file space
TSpaces = packed record
  Spaces: Int64;
  Placements: array[0..Spaces-1] of packed record
    Position,
    Size: Int64;
  End;
End;
The Spaces item gives the number of unused blocks in the database file. The blocks are located at positions Placements[].Position and their sizes are Placements[].Size.
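The following sketch reads TSpaces and picks a block for new data. The names read_spaces and first_fit are hypothetical, little-endian byte order is assumed, and the first-fit strategy is only one possible allocation policy; the specification does not prescribe how a writer should choose among unused blocks.

```python
import io
import struct

def read_spaces(stream, spaces_pos: int) -> list:
    """Return the unused blocks as a list of (position, size) tuples.

    spaces_pos is the Spaces field of the file header.
    """
    if spaces_pos == 0:
        return []  # zero means no TSpaces record is present
    stream.seek(spaces_pos)
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise EOFError("truncated TSpaces")
    (count,) = struct.unpack("<q", prefix)
    raw = stream.read(16 * count)
    if len(raw) != 16 * count:
        raise EOFError("truncated TSpaces placements")
    return list(struct.iter_unpack("<qq", raw))

def first_fit(blocks: list, needed: int):
    """Return the position of the first block of at least `needed` bytes."""
    for position, size in blocks:
        if size >= needed:
            return position
    return None  # no block is large enough; append to the file instead
```

Since items may be split into fragments, a writer could also fill several small blocks instead of looking for a single large one.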