Data Modelling with Unibase

Data modelling is an important topic for everyone involved in understanding how an entity is structured and how its collection of information should be managed.

Unlike other database systems, Unibase was designed around building a model of the application's data and then driving the application directly from that model.

How does this work? Unibase uses an active data dictionary that describes the data completely. This is a metadata file containing the tables, all their attributes and calculations, the indexes, and, most importantly, the relationships between the tables.

Unlike SQL, which is essentially an interpreted (or compiled) language, the Unibase data dictionary is a static description of the application's data. Programmers do not embed SQL clauses such as 'where' in their PHP or other code; instead, Unibase refers to the data dictionary to locate a valid path between tables.
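
The contrast can be sketched roughly in Python (the table names, the DICTIONARY structure and the sqlite3 usage are all illustrative assumptions, not Unibase syntax): in the conventional approach the join path lives in the program, whereas in the dictionary-driven approach the relationship is recorded once as data and the program only names the attribute it wants.

    import sqlite3

    # Conventional approach: the join path is embedded in the program as
    # explicit JOIN/WHERE clauses.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoice  (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customer VALUES (1, 'Acme');
        INSERT INTO invoice  VALUES (10, 1);
    """)
    row = conn.execute(
        "SELECT customer.name FROM invoice "
        "JOIN customer ON customer.id = invoice.customer_id "
        "WHERE invoice.id = ?", (10,)
    ).fetchone()
    print(row[0])  # Acme

    # Dictionary-driven approach, conceptually: the relationship is recorded
    # once as data, and the program only asks for, say, the customer's name
    # from an invoice; the join path is derived from this description.
    DICTIONARY = {
        "invoice":  {"associations": {"customer": ["customer_id"]}},
        "customer": {"indexes": [["id"]]},
    }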

The basic structure

The data dictionary is a single text file using an easy-to-read syntax; in that respect it is not dissimilar to many configuration files. It contains table definitions, which in turn have four sections.

Refer to the section on Data Dictionary for more detail.

The sections of the data dictionary are: attributes (fields), indexes (keys), associations (join definitions), and restrictions (selections).

Attribute definitions consist of a name, the number of characters, an optional array size (arrays are first-class objects in Unibase), the type of attribute, and extra information (which depends on the attribute type).

Attributes can be either stored or calculated, but when using them there is no difference to the programmer. Calculated attributes are evaluated as needed. The expression for a calculated attribute can involve attributes in the current table or in any other table; Unibase works out how to retrieve the appropriate table and calculate the required attribute, which could of course be a calculated attribute itself.
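
A minimal Python sketch of the idea, with made-up attribute names and ignoring Unibase's own expression syntax: calculated attributes are defined as expressions over other attributes, evaluated only when asked for, and one calculated attribute may depend on another.

    # Hypothetical record: stored attributes are plain values; calculated
    # attributes are expressions evaluated only when they are asked for.
    order_line = {"quantity": 4, "unit_price": 12.5}

    calculated = {
        "line_value": lambda row: get(row, "quantity") * get(row, "unit_price"),
        # A calculated attribute may itself depend on another calculated one.
        "line_value_inc_tax": lambda row: get(row, "line_value") * 1.10,
    }

    def get(row, name):
        """Return a stored attribute, or evaluate a calculated one on demand."""
        if name in row:
            return row[name]
        return calculated[name](row)

    print(round(get(order_line, "line_value_inc_tax"), 2))  # 55.0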

There is a lot of semantics as well as syntax at work here and I will elaborate later once the components have been explained.

The next section is the indexes. These are defined explicitly, partly because a good selection of indexes makes an application run much faster, but also because they are used as part of the definition of a table association, i.e. how information in another table is related to the information in this table. Note that an association is a statement of fact, not an action.

Entity-Relationship (or Entity-Attribute-Relationship) designs emphasise the name of the relationship. Unibase doesn't care about the name of the relationship, only about which attributes define it. Importantly, those attributes do not have to be stored; they can be calculated. However, they must exist as an index in the target table.

The next section is the associations.

This section defines how the tables relate. A destination table is related by a combination of attributes, which must form an index in the associated table. This is important because when one refers to an attribute in a remote table, Unibase searches for a path to the destination table using the associations.
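
To make that concrete, here is a hypothetical sketch in Python (invented table and attribute names, not Unibase syntax): each association names a destination table and the attributes that must form an index there, and a path between two tables can be found by walking the associations.

    from collections import deque

    # Hypothetical model: each table lists its indexes and its associations.
    MODEL = {
        "invoice_line": {"indexes": [["invoice_id", "line_no"]],
                         "associations": {"invoice": ["invoice_id"]}},
        "invoice":      {"indexes": [["invoice_id"]],
                         "associations": {"customer": ["customer_id"]}},
        "customer":     {"indexes": [["customer_id"]],
                         "associations": {}},
    }

    def check_associations(model):
        """Each association's attributes must be an index of the destination."""
        for table, spec in model.items():
            for target, attrs in spec["associations"].items():
                assert attrs in model[target]["indexes"], (table, target)

    def find_path(model, source, destination):
        """Walk the associations to find a chain of tables linking the two."""
        queue = deque([[source]])
        while queue:
            path = queue.popleft()
            if path[-1] == destination:
                return path
            for nxt in model[path[-1]]["associations"]:
                if nxt not in path:
                    queue.append(path + [nxt])
        return None

    check_associations(MODEL)
    print(find_path(MODEL, "invoice_line", "customer"))
    # ['invoice_line', 'invoice', 'customer']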

Finally, restrictions. These are basically a relational selection, allowing access only to records that meet the conditions in the restriction.
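
In Python terms, a restriction amounts to a stored predicate applied whenever the table is read (an illustrative sketch with invented data, not Unibase syntax):

    customers = [
        {"name": "Acme",     "active": True},
        {"name": "Gone Ltd", "active": False},
    ]

    # A restriction: only records meeting the condition are visible through it.
    active_customers = [row for row in customers if row["active"]]
    print([row["name"] for row in active_customers])  # ['Acme']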

Alias tables are a way of referring to the same table by a different name. This is necessary in Unibase because the model must be static at any point in time, so that it can be analysed and a work strategy computed. Alias tables allow a table to be used in different ways within the one program, and also to be defined as a restricted view of the underlying table.
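
As a rough analogy (invented names, not Unibase syntax), an alias is simply a second, independent handle on the same underlying table, which is what lets one program use the table in two roles while the model itself stays static:

    person = [
        {"id": 1, "name": "Alice", "manager_id": None},
        {"id": 2, "name": "Bob",   "manager_id": 1},
    ]

    # Two aliases of the same table, playing different roles in one program.
    employee = person   # the person being reported on
    manager = person    # the person they report to

    managers_by_id = {row["id"]: row for row in manager}
    for row in employee:
        boss = managers_by_id.get(row["manager_id"])
        print(row["name"], "reports to", boss["name"] if boss else "nobody")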

Semantics

Unibase understands the different ways something can be used. For example, an expression may be calculated differently depending on where it is used. As a simple expression it is just calculated, but if it is being used in a sub-total or total, should it be the sum of the values in the records, or should it be recalculated? This is the familiar observation that 'the sum of the averages is not the average of the sums'.
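
A small worked example of that distinction: whether a derived figure is summed record by record, or recalculated from the totals, gives genuinely different answers.

    # Two order lines, each with a unit price and a quantity.
    lines = [
        {"price": 10.0, "qty": 1},
        {"price": 2.0,  "qty": 10},
    ]

    # Summed record by record: each line's own value, then the total of those.
    summed = sum(l["price"] * l["qty"] for l in lines)            # 30.0

    # Recalculated at the total level: average price times total quantity.
    average_price = sum(l["price"] for l in lines) / len(lines)   # 6.0
    total_qty = sum(l["qty"] for l in lines)                      # 11
    recalculated = average_price * total_qty                      # 66.0

    print(summed, recalculated)  # 30.0 66.0 -- not the same figure at all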

Unibase also has its own solution to the problem of identifying the correct path between remote tables. At what point does a path become invalid: one to one, one to many, many to one? What effect (if any) does traversing a foreign key have? This is critical to allowing Unibase to solve the table traversal problem. It is also fundamental to Unibase's ability to resolve much more complex data relationships than a programmer could manage by hand.

The problem differs from the travelling salesman problem in that paths have characteristics: for example, a path might be valid or invalid depending on the paths already traversed. The solution is similar to Dijkstra's algorithm, but with extensions.
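
Unibase's actual rules are not reproduced here; the sketch below only shows the general shape of such a search in Python, with an invented graph and an invented validity rule, where an edge may be rejected because of the path already taken.

    import heapq

    # Invented association graph: (destination table, cardinality, cost).
    GRAPH = {
        "invoice_line": [("invoice", "many_to_one", 1)],
        "invoice":      [("customer", "many_to_one", 1),
                         ("invoice_line", "one_to_many", 1)],
        "customer":     [("invoice", "one_to_many", 1)],
    }

    def allowed(kinds_so_far, kind):
        # Illustrative rule only: refuse to fan out (one-to-many) more than once.
        return kind != "one_to_many" or "one_to_many" not in kinds_so_far

    def cheapest_path(source, destination):
        """Dijkstra-style search where edge validity depends on the path so far."""
        heap = [(0, source, [source], [])]
        while heap:
            cost, table, path, kinds = heapq.heappop(heap)
            if table == destination:
                return path
            for nxt, kind, weight in GRAPH.get(table, []):
                if nxt not in path and allowed(kinds, kind):
                    heapq.heappush(heap, (cost + weight, nxt, path + [nxt], kinds + [kind]))
        return None

    print(cheapest_path("invoice_line", "customer"))  # ['invoice_line', 'invoice', 'customer']
    print(cheapest_path("customer", "invoice_line"))  # None under this illustrative rule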

A reference to an attribute in a remote table will normally return the sum of that attribute across all the associated records. This makes it very easy to write one-line expressions describing what can be quite complex applications.
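
For example (a hypothetical Python sketch with invented tables), a reference from a customer to an invoice attribute naturally yields the total of that attribute over the customer's invoices:

    invoices = [
        {"customer_id": 1, "total": 250.0},
        {"customer_id": 1, "total": 100.0},
        {"customer_id": 2, "total": 75.0},
    ]

    def remote_attribute(customer_id, name):
        """Referring to an invoice attribute from a customer record sums it
        over all of that customer's associated invoices."""
        return sum(row[name] for row in invoices if row["customer_id"] == customer_id)

    print(remote_attribute(1, "total"))  # 350.0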

Numbers, however, are not always numbers. This is a major limitation of traditional databases and an important semantic differentiator. For example, a product price is a number, sort of. It doesn't make sense to add up product prices (unless you are looking for an average), but it does make sense to multiply a price by a quantity to get a value. Unibase has syntax to cover this situation.
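
Unibase's syntax for this is not shown here; the sketch below simply illustrates the distinction between a figure that is meaningful to total across records and one that is only meaningful in combination with something else:

    order_lines = [
        {"price": 4.50, "qty": 2},
        {"price": 3.00, "qty": 5},
    ]

    # Meaningful: price multiplied by quantity, then totalled across records.
    order_value = sum(l["price"] * l["qty"] for l in order_lines)   # 24.0

    # Rarely meaningful: adding up the prices themselves...
    price_total = sum(l["price"] for l in order_lines)              # 7.5
    # ...unless what you actually want is an average price.
    average_price = price_total / len(order_lines)                  # 3.75

    print(order_value, average_price)  # 24.0 3.75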

Associations can also be 'fuzzy'. When doing something like curve fitting, an exact match for the current attribute (whether calculated or stored) may not exist in the target table. In this case Unibase looks for the records whose matching attribute is less than or equal to the source attribute.
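
A Python sketch of that matching rule, using an invented stepped rate table: the lookup finds the last key that is not greater than the source value.

    import bisect

    # Invented target table: stepped rates keyed by a threshold value.
    thresholds = [0, 100, 500, 1000]      # must be kept sorted
    rates      = [0.00, 0.05, 0.08, 0.10]

    def fuzzy_lookup(amount):
        """Match the record whose threshold is <= amount (the 'fuzzy' join)."""
        i = bisect.bisect_right(thresholds, amount) - 1
        return rates[i]

    print(fuzzy_lookup(750))   # 0.08  (falls back to the 500 threshold)
    print(fuzzy_lookup(100))   # 0.05  (an exact match also works)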

Conclusion

This has been a very quick and simple introduction to data modelling in Unibase. Unibase has been used to model large, complex systems and can do so with ease, in many cases solving problems that are very difficult for programmers. This extends the range of applications that can be written.