This document describes some of the design ideas and architecture
behind the XORM project. As implied by its name, a primary goal of
XORM is to map object data to relational data. Many of the subsystems
within the implementation reflect this distinction.
The relational level is represented by the classes in the
org.xorm.datastore package. The metadata for this area comes from the
database schema XML file. This defines any number of Table and Column
instances, and indicates, if applicable, the data types of each
column. The types at this level correspond exactly with the data in
the datastore.
XORM also assumes that relational instances (Rows) may have primary
keys that are unique within each Table. Currently only single-column
primary keys are supported.
The set of tables may include many-to-many mapping tables. These are
treated no differently than any other table at this level.
At runtime, data from the relational datastore is directly
instantiated, creating Row objects. Each Row instance holds the data
from one datastore Row. The only data transformation that occurs is a
mapping of datastore-specific data types to Java types, as defined by
the "type" (and optionally "format") field in the database schema XML.
For instance, the SQL "VARCHAR" type becomes a Java String.
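
The type mapping described above can be pictured as a simple lookup
from a schema type name to a Java class. The following is a minimal
sketch only: the class TypeMappingSketch, its method, and the list of
type names are assumptions made for illustration and are not part of
XORM, whose actual mapping is driven by the database schema XML.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of XORM: maps a schema "type" name
// to the Java class used to hold that column's value in a Row.
final class TypeMappingSketch {
    private static final Map<String, Class<?>> TYPES = new HashMap<>();
    static {
        TYPES.put("VARCHAR", String.class);
        TYPES.put("INTEGER", Integer.class);
        TYPES.put("BIGINT", Long.class);
        TYPES.put("DECIMAL", BigDecimal.class);
        TYPES.put("BOOLEAN", Boolean.class);
    }

    // Returns the Java class for a datastore type name. Falling back to
    // Object.class for unlisted types is an assumption of this sketch.
    static Class<?> javaTypeFor(String datastoreType) {
        String key = datastoreType.toUpperCase(Locale.ROOT);
        return TYPES.getOrDefault(key, Object.class);
    }
}
```

With this sketch, javaTypeFor("VARCHAR") yields String.class, matching
the example in the text.
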
The input and output of datastore Row instances is handled by
instances of org.xorm.datastore.DatastoreDriver. Drivers are
responsible for reading Row data via the select() method, creating new
datastore Rows using the create() method, updating modified Rows using
the update() method, and deleting Rows using the delete() method.
To facilitate pooled configuration data and connections, a
DatastoreDriver instance must also provide an interrogation method
that indicates the type of ConnectionInfo class it requires.
There is one instance of a DatastoreDriver per JDO Transaction
instance. During its lifecycle, a single driver may be reused for
multiple logical transactions. The sequence of calls made to the
driver always starts with begin() and terminates with either commit()
or rollback(). Between begin() and the terminating commit() or
rollback(), any of the read/write operations may be called. Drivers
that utilize connection pooling should acquire a new connection when
begin() is called, and release it when commit() or rollback() is
called.
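
The begin/commit/rollback lifecycle can be sketched as follows. This
is not the real DatastoreDriver interface: PooledConnection,
SimplePool, and LifecycleDriverSketch are hypothetical names, and the
read/write operations are omitted. The sketch only shows a connection
being acquired in begin() and released in commit() or rollback(), so
that one driver can serve several logical transactions in sequence.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled connection; not a XORM or JDBC type.
final class PooledConnection {
    final int id;

    PooledConnection(int id) {
        this.id = id;
    }
}

// Hypothetical, minimal pool used only for this sketch.
final class SimplePool {
    private final Deque<PooledConnection> idle = new ArrayDeque<>();
    private int created = 0;

    synchronized PooledConnection acquire() {
        PooledConnection c = idle.poll(); // null when no idle connection exists
        return (c != null) ? c : new PooledConnection(++created);
    }

    synchronized void release(PooledConnection c) {
        idle.push(c);
    }

    synchronized int idleCount() {
        return idle.size();
    }
}

// Simplified lifecycle: begin() acquires, commit()/rollback() release.
final class LifecycleDriverSketch {
    private final SimplePool pool;
    private PooledConnection current;

    LifecycleDriverSketch(SimplePool pool) {
        this.pool = pool;
    }

    void begin() {
        if (current != null) {
            throw new IllegalStateException("transaction already active");
        }
        current = pool.acquire();
    }

    void commit() {
        end(); // a real driver would commit on the connection first
    }

    void rollback() {
        end(); // a real driver would roll back on the connection first
    }

    boolean isActive() {
        return current != null;
    }

    private void end() {
        if (current == null) {
            throw new IllegalStateException("no active transaction");
        }
        pool.release(current);
        current = null;
    }
}
```
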
The object level corresponds to your Java object model. In XORM this
is captured by the definition of abstract classes and interfaces.
These must then be identified in the JDO metadata file in order to be
mapped to relational items.
At runtime, these objects are created by using the CGLIB enhancer.
This occurs either via the XORM.newInstance() static method, or
internally during a query or lookup execution. The resulting
instances, which implement your interfaces or extend your abstract
classes, are proxy objects. Calls to interface or abstract methods
are delegated to XORM; we say these methods are "enhanced" by CGLIB.
Most of the work that XORM does involves translating data back and
forth between the relational model and the object model. The mapping
information is all present in the JDO XML metadata file or files for
the application.
When a new persistence-capable class is referenced, XORM attempts to
read the *.jdo file for the class or its package, if it has not been
read already. The org.xorm.util.jdoxml classes are a straightforward
translation of the JDO XML data into Java form; they were written this
way to enable reuse in other scenarios where programs (such as
configuration tools) need access to raw JDO configuration data.
The jdoxml.* objects are then processed by an instance of
org.xorm.ModelMapping. A ModelMapping represents the particular set
of mappings in place for a given InterfaceManagerFactory. The
ModelMapping reads the information contained in the JDO file and
creates a set of objects called ClassMappings, one for each
persistence-capable class. A ClassMapping instance holds the
information mapping a particular Java Class to a particular datastore
Table. This includes mapping each property of the class to a
datastore Column or a RelationshipMapping. Mappings to a Column are
used for properties that contain primitive data like Strings and
numbers that correspond exactly to columns in the datastore. They are
also used for properties that are object references in the object
model and exist in the datastore as foreign key columns.
RelationshipMappings can be one-to-many or many-to-many. These
indicate relationships to collections of objects that do not get
stored as part of the primary Row for a data object. One-to-many
relationships are those where the primary key of the owning object
becomes part of the referenced object's row. Many-to-many
relationships use a secondary lookup table.
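
To make the shape of this mapping data concrete, here is a minimal
sketch of a class mapping that sends each property either to a column
or to a relationship. All names (ColumnRef, RelationshipSketch,
ClassMappingSketch) are hypothetical simplifications; the real
org.xorm classes carry more information and have different APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified shapes; the real org.xorm classes differ.
final class ColumnRef {
    final String table;
    final String column;

    ColumnRef(String table, String column) {
        this.table = table;
        this.column = column;
    }
}

final class RelationshipSketch {
    final String targetTable; // table holding the related rows
    final String joinTable;   // secondary lookup table; null for one-to-many

    RelationshipSketch(String targetTable, String joinTable) {
        this.targetTable = targetTable;
        this.joinTable = joinTable;
    }

    boolean isManyToMany() {
        return joinTable != null;
    }
}

final class ClassMappingSketch {
    final String tableName;
    final Map<String, ColumnRef> columns = new LinkedHashMap<>();
    final Map<String, RelationshipSketch> relationships = new LinkedHashMap<>();

    ClassMappingSketch(String tableName) {
        this.tableName = tableName;
    }

    // Maps a simple or object-reference property to a column of this table.
    ClassMappingSketch column(String property, String column) {
        columns.put(property, new ColumnRef(tableName, column));
        return this;
    }

    // Maps a collection property to a relationship.
    ClassMappingSketch relationship(String property, RelationshipSketch r) {
        relationships.put(property, r);
        return this;
    }
}
```

In this sketch, a "department" mapping might map "name" to a column,
"manager" to a foreign key column such as manager_id, "employees" to a
one-to-many relationship, and "projects" to a many-to-many
relationship through a lookup table. The table and column names here
are invented for the example.
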
At runtime, instantiated relationships (collection properties of model
objects) are represented by instances of org.xorm.RelationshipProxy.
Backing each proxy instance at runtime is an
InterfaceInvocationHandler object. A handler is connected to both the
object model (proxy) and the relational model (Row). This object is
invoked for every enhanced method called on the proxy, with one
argument being the java.lang.reflect.Method instance for the method
invoked and another being the arguments to that method. In the
simplest terms, the handler examines the Method and arguments and
determines the bean property being accessed, and whether the access is
a read or write method. The handler then reads or writes the Row
object or a RelationshipProxy, invoking the InterfaceManager if it
needs additional data read from the datastore.
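
The handler's dispatch logic can be illustrated with the JDK's own
java.lang.reflect.Proxy, used here as a stand-in for CGLIB so that the
example needs no external library. This is an assumption of the
sketch: JDK proxies only support interfaces, whereas XORM also
enhances abstract classes. The Person interface and RowBackedHandler
are hypothetical, and a plain Map plays the role of the Row.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model interface, as an application might declare it.
interface Person {
    String getName();
    void setName(String name);
}

// Simplified handler: a Map stands in for the Row, keyed by property name.
final class RowBackedHandler implements InvocationHandler {
    final Map<String, Object> row = new HashMap<>();

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        String name = method.getName();
        boolean noArgs = (args == null || args.length == 0);
        if (name.startsWith("get") && name.length() > 3 && noArgs) {
            return row.get(propertyName(name)); // read access
        }
        if (name.startsWith("set") && name.length() > 3
                && args != null && args.length == 1) {
            row.put(propertyName(name), args[0]); // write access
            return null;
        }
        // This sketch handles bean accessors only (not toString() etc.).
        throw new UnsupportedOperationException(name);
    }

    // Derives the bean property from an accessor: "getName" -> "name".
    private static String propertyName(String accessor) {
        return Character.toLowerCase(accessor.charAt(3)) + accessor.substring(4);
    }

    // Creates a proxy implementing Person that is backed by the handler.
    static Person newPerson(RowBackedHandler handler) {
        return (Person) Proxy.newProxyInstance(
            Person.class.getClassLoader(),
            new Class<?>[] { Person.class },
            handler);
    }
}
```

Calling setName("Ada") on the proxy stores the value under the key
"name" in the backing map, and getName() reads it back. A fuller
handler would also consult relationship proxies and fetch missing
data, as described above.
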
Just like the different layers of objects in XORM, the query model has
multiple layers. Queries themselves are expressed in terms of the
Java object model and properties of Java classes. The internal
representation of a query in XORM is managed by the QueryExample and
QueryCondition classes. Together they provide, in a sense, a "query
by example", where the applicable Class and the conditions for
meeting the query are defined. A QueryCondition identifies a class
property, an operator, and a value; the value itself may be another
example object.
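
The structure of such a query can be sketched with two small classes.
ExampleSketch and ConditionSketch are hypothetical stand-ins for
QueryExample and QueryCondition; their fields, the where() method, and
the string operators are assumptions for illustration and do not
reflect the real XORM API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker types standing in for application model classes.
interface EmployeeModel {}
interface DepartmentModel {}

// One condition: a property, an operator, and a value.
final class ConditionSketch {
    final String property;
    final String operator;
    final Object value; // a literal, or a nested ExampleSketch

    ConditionSketch(String property, String operator, Object value) {
        this.property = property;
        this.operator = operator;
        this.value = value;
    }
}

// The class being queried plus the conditions instances must meet.
final class ExampleSketch {
    final Class<?> type;
    final List<ConditionSketch> conditions = new ArrayList<>();

    ExampleSketch(Class<?> type) {
        this.type = type;
    }

    ExampleSketch where(String property, String operator, Object value) {
        conditions.add(new ConditionSketch(property, operator, value));
        return this;
    }

    // Builds "employees whose department is named R&D"; the value of the
    // outer condition is itself another example object.
    static ExampleSketch employeesInResearch() {
        ExampleSketch department = new ExampleSketch(DepartmentModel.class)
            .where("name", "==", "R&D");
        return new ExampleSketch(EmployeeModel.class)
            .where("department", "==", department);
    }
}
```
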
To support JDOQL, the JDO Query Language, a JavaCC-generated parser is
used. This parser validates and translates the String queries in
JDOQL, turning them into QueryExample instances with attached
QueryConditions. Other query languages could be supported using a
similar approach.
Before a query can be executed, it must be rephrased in terms of
the relational model. A corresponding set of objects exists for this
purpose. The Selector object encapsulates a relational query tree;
attached to it are any number of Condition objects. Conditions may in
turn reference additional Selectors. The BoundQuery class handles
conversion of a QueryExample object into a Selector. As part of this
process, any many-to-many mapping tables that need to be traversed are
included.
The final step of query parsing is handled by the DatastoreDriver.
The responsibility of the driver is to translate the Selector passed
in as an argument to the select() method into a native query format.
In the case of a JDBC database, this native query format is SQL. The
org.xorm.datastore.sql.SQLQuery class is tasked with translating a
Selector to SQL.
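
As a rough illustration of this last translation step, the sketch
below turns a flat list of conditions on one table into a
parameterized SQL string. SelectorSketch is a hypothetical,
much-reduced stand-in: the real Selector is a tree whose Conditions
may reference further Selectors, and the real SQLQuery class also
handles joins, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical, flattened stand-in for a Selector with simple Conditions.
final class SelectorSketch {
    final String table;
    final List<String[]> conditions = new ArrayList<>(); // {column, operator}
    final List<Object> parameters = new ArrayList<>();   // values, in order

    SelectorSketch(String table) {
        this.table = table;
    }

    SelectorSketch and(String column, String operator, Object value) {
        conditions.add(new String[] { column, operator });
        parameters.add(value);
        return this;
    }

    // Renders the native query; values are bound separately via "?".
    String toSql() {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(table);
        if (!conditions.isEmpty()) {
            StringJoiner where = new StringJoiner(" AND ", " WHERE ", "");
            for (String[] c : conditions) {
                where.add(c[0] + " " + c[1] + " ?");
            }
            sql.append(where.toString());
        }
        return sql.toString();
    }
}
```

Keeping the values in a separate parameter list, rather than splicing
them into the string, mirrors how a JDBC PreparedStatement is normally
used; whether XORM does exactly this is not stated in this document.
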
In addition, nested Selectors may be marked as outer joins. This
indicates that the condition is optional and should be traversed if
data is available. Because of this, Selectors may
return data from multiple Tables. DatastoreDrivers are encouraged but
not required to support this feature, which allows XORM to prefetch
data based on user fetch group preferences.
XORM keeps a cache of transactional data associated with each ongoing
transaction as specified by the JDO specification. That is, reading a
particular object will go to the datastore at most once during a given
transaction.
In addition, there is a second-level cache associated with each
instance of PersistenceManagerFactory. All PersistenceManagers
associated with the factory share access to the same cache. This
cache contains Row objects, not Java proxies. XORM queries the cache
before going to the datastore when it can (single-object lookups,
primarily).
The cache strategy is pluggable but defaults to a SoftReference-based
implementation, leaving the JVM responsible for managing memory.
Other strategies, such as LRU, are available, and may be more
configurable.
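
A SoftReference-based cache of the kind described can be sketched in
a few lines. SoftCacheSketch is a hypothetical class, not XORM's
implementation; it only demonstrates the idea that entries stay
available until the JVM decides to reclaim them under memory pressure.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a cache whose values are softly reachable.
final class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null if it was never cached or has
    // been reclaimed by the garbage collector.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key, ref); // drop the stale entry
        }
        return value;
    }

    void remove(K key) {
        entries.remove(key);
    }
}
```

A caller must always be prepared for get() to return null and fall
back to the datastore, since the JVM may clear a soft reference at any
time once the value is no longer strongly reachable.
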
When a second-level cache is in use, all datastore select() results go
into the cache immediately, and all inserts, updates, and deletes are
propagated at transaction commit time. There is a slight risk here of
repeatable reads bringing bad data into the cache; this will need to
be addressed in the future.
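
The timing described above, with reads cached immediately and writes
propagated only at commit, can be sketched as follows. TxCacheSketch
and its method names are hypothetical, and Strings stand in for Row
objects; this illustrates the policy and is not XORM's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-transaction view onto a shared second-level cache.
final class TxCacheSketch {
    private final Map<String, String> shared;                    // second-level cache
    private final Map<String, String> pending = new HashMap<>(); // null = delete

    TxCacheSketch(Map<String, String> shared) {
        this.shared = shared;
    }

    // Results of select() go into the shared cache right away.
    void onSelect(String key, String row) {
        shared.put(key, row);
    }

    // Inserts and updates are buffered until commit.
    void onWrite(String key, String row) {
        pending.put(key, row);
    }

    // Deletes are buffered until commit, recorded as a null value.
    void onDelete(String key) {
        pending.put(key, null);
    }

    // Propagates all buffered changes to the shared cache.
    void commit() {
        for (Map.Entry<String, String> e : pending.entrySet()) {
            if (e.getValue() == null) {
                shared.remove(e.getKey());
            } else {
                shared.put(e.getKey(), e.getValue());
            }
        }
        pending.clear();
    }

    // Discards buffered changes; the shared cache is left untouched.
    void rollback() {
        pending.clear();
    }
}
```
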
The cache mechanism does not preclude distributed caching, but this is
not currently implemented.