Tuesday, February 22, 2005

Managing NonStop SQL/MX Modules

With SQL/MX we wanted to design a more flexible architecture that was not dependent on proprietary compilers. We also wanted the ability to create an executable from multiple objects containing embedded SQL, potentially generated from source code written in different languages.

As a result, we have a separate preprocessor step that separates the host language source code from the SQL statements. The standard language compiler is then used to compile the host language source code into an object file. Multiple object files can then be bound together into a single executable. This executable is not a SQL object and is not registered in the SQL metadata. The SQL statements are put into a module definition file that is compiled by the SQL compiler, mxcmp, to generate the plan, which it stores in a module file. An alternative method for module definitions is provided in SQL/MX Release 2.0 (discussed later). The executable holds a reference to the name of this module file, which it loads at execution time. The module file does not know which executables are using it, since the object file associated with the module (usually a temporary artifact) can be linked independently into many executables.
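The build flow above can be sketched as a command sequence. This is only an illustration: mxcmp is named in the text, but the preprocessor invocation (shown here as mxsqlc, the C/C++ preprocessor), the compiler, and the generated file names vary by language and platform, so check the manuals for your environment.

```sh
# 1. Preprocess: split the embedded-SQL source into pure host-language
#    source plus a module definition file (file names are illustrative)
mxsqlc myapp.sql

# 2. Compile the generated source with the standard language compiler;
#    the result is an ordinary object file
c89 -c myapp.c

# 3. Link one or more such objects into an executable; the executable
#    is NOT a SQL object and is not registered in the SQL metadata
c89 -o myapp myapp.o

# 4. SQL-compile the module definition file; mxcmp writes the plan into
#    a separate module file that the executable loads at run time
mxcmp myapp.m
```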


By contrast, SQL/MP uses an architecture that stores query plans in the executable program file. SQL/MP uses a proprietary compiler that creates an object file with the embedded SQL statements separated out within it; the preprocessor is, in effect, built into the compiler. SQLCOMP, the SQL compiler, then uses these statements within the object to generate an executable containing the SQL plans. This executable is registered in the SQL/MP catalog and marked as a SQL object, and it has to be dropped like any other SQL object.

So let's see what some of the differences are between SQL/MP and SQL/MX before we discuss how to manage some of the complexity that having separate modules has introduced.

What the SQL/MP model provides:
1. A simple compilation model
2. A single self-contained executable to move to the production environment
3. Registration of a SQL executable in the catalog, which facilitates "where used" reports for executables using certain SQL objects
4. Invalidation of plans when DDL operations are performed

What the SQL/MX model provides:
1. Independence from the language compiler being used
2. Standard object files not requiring proprietary handling by other tools, such as NOFT, NLD
3. The ability to link multiple objects with embedded SQL into a single executable
4. The ability to develop these multiple objects in different languages
5. The capability to create reusable embedded SQL component libraries.

Before discussing the issues with managing modules and some potential solutions, a bit of background will help. The module file is designed to have an ANSI three-part name composed of the catalog, schema, and module name, since it is considered a SQL object. The catalog and schema for modules are currently completely independent of the catalog and schema associated with the SQL objects the module uses; this schema may not even exist.

The module name can be fully or partially qualified. If a catalog and schema are not specified, the preprocessor assumes default catalog and schema names. When preprocessing on the NSK under OSS, the defaults are the group and user name, respectively. When preprocessing is done on a workstation, say using ETK, the default is SQLMX_DEFAULT_CATALOG_.SQLMX_DEFAULT_SCHEMA_. The preprocessor run-time options -g moduleCatalog and -g moduleSchema can be used to provide specific catalog and schema names. The third part of the name is the name specified in the MODULE statement in the source program. The MODULE statement can specify the catalog and schema name as well; these override the preprocessor options if specified. If a MODULE statement is not specified, the preprocessor generates a module name based on a timestamp, e.g., SQLMX_DEFAULT_MODULE_211954938763281467.
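As a rough sketch of the precedence rules above, the three-part name resolves roughly like this (the defaults are the ones named in the text; the timestamp-style fallback and the helper function itself are illustrative, not actual SQL/MX code):

```sh
# Resolve a three-part module name: explicit values win over defaults.
module_name() {
  cat="${1:-SQLMX_DEFAULT_CATALOG_}"              # -g moduleCatalog or default
  sch="${2:-SQLMX_DEFAULT_SCHEMA_}"               # -g moduleSchema or default
  mod="${3:-SQLMX_DEFAULT_MODULE_$(date +%s)}"    # timestamp-style fallback
  echo "${cat}.${sch}.${mod}"
}

module_name MYCAT MYSCH INVAPP   # -> MYCAT.MYSCH.INVAPP
module_name "" "" INVAPP         # -> SQLMX_DEFAULT_CATALOG_.SQLMX_DEFAULT_SCHEMA_.INVAPP
```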

Issue: Grouping, Versioning, and Re-targeting Modules

What if I want to create a new version of a program without overlaying the old version? What if I want to use the same program but compile it to run against a different set of tables? What if I just wanted to group my modules in some way?

To address these issues, the first set of module management features was introduced in Release 1.8, with preprocessor options to add suffixes or prefixes to the module name. The options are -g moduleVersion, -g moduleTableSet, and -g moduleGroup, which result in a module name of the form group^module-name^tableset^version.

This allows you to have a new version of the same program by using a suffix, such as v2, so that a different module is generated instead of overlaying the older one, which can still be used. Similarly, you can create a new module and executable for the same program that runs against a different set of tables by using a table-set suffix. Finally, you can provide a prefix so that, within the catalog and schema you have used, the modules will be grouped together for easier management.
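For instance, composing the name parts (the caret-separated layout is the one implied above; the group, table-set, and version values here are hypothetical examples):

```sh
# Compose a module name from a group prefix, the base module name,
# a table-set suffix, and a version suffix.
compose_module_name() {
  echo "${1}^${2}^${3}^${4}"
}

compose_module_name TEAMA INVAPP QATABLES V2   # -> TEAMA^INVAPP^QATABLES^V2
```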

Issue: Doing All of the Above Without Preprocessing

What if I have multiple environments on the system for the same application, such as development and QA? The above scheme would work but it would require preprocessing all of the programs with potentially a different prefix or suffix for the module names. What if instead I want to just replicate my application from development to QA and not have development updates of the modules affect my QA work?

To address this issue, in SPR AAV we released the capability, during a SQL compilation, to place modules globally or locally. The syntax at SQL compile time is -g moduleGlobal or -g moduleLocal [=OSS-directory]. See Support Note S03140, Multienvironment User Modules in NonStop SQL/MX, or the SQL/MX Release 2.0 manuals for details. In this way, all development modules can reside in one local directory. These modules need to be collocated with the executables that use them; the option to specify an OSS directory is provided to facilitate placing the modules in the same directory the executable will be in. They can then be duplicated over to a QA local directory without any preprocessing. However, to point this new set of modules at a different set of target tables, unless you want to use late binding to achieve that, you would have to SQL compile the modules now located in the QA local directory.
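At the file level, the replicate-without-preprocessing idea is just a copy of the collocated files. The directory and file names below are hypothetical, and mxcmp itself is not invoked; this only mimics the movement of an executable and its local module between environments:

```sh
# Local modules live alongside the executables that use them, so
# replicating an environment is a straight copy of the directory contents.
mkdir -p /tmp/dev/bin /tmp/qa/bin
touch /tmp/dev/bin/invapp                 # the executable
touch /tmp/dev/bin/MYCAT.MYSCH.INVAPP     # its collocated local module
cp /tmp/dev/bin/invapp /tmp/dev/bin/MYCAT.MYSCH.INVAPP /tmp/qa/bin/
ls /tmp/qa/bin                            # both files are now in QA
```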

The above paragraph used development and QA just as examples of two environments that you might maintain. Of course, these multiple environments could be of your own choosing, with potentially no overlap of modules between them. In some sense, global and local module support supersedes the Version, Table Set, and Group support provided earlier. However, you could potentially use both capabilities together; that is, you could have globally or locally placed modules with versions, groups, or table sets maintained in the same directory.

Issue: Targeting a Different Catalog and Schema

So I can have multiple environments, such as development and QA on the same machine. However, if I specified a catalog and schema in my source file how can I target a different catalog and schema for my SQL objects for each environment without changing my source files or re-preprocessing?

SQL compile time options have been available since R1.8, using Control Query Default parameters (the -d option), to specify a different catalog and schema name at SQL compile time. These options are different from the module catalog and schema options available during preprocessing. They work only if the SQL objects referenced in your program are not fully qualified: if a SQL object is fully qualified, these compile time options have no effect and do not remap the catalog and schema for that object. Declaring a catalog and schema within your program fully qualifies a SQL object and is equivalent to using its three-part ANSI name. With SQL/MP tables this is relevant only if MP aliases are being used; if DEFINEs are being used, the question is irrelevant, since the target objects will have to be specified as run-time DEFINEs.
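A sketch of such a recompilation as a command: the -d option is the one named above, and CATALOG and SCHEMA are the usual default attribute names, but the exact spellings and the module file name here are assumptions to verify against the manuals.

```sh
# Recompile the module for the QA environment, remapping all
# unqualified object references to the QA catalog and schema
mxcmp -d CATALOG=QACAT -d SCHEMA=QASCH invapp.m
```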

Issue: Securing Modules

How can I prevent someone else from deleting or updating my module files?

Besides the fact that you may not want someone to clobber your module files, these files need to be secured since they have privileged access. The module directory can be secured with the sticky bit using OSS commands such as chmod, chown, and umask. With this facility, any non-privileged user can create module files in the directory, but once a module is created, only that user (besides SUPER) can update or delete the module file. See Support Note S04015, Securing SQL/MX User Modules, or the SQL/MX Release 2.0 manuals for details.
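The sticky-bit mechanism itself is standard POSIX behavior and can be tried on any scratch directory (the directory name below is just an example, not the real module directory):

```sh
# Mode 1777: world-writable, but with the sticky bit set only a file's
# owner (or the super user) can later remove or replace that file.
mkdir -p /tmp/modules_demo
chmod 1777 /tmp/modules_demo
ls -ld /tmp/modules_demo    # permission string ends in 't'
```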

Issue: Managing Multiple Module Definition Files

I link multiple objects into a single executable and therefore have multiple module definition files associated with a single executable. If I have to move my application into production I have to move the executable and all its related module definition files into production and then recompile them individually. I liked the MP way!

With SQL/MX Release 2.0 you can now get functionality similar to SQL/MP's. The preprocessor annotates the output source file so that the SQL statements that would have been in the module definition files now reside in the preprocessor's output file. When this file is compiled with the language processor, an object file is created. A utility called mxCompileUserModule now replaces mxcmp (it actually invokes mxcmp on your behalf). This utility can be run against the object file or the executable itself. If an object file will be bound into multiple executables, it may be best to run the utility against that object file; otherwise, it is best, especially when the executable is moved to production, to run the utility against the executable. You do need to be careful, when using the local directory capability, that all the required modules are collocated with the executables that need them. This is still different from SQL/MP in the sense that SQL/MP puts the compiled plan into the executable, whereas SQL/MX still creates separate module files.
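The two ways of invoking the utility described above look roughly like this (file names are hypothetical; see the Release 2.0 manuals for the full option list):

```sh
# Against a reusable object file that will be linked into several
# executables:
mxCompileUserModule invlib.o

# Or against the finished executable, e.g. after moving it to production:
mxCompileUserModule invapp
```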

Issue: Getting Module Usage Information

If I will be performing a DDL or Utility operation on a table how would I know which modules will be impacted by such an operation?

DISPLAY USE OF is available for modules starting with SPR ABC, so that you can find out what objects are being used by a module or what modules are using a specific object. For the latter case, this utility goes through all the module files to determine if the object is being accessed. Considering how infrequently this information is needed, the performance of this utility should be acceptable.

What Is in the Future for Module Management?

We consider that the issues discussed above have been satisfactorily addressed. If a customer is not satisfied with the current solutions, we would need a Request for Enhancement to work on those issues further.

One issue that has not been addressed is module invalidation, which was available with SQL/MP. It is not clear how important this is for our customers, since most customers do not want an application down because an invalidated program fails to execute in production. However, if the requirement is to understand which modules will require recompilation in order to execute after a DDL or Utility operation, we understand that requirement. The challenging part of addressing it is that the development environment is not tight and controlled: modules do not really know which executables they ended up in, users can delete modules or executables, and nothing guarantees the consistency and currency of the objects in the development environment. So a solution that tracks the modules that would be "invalidated" by a DDL operation is handicapped by the fact that the modules that were compiled may not exist or may not even be used with an executable. Certainly SQL is not aware of the executables and their movement or deletion. Resolving this issue therefore needs a qualification of the requirement and a fair amount of deliberation. If you use consistent naming conventions for source code, modules, and executables, it becomes somewhat easier to determine the relationships between your application objects. It may also make it easier to identify obsolete objects for deletion.

With the availability of SQL/MX native tables, we would like users to be able to map a three-part SQL object name to a different name at run-time. While most customers like to pre-compile their programs against the target SQL objects they will run against, many rely on late name binding to execute against a different set of SQL objects at run-time. DEFINEs can be used for SQL/MP tables to map a table to a different one at run-time. With SQL/MX tables this mapping can be achieved using PROTOTYPE support, which uses host variables populated with the correct object names at run-time. This requires knowing the names of the run-time objects at compile time, and it also requires logic in the application to choose the correct run-time object names if there are multiple environments the application may run in. The SQL syntax for PROTOTYPE is an extension of the ANSI standard syntax. The preferred mechanism would be similar to DEFINEs, simply remapping the SQL objects, or even just the catalog and schema, at run-time. We are looking at providing this functionality sometime in the future.
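As a hedged illustration of the PROTOTYPE mechanism in embedded SQL, the sketch below shows the general shape: a host variable carries the run-time table name, while the PROTOTYPE literal names a compile-time table of the same shape for plan generation. The exact clause syntax and every identifier here are assumptions to be checked against the SQL/MX programming manuals.

```c
/* All names here are hypothetical. */
EXEC SQL BEGIN DECLARE SECTION;
char tblname[129];   /* run-time table name, filled in by the application */
long qty;
EXEC SQL END DECLARE SECTION;

/* Application logic picks the environment's table at run time */
strcpy(tblname, "qacat.qasch.orders");

/* The PROTOTYPE literal names a compile-time table with the same
   definition, so a plan can be generated before the real name is known */
EXEC SQL SELECT quantity INTO :qty
         FROM :tblname PROTOTYPE 'devcat.devsch.orders'
         WHERE order_id = 1;
```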

-Rohit Jain
Database Architect and Evangelist
HP