Managing Teradata changes with RTC SCM and Jazz Build

Managing database schema changes can be a non-trivial task. The risk of a bad change reaching a production database often makes DBAs very protective of their territory and resistant to changing tried-and-true methods.

However, the push towards more Agile development methods and the adoption of DevOps practices and Continuous Delivery capabilities now embraces database development too; the last 10 years or so have seen a lot of work on aligning database development with application development best practices. That said, there are probably quite a few organisations out there where the normal practice for deploying database changes is some sort of “list of instructions” and various GUI DB tools, with maybe some scripts thrown in. Even adopting practices such as SCM and CI can appear daunting.

In this post I look at how easy it is to actually adopt SCM and CI for Teradata database development with RTC. There is no rocket science here nor a silver bullet; just some useful notes on software engineering practice in the context of database change management.

1. Setting up

For the database I use:

  • Teradata Express: I use the “Teradata Express for VMware Player” VM (after converting it and importing it into VirtualBox), which has a free, fully-functional Teradata database ready to go.
  • Teradata Plug-in for Eclipse: This makes it easy to connect to and manipulate a Teradata database from within Eclipse.  I’m lazy so this makes it a breeze to generate DDLs from existing sample DBs in the Teradata Express VM to use for testing. An added benefit is the SQL Ant Wizard, which makes it simple to integrate Teradata SQL execution with Ant and the Jazz Build Engine.

On the RTC front I have:

  • The RTC Eclipse Client installed into the same Eclipse instance that the Teradata Plug-in is installed into. This means I do all I need to (Teradata tasks, SQL development, RTC SCM, Jazz Build) in the one interface, switching views as needed.
  • A CCM Project Area created
  • The latest Ant build so I can use some of the newer features
  • Ant-contrib for some of the additional Ant tasks

2. RTC Project Area Setup

I use two components: “config” to hold build-related artifacts and “teradata” to hold Teradata artifacts (mostly SQL). Right-sizing RTC components is important, and though for this exercise I use a single, monolithic component for all Teradata artifacts, a more realistic approach may be to use a component per database and one (or more) for common artifacts. See my previous post “Managing Android platform source with RTC” for some additional guidelines on components, even if in a different context.

A possible stream strategy is shown below:

[Figure: Teradata stream strategy]

The two “xxDevxx” streams at the bottom of the hierarchy are used for day-to-day development of SQL; changesets and baselines flow to the “Integration” Stream, which in turn flows to a “SystemTest” Stream. This loosely follows the strategy outlined in “Rational Team Concert source control makes continuous delivery easier”, and additional streams such as “UAT” and “Pre-Prod” can be added as required. The general idea is to have each stream capture the versions of artifacts (SQL scripts in this case) that are used to deploy into database instances in corresponding environments.

With the above strategy a typical change would flow as follows:

1. Marco creates changesets in his repository workspace, which flows into the “00DevTeam1” Stream.

2. Marco requests a Personal Build that would deploy and test his changes against his/his team’s development database instance.

3. When happy with his changes Marco delivers them to the 00DevTeam1 Stream, where they are picked up by a scheduled build and deployed and tested on his team’s development database instance.

4. If the deployment and tests succeed (i.e. the build is Green) the changes are delivered to the “01Integration” Stream.

5. The changes show up as Incoming changes to the 00DevTeam2 Stream and, when accepted, as Incoming changes to any Workspaces flowing with the 00DevTeam2 Stream.

6. The changes are further promoted (manually or automatically) to 02SystemTest as needed.

Here is a sequence of Pending Changes view screenshots that shows this:

When using the “Post-Build Deliver” feature a snapshot capturing the changes is automatically created and delivered by the build. The following image shows the “divergence” in the streams when a change (20140423_1955) has only been promoted up to 01Integration from 00DevTeam1 and hasn’t yet been promoted to 02SystemTest or accepted by 00DevTeam2.

[Screenshot: Pending Changes view showing the streams diverging]

Comparing the different baselines will show the new change, and individual artifacts can be annotated to show exactly why, when, what and by whom a change was made:

[Screenshots: stream compare and annotate views]

Here is the same image after the change has been promoted to 02SystemTest and accepted by 00DevTeam2.

[Screenshot: streams back in sync]

3. Making Jazz Build and Ant do some of the heavy lifting

So much for the SCM piece. Careful use of the Jazz Build capability for automation can eliminate much of the drudgery associated with repetitive manual tasks. In this example I use the Jazz Build Engine with Ant to push changes to Teradata. There are many, many ways in which Jazz Build/Ant can be used, and I will explore three variants here. The common thread in each of the three variants is that any object names that are expected to change across invocations of the code are parametrised as Ant properties. The runtime names to be used for these objects are provided in a properties file which Ant will expand when running the build.
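As a concrete illustration (the token and property names here are hypothetical, not taken from the actual build), a tokenised table DDL and the matching property file might look like this:

-- employee.tbl: tokenised DDL held in SCM
CREATE TABLE @DB_NAME@.employee@TABLE_SUFFIX@ (
    emp_id   INTEGER NOT NULL,
    emp_name VARCHAR(100)
) PRIMARY INDEX (emp_id);

# tokens.properties: the ${tokenfile} handed to Ant; the replace task
# substitutes each key with its value wherever it appears in the SQL
@DB_NAME@=DEV_HR
@TABLE_SUFFIX@=_sprint12

Deploying the same scripts against a different environment is then just a matter of supplying a different property file.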

Build Variant 1: Execute SQL script via JDBC

I call this “lazy” as this variant simply uses the Teradata SQL Ant Wizard to generate everything, including the Ant build script and the properties file. The “SQL Ant Wizard” article on Teradata Developer Exchange goes through how it is used in detail. On the Jazz build side all I need to do is create a JBE/Ant Build Definition which invokes the generated build script. The SCM and Ant property sheets are shown below:

[Screenshots: Build Definition SCM and Ant property sheets]
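Under the covers, the wizard-generated script amounts to running the SQL over JDBC. A minimal hand-rolled sketch of the same idea (the connection URL, credentials, paths and jar locations below are illustrative assumptions, not what the wizard actually emits) would be:

<target name="run_sql">
  <!-- Execute the token-expanded SQL over JDBC; terajdbc4.jar must be on the
       classpath (older driver versions also need tdgssconfig.jar) -->
  <sql driver="com.teradata.jdbc.TeraDriver"
       url="jdbc:teradata://tdexpress/DATABASE=DEV_HR"
       userid="${TD_user}" password="${TD_password}"
       src="${sqlrundir}/employee.tbl"
       print="true" output="${team.scm.fetchDestination}/sqlout.txt"
       onerror="abort"
       classpath="${basedir}/lib/terajdbc4.jar" />
</target>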

I enhance the build script with a couple of Build Toolkit tasks to a) add some progress monitoring (startBuildActivity) and b) publish the output file from the SQL execution (logPublisher). The results of requesting the build are shown below.

[Screenshot: build result with the activity and published log]
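For reference, here is a sketch of what those two toolkit invocations can look like (the taskdef classnames and wiring reflect typical Jazz Build Toolkit usage; the labels and file paths are my own assumptions):

<!-- Make the Build Toolkit tasks available; the toolkit jars must be on Ant's classpath -->
<taskdef name="startBuildActivity"
         classname="com.ibm.team.build.ant.task.StartBuildActivityTask" />
<taskdef name="logPublisher"
         classname="com.ibm.team.build.ant.task.LogPublisherTask" />

<!-- a) progress monitoring: show a "Running SQL" activity on the build result -->
<startBuildActivity buildResultUUID="${buildResultUUID}"
    repositoryAddress="${repositoryAddress}" userId="${userId}"
    passwordFile="${passwordFile}" label="Running SQL" autoComplete="true" />

<!-- b) publish the SQL output file to the build result -->
<logPublisher buildResultUUID="${buildResultUUID}"
    repositoryAddress="${repositoryAddress}" userId="${userId}"
    passwordFile="${passwordFile}"
    filePath="${team.scm.fetchDestination}/sqlout.txt" label="SQL output" />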

Build Variant 2: Execute all SQL in a directory via BTEQ

This variant uses a short Ant script to collate all the SQL scripts in a specified directory into a single BTEQ script for execution.

The code snippets from the Ant script follow:

<target name="TD_transform">
 <!-- replace tokens taking values from ${tokenfile} -->  
 <replace dir="${sqlrundir}" includes="**/*.tbl,**/*.vw,**/*.sql"
            replacefilterfile="${tokenfile}" />
</target>
<target name="TD_createrunfile">
<!--Generate .RUN for each SQL file-->
  <ac:for param="sqlfile" trim="true">
    <path>
      <fileset dir="${sqlrundir}" includes="**/*.tbl,**/*.vw,**/*.sql" /> </path>
  <sequential>
    <echo file="${sqlrunfile}" append="true"> 
      .RUN file='@{sqlfile}';
      .IF ERRORLEVEL != 0 THEN .EXIT ERRORCODE; 
    </echo>
  </sequential>
</ac:for>
<!--replace credential tokens in logon file with real values-->
<replace file="${TD_logon}" replacefilterfile="${TD_logontokenfile}" />
<!--Concat header, .RUNs and footer into single BTEQ script -->
<concat destfile="${sqlrunfile}.run">
 <fileset file="${headerfile}" />
 <fileset file="${sqlrunfile}" />
 <fileset file="${footerfile}" />
 <filterchain>
</concat>
</target>

The ${headerfile} has the following content:

.RUN file='.\Teradata\config\logon.txt';

.SET MAXERROR 4 
.SET ERRORLEVEL 3807 SEVERITY 2 

The credentials for the Teradata instance are in a tokenised file, logon.txt, the values for which are expanded using a user-supplied property file (${TD_logontokenfile}).
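For illustration, the tokenised logon.txt and its token file might look like this (the token names are hypothetical; BTEQ's .LOGON takes tdpid/username,password):

.LOGON @TD_HOST@/@TD_USER@,@TD_PASSWORD@

# ${TD_logontokenfile}: expanded by the replace task in TD_createrunfile
@TD_HOST@=tdexpress
@TD_USER@=dbc
@TD_PASSWORD@=dbc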

The ${footerfile} has the following content:

.LOGOFF  
.EXIT 0  
   
.LABEL ERR 
.LOGOFF 
.EXIT 1
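The final execution step isn't shown above; a minimal way to run the collated script from the same Ant build (assuming the bteq client is installed on the build engine and on its PATH) is the standard <exec> task, feeding the script to BTEQ's standard input:

<!-- Run the collated BTEQ script; a non-zero exit code from BTEQ
     (e.g. via .EXIT 1 in the footer) fails the build -->
<exec executable="bteq" input="${sqlrunfile}.run"
      output="${team.scm.fetchDestination}/bteq.log"
      failonerror="true" />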

Build Variant 3: Execute incremental changes only via BTEQ

This is similar to Variant 2 except that ONLY those SQL scripts changed and accepted into the build workspace since the last snapshot are executed. The generateChangeLog task is used to find the list of changed SQL files, and the Ant target code snippet looks like this:

<!-- Generate a list of changed files into buildChanges.txt -->
<generateChangeLog currentBuildResultUUID="${buildResultUUID}"
    repositoryAddress="${repositoryAddress}" userId="${userId}"
    passwordFile="${passwordFile}" showFileChanges="true" verbose="true"
    changeLogFile="${team.scm.fetchDestination}\buildChanges.txt" />

<!-- Keep only .tbl, .vw and .sql entries; trim spaces, sort and prefix each
     line with "**". generateChangeLog produces a listing whose paths start
     with the RTC component name, so the "**" prefix is needed to turn each
     line into a pattern the replace and copy tasks below can use. -->
<copy tofile="${changelogfile}" file="${team.scm.fetchDestination}\buildChanges.txt">
  <filterchain>
    <sortfilter />
    <tokenfilter>
      <trim />
    </tokenfilter>
    <linecontainsregexp>
      <regexp pattern="\.(tbl|vw|sql)${line.separator}" />
    </linecontainsregexp>
    <prefixlines prefix="**" />
  </filterchain>
</copy>
 
<!-- Replace tokens in each file in the list -->
<replace dir="${team.scm.fetchDestination}" includesfile="${changelogfile}"
         replacefilterfile="${tokenfile}" />

<!-- Copy the transformed files into a specified directory -->
<copy todir="${sqlrundir}">
  <fileset dir="${team.scm.fetchDestination}" includesfile="${changelogfile}" />
</copy>

The last two variants can be invoked by separate build definitions, or by a single build definition with different Ant targets specified when requesting the build.
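For example, a single build file can expose both variants as separate entry points, with the target chosen in the Build Definition's Ant targets field or overridden when requesting the build (the target names and the TD_run/TD_changelist helpers here are hypothetical):

<!-- Variant 2: transform and run everything in ${sqlrundir} -->
<target name="deploy_all" depends="TD_transform,TD_createrunfile,TD_run" />

<!-- Variant 3: build the change list first, then run only what changed -->
<target name="deploy_changed"
        depends="TD_changelist,TD_transform,TD_createrunfile,TD_run" />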

Whichever variant is used, changes are pushed to a target stream using the post-build deliver option.

[Screenshot: Post-build Deliver configuration]

Closing thoughts

Hopefully this post has shown that adopting SCM and CI practices with RTC for database changes isn’t scary at all. The simplest component/stream strategy that makes sense, plus very little Ant, is all it takes. With RTC, adding pieces to this puzzle is very simple:

1. Want to link changesets to Defects, Tasks, etc.?

Work Items (the “why” for a change) are a built-in capability just waiting to be used. A simple right-click and Associate is all it takes, creating Work Items on the fly if needed.

2. Want to track and manage progress and plan iterations/sprints?

Planning capabilities are also built-in. Just create Categories for Work Items to be Filed Against and iterations/sprints they can be Planned For, start changing Work Item States and Estimates and the relevant plans are updated in real time.

3. Not ready to allow concurrent changes to SQL? Use RTC’s pessimistic locking to make changes sequential.

4. Don’t fully trust automated builds/deployments? Use any combination of RTC’s built-in role-based controls to tighten the process up. Or roll your own process extensions.

In short, the RTC parts of such a solution are relatively easy. What is usually harder is coming up with good database schema change practices for things like bug fixes during parallel development, and for roll-back strategies.

 
