Wednesday, April 30, 2014

ODI Series - Digging into ODI repository

I believe every ODI developer has, at some point, had to go through a long list of ODI objects to identify their usage in interfaces or packages, the LKMs/IKMs used within interfaces, variables and their usage, unwanted objects to get rid of, and so on. I have been in that situation and wanted to keep track of such objects. I was able to beg, borrow, steal or develop SQL queries against the ODI work repository, and the following list of queries can be used to put together a list of objects and their associations.

Query 1: Variables Used in mappings


Query 2: Table Names used as Source schema & Datastore Tabs


Query 3: Interface Information


Query 4: Interfaces not used in Packages



--- Updated May 12th, 2014 ---

Query 5: Procedures used in Packages

Query 6: Procedures not used in Packages

Query 7: Scheduling Information

This query fetches all scheduling information except repetition information.



SQL Queries can be downloaded here.

Hope this helps !!!

Thursday, March 27, 2014

ODI Series - Pulling file names and timestamp from directory

Recently, I was asked a question by my colleague, Linga (he always comes up with something or the other) - Can we pull file names from a directory and process the latest one, while keeping a history of processed files?

I recalled that another colleague, Parasuram PJP, had done something similar. When I tried out his code, it did not work for me for some reason. Anyway, he was kind enough to share some pearls of wisdom, and I sat down to look up how to fetch file names and additional information with Jython. To find the latest file of all, we need to capture the timestamp of its last modification.

Since we need to keep a log of all files that get processed, we will maintain an RDBMS table (SQL Server in my case) - FILENAMES with the following structure


Along with this comes a twist: the file names share the same initial part and differ only in their suffix. To simulate the scenario, I used some files on my system that I had created while exporting Essbase data. The directory looks something like this. We have multiple files with names starting with 'DtDemo'. The task is to load these file names into the SQL table and then find the latest file to process.


This can be achieved by writing a few lines of Jython code, wrapping them in a procedure and then putting the procedure in a package.
We start by creating a new procedure that uses the Jython technology and add a step 'PullFile' to it



Define an option for the procedure to store the source directory name - SOURCE_DIR



'Command on Source' uses the RDBMS schema where the FILENAMES table exists, and 'Command on Target' uses Jython with code to pull the file names and their last modification timestamps. This information is pushed to the SQL table.




The SQL table now has all files with names starting with 'DtDemo'. I played around with 'DtDemoBasic.txt', saving it at different times to get different timestamps for this file. Every time the procedure is executed, it loads the names of the files with their last modified timestamps. The name of the latest file can be pulled by doing a MAX on the timestamp.
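
The idea can be sketched in plain Python (not the original Jython step, which is only available in the download). The function names and the demo file names below are made up for illustration; only the 'DtDemo' prefix and the "MAX on timestamp" logic come from the post. The sketch lists the matching files with their last-modified time and then picks the newest one.

```python
import os
import tempfile


def list_files_with_mtime(source_dir, prefix):
    """Return (file_name, last_modified_epoch) for files whose names start with prefix."""
    rows = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if name.startswith(prefix) and os.path.isfile(path):
            rows.append((name, os.path.getmtime(path)))
    return rows


def latest_file(rows):
    """Equivalent of MAX on the timestamp: the row with the newest modification time."""
    if not rows:
        return None
    return max(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical demo directory standing in for the SOURCE_DIR option.
    demo_dir = tempfile.mkdtemp()
    for offset, name in enumerate(["DtDemoBasic.txt", "DtDemoOther.txt"]):
        path = os.path.join(demo_dir, name)
        open(path, "w").close()
        # Force distinct, known modification times for the demo.
        os.utime(path, (1000 + offset, 1000 + offset))
    rows = list_files_with_mtime(demo_dir, "DtDemo")
    print("Latest file: %s" % latest_file(rows)[0])
```

In the actual procedure, each row would be inserted into FILENAMES instead of being kept in a list.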


As if that were not enough, Linga added another requirement to optimize his integration - process the file only if it has been updated, else ignore it

All I could think of was to compare the timestamps of the latest two files and then decide whether to process the file or not. Let's create a procedure which executes an update command to set PRCSD_FLAG to 'Y' for the latest file. This procedure should be executed only when the timestamp of the latest file is different from that of the second latest file. We do this check in an ODI variable and then call the procedure based on the variable's value


The variable is refreshed on the RDBMS schema that holds the FILENAMES table. Its query compares the timestamp of the first file with that of the second file when the files are arranged in descending order of timestamp


The complete flow can be arranged in a package with following steps:

  • Pull all files from directory in a table
  • Refresh variable with SQL to compare timestamps of top two files
  • Evaluate the variable to check if value is '0' (zero)
  • Execute procedure to update flag if variable value is zero
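
The package flow above can be simulated end to end with a small Python sketch using an in-memory SQLite table. This is not the downloadable code: the column names FILE_NAME and LAST_MODIFIED, the function names, and the exact encoding of the variable (0 when the two newest timestamps differ) are assumptions; only the FILENAMES table, the PRCSD_FLAG column and the "run the update when the value is zero" rule come from the post. SQL Server would use TOP 2 where SQLite uses LIMIT 2.

```python
import sqlite3


def new_repository():
    """Create an in-memory stand-in for the FILENAMES table (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE FILENAMES (FILE_NAME TEXT, LAST_MODIFIED INTEGER, PRCSD_FLAG TEXT)"
    )
    return conn


def load_run(conn, rows):
    """Step 1: append every (file_name, last_modified) pair found in the directory."""
    conn.executemany(
        "INSERT INTO FILENAMES (FILE_NAME, LAST_MODIFIED, PRCSD_FLAG) VALUES (?, ?, 'N')",
        rows,
    )


def refresh_variable(conn):
    """Step 2: return 0 when the two newest timestamps differ, 1 otherwise (assumed encoding)."""
    top_two = [
        row[0]
        for row in conn.execute(
            "SELECT LAST_MODIFIED FROM FILENAMES ORDER BY LAST_MODIFIED DESC LIMIT 2"
        )
    ]
    if len(top_two) == 2 and top_two[0] == top_two[1]:
        return 1
    return 0


def flag_latest(conn):
    """Step 4: set PRCSD_FLAG to 'Y' on the row(s) carrying the newest timestamp."""
    conn.execute(
        "UPDATE FILENAMES SET PRCSD_FLAG = 'Y' "
        "WHERE LAST_MODIFIED = (SELECT MAX(LAST_MODIFIED) FROM FILENAMES)"
    )


def run_package(conn, rows):
    """Run the four steps once; return True when the latest file was flagged."""
    load_run(conn, rows)
    if refresh_variable(conn) == 0:  # Step 3: evaluate the variable
        flag_latest(conn)
        return True
    return False
```

Running the package twice against an unchanged directory loads the same newest timestamp twice, so the second run skips the update; saving the file again produces a newer timestamp and the update runs.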

Jython code and SQL queries can be downloaded from here.

Hope this helps !!!

Sunday, March 9, 2014

Automation of SmartView reports

The most commonly used reporting tool in a Hyperion setup is Financial Reporting, aka HFR. HFR can design reports against various data sources - Essbase, Planning, SAPBW, MSOLAP, HFM. Bursting static PDF reports with the built-in scheduler or the command line utility has been quite popular for Essbase, Planning and HFM. Along with HFR, SmartView is another tool used for reporting.

SmartView provides options to download HFR reports into spreadsheets and carry out analysis, as most analysts love to do. One common requirement that HFR does not meet out of the box is the ability to burst reports as spreadsheets. This post looks at an alternative approach to meet such requirements.

SmartView has lately been enhanced with a lot of new capabilities - retaining Excel-based formatting, butterfly reports, multi-grid reports, etc. - which, combined with its macro functions, help analysts design, automate and burst spreadsheet-based reports easily. Rather than covering the new features, we will focus on its automation capability and take a small example to demonstrate it.

Let's consider a SmartView template for the sample application Demo-Basic. The template pulls data for all accounts by period for all products and all markets. The formatting includes a few conditions and related icons to give the template some cosmetic look and feel. We can make sure conditional formatting is retained by enabling Excel formatting in the SmartView options. To automate refreshing such templates, SmartView provides functions which can be embedded within Excel macros.



Navigate to the Developer tab in Excel and click Visual Basic to open the VBA editor.


The left pane of the editor shows the project, which consists of sheets and modules. Sheets refer to the tabs in the workbook, and modules store code. Modules are files with the extension .bas, usually a function library or custom code.



Before using the SmartView functions, we need to import the library of SmartView functions under Modules. Right click on Modules -> Import and navigate to smartview.bas. This file is located under the SmartView\Bin folder, usually C:\Oracle\Middleware\SmartView\Bin. It gets imported as SmartViewVBA.

We need to add a new module where we will use these SmartView functions as Excel macros. Right click on Modules -> Insert -> Module and name it anything. We named it Module5.


The code to retrieve the sheet is shown below
  • The code first checks whether a connection named "MyConn" exists. We can name it anything, more appropriately application-dbname
  • In case the connection does not exist, it creates one
  • It logs in to Essbase using the parameters passed
  • It retrieves the sheet and logs out. We can add multiple sheets to retrieve


In order to schedule it, we create a VBScript file which calls the subroutine "RetrieveSheets" and refreshes the sheet(s).



We can also add some VBScript code to email the spreadsheet to users (code not included here). You can download these files here.

Attachments:

Run_SmartView.vbs.txt: Remove the .txt extension (I added it to avoid the file being blocked by the server).
This is a VBScript file which opens the Excel spreadsheet and executes the macro embedded in it. You need to update the path of the Excel spreadsheet, username, password, appname and dbname in this file. The username, password, appname and dbname are passed on to the Excel macro when it is called. Double-clicking this file or calling it from cmd will run it. Right click -> Edit to edit the file

SmartView_Retrieve.xlsm: Macro enabled spreadsheet. Please make sure you have macros enabled to run and view code.

Hope this helps !!!

Sunday, February 23, 2014

Migrating Planning 11.1.1.3 artifacts to 11.1.2.3 without upgrading to 11.1.1.4

Disclaimer: The steps mentioned in this post are a workaround and are in no way intended to replace the standard migration steps published in the Oracle documentation. Please take the necessary backups before following these steps. These steps may help to import artifacts when we do not have a configured 11.1.1 environment to follow the upgrade path.

I wanted to set up a Hyperion Planning 11.1.2.3 environment to try out the new features in this release, and thought of creating an application from the extract files of one of my apps, which I had used long ago for training and proofs of concept. I created a new Planning app shell similar to the app I was about to import.

Since 11.1.2.3 LCM offers the ability to upload an LCM extract from the Shared Services console into the File System directory (import_export by default), I followed these steps to upload the LCM extract.



When I tried to open the app under the File System, it did not display any of the artifacts in the detail pane. Initially, I thought it could be a system error and restarted the services once, but that didn't help. After digging into the documentation and online forums, and with help from Nitesh, I realized that starting with 11.1.2, LCM has changed how artifacts are imported and exported. Any artifact from 11.1.1.3 needs to be upgraded to 11.1.1.4 first before migrating to 11.1.2. That is what was happening with this extract too. The SourceInfo.xml in the extract provides details about the LCM version used to create the extract.




A few of the changes that can be easily identified:
  • The console no longer has the option to 'Define Migration' while importing or exporting artifacts
  • The LCM export in 11.1.2 stores the import and export definition XMLs.



The app extract I used was from 11.1.1.3, and I don't have a configured 11.1.1.3 environment in which to upgrade it to 11.1.1.4 and re-export. The alternative is to change the extract to make it ready for 11.1.2.3.

Let's export the newly created app shell and compare the two extracts to find out how they differ. Let's name the LCM extract for the app shell RETL_POC_APP.


  • The 11.1.2.3 extract has an import.xml, alongside the info and resource directories, which did not exist in the old extract. This is because the old version allowed us to create a migration definition file

  • The listing.xml in 11.1.1.3 has a property "id" on the folder tag which does not exist in the listing.xml from 11.1.2.3. But this field does not seem to hold any unique value that could cause a problem during import. Let's ignore it for now



There does not seem to be any major difference between the extracts. All I did was copy the Import.xml into the 11.1.1.3 extract without changing any other XML, zip it up and place it back in the import_export directory. This time it worked !!!
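
For anyone who prefers to script that copy, here is a minimal Python sketch using the standard zipfile module. It is only an illustration of the manual step: the function name is made up, and the assumption that Import.xml sits at the root of both zips is mine, so check the member path in your own extract (and take a backup of the old zip) before using anything like it.

```python
import zipfile


def add_member_from(donor_zip, target_zip, member_name):
    """Copy one member from donor_zip into target_zip, leaving other members untouched."""
    with zipfile.ZipFile(donor_zip, "r") as donor:
        payload = donor.read(member_name)
    # Mode "a" appends to the existing archive instead of rewriting it.
    with zipfile.ZipFile(target_zip, "a", zipfile.ZIP_DEFLATED) as target:
        if member_name in target.namelist():
            raise ValueError("%s already exists in %s" % (member_name, target_zip))
        target.writestr(member_name, payload)
```

A call would look like add_member_from("RETL_POC_APP.zip", "old_extract.zip", "Import.xml"), where both zip file names are placeholders for the new app shell extract and the 11.1.1.3 extract.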


Step 1: Select all artifacts and import
I selected all artifacts and started the import process. It failed with errors while importing User Preferences.xml, Exchange Rates, Period, Planning Units and the Security XMLs for groups and users.



Step 2: Deselect Standard Dimensions - Period & Year
Since the standard dimensions already exist in the target application, deselect the standard dimensions - Period & Year under Plan Type/PnL/Standard Dimensions. When I ran the import again, it failed again


Step 3: Deselect User Preferences.xml and Security XMLs
The errors say that the import process was unable to find groups and users. Since these users and groups do not exist in Shared Services, the errors are valid. User Preferences.xml also holds information about users, who should exist in Shared Services. This time the import failed with an error for Planning Units.


Deselecting Planning Units.xml and running the import again leads to success. Voila !!!

All artifacts except security, Planning Units and User Preferences get imported successfully. If we create the users and groups in Shared Services, Planning security would get imported as well. I need to investigate further why the Planning Units import failed even though the XML did not have any Planning Unit details.


Just to summarize,
  • Once the Import.xml (migration definition file) was provided, LCM 11.1.2.3 was able to identify the objects to import.
  • Users & Groups must exist in Shared Services to get migrated
  • Standard Dimensions - Period & Year may not get imported, as the number of years and the period hierarchy depend on how we have defined them in the target

As said earlier, these steps are an alternative way to import artifacts when we do not have a configured 11.1.1 environment to follow the upgrade path.

Hope this helps !!!

Tuesday, February 11, 2014

ODI Series – Developing Efficient Integration

It has been a long time since I posted anything, so I thought of writing about something I have been working with lately: ODI.
Recently, I was training a few newbies in ODI when I was asked a question: “How can we be sure that the interface we have developed is an efficient one?”
I guess there is no standard answer to this question. It all depends on the technology we are working with and the transformation we intend to achieve. We need to remember that ODI uses the underlying technology to perform its transformations, and as developers we need to make the best use of it. In this post, I will talk about analyzing our data sources and technology to improve the integration process.
Let’s consider the following scenario and data sources:

Flat File called Products.txt which stores information about product codes


Staging DB which stores a Translation table. This table has the Alias names for all the product codes which should be populated in target


Target DB which stores the Product information pulled from both the flat file and the translation table, in a table with the following structure


We proceed with the assumption that we know how to reverse engineer flat files and RDBMS tables and bring them into ODI as models/datastores. Once reversed, the Datastores can be dragged on to interface as source and target and mapped as required. The interface should look something like this:


One easy way to optimize your integration is to minimize the number of steps involved in moving and transforming data between source and target. This may not matter much with small data files or record sets, but it can severely affect integration time with huge volumes of data. In the following scenarios we will see how this can be handled well.

Scenario 1: Staging layer is neither source nor target
Let’s consider a scenario where you choose a staging area which is neither the source nor the target. In this interface we have a flat file and an RDBMS table (schema called DBStaging) as sources and an RDBMS table (schema called DBTarget) as target, so we choose the In-Memory Engine: SUNOPSIS_MEMORY_ENGINE as our staging area.



The flow tab diagram shows the flow of data from source to staging and then to target. The execution takes 15 steps to complete the data movement.


Scenario 2: Staging layer is one of the source schema
Since one of our sources is an RDBMS (the Translation table schema), we can use the same schema as our staging layer. This removes step 1 of the above scenario, moving data from the RDBMS table to SrcSet0, as the table already exists in the same schema.



The execution now takes 11 steps to complete data movement. We actually skipped the steps to move data from Translation Table into SrcSet0 in Scenario 1

Scenario 3: Explore Knowledge Modules
It's always good to explore all the available Knowledge Modules and understand their features. In the above two scenarios, we used IKM SQL Control Append to populate data into the target, while for this scenario we will use IKM SQL to SQL Control Append.
IKM SQL Control Append can only read data from a single source, which is why we used LKM SQL to SQL at the staging layer. This performs the join between the SrcSets and loads the data into a temp table. IKM SQL Control Append then pulls data from this temp table. IKM SQL to SQL Control Append, on the other hand, can read data from multiple sources. This can be observed in the flow diagram below, where LKM SQL to SQL is no longer required.



The execution now takes just 7 steps to complete data movement. We skipped the steps to load data to Temp table at the staging layer by using IKM SQL to SQL Control Append, reducing steps for overall integration.

As said earlier, all these optimization steps depend on the underlying technology used in the integration and on the transformation activity. Since the target here was an RDBMS schema, we were able to make use of IKM SQL to SQL Control Append; had the target been either Essbase or Planning, we would be bound to the specific knowledge modules and would need to look for alternative ways to optimize our steps.

Hope this helps !!!