Datastage

C
 * **Code:** ||

How to test and develop a C++ routine: DataStage parallel routines made really easy

Joshy George (Consulting Employee), posted 12/1/2007

DataStage is a powerful ETL tool with a lot of built-in stages and routines that cover most of the functionality you will need. For the things DataStage EE can't do, there are parallel routines, which can be written in C++.

This primer shows how to create a parallel routine in a few minutes, whether or not you are a C/C++ programmer. To write really good code, though, you may have to learn some C++ programming. "Starting C programming with Linux" is a good place to start.

Before we begin, a few points to note:

Parallel routines are C++ components built and compiled external to DataStage. Note - they must be compiled as C++ components, not C.

This C++ program must not contain a main function. Compile it with the compiler options specified under "APT_COMPILEOPT", which can be found among the Administrator parameter options, to create an object (*.o) file. The result is compiled code without main, i.e. not a self-contained executable file.

The compiler and compiler options can be found in DataStage --> Administrator --> Properties --> Environment --> Parallel --> Compiler.
Ex:
compiler = g++
compiler options = -O -fPIC -Wno-deprecated -c
Compile command syntax: {compiler} {compiler options} {filename with extension}
Ex: g++ -O -fPIC -Wno-deprecated -c {filename with extension}

Here's the typical sequence of steps for creating a DataStage parallel routine:

Create --> Compile --> Link --> Execute

1) Create

Create a C++ program with main. Test it and, if successful, remove the main.

2) Compile

Compile using the compiler options specified under "APT_COMPILEOPT" to create an object (*.o) file, and put this object file into a library directory of your choice. Note: The compiler and compiler options can be found in "DataStage --> Administrator --> Properties --> Environment --> Parallel --> Compiler".

3) Link

Link the above object (*.o) to a DataStage parallel routine by making the relevant entries in the General tab:
Routine Name: {Parallel Routine Name}
Type: External Function
Object Type: Object / Library
External subroutine name: {Function Name specified inside your C++ program}
Library Path: {Specified in 2) Compile section + object (*.o) file name}
Also specify the Return Type and, if you have any input parameters to pass, specify them in the Arguments tab.

4) Execute

Your parallel routine is now available inside your jobs. Call it in a job, then compile and run the job.

Step by step Example: Creating a shared object

1) Create a C++ program with main

Create a text file with a cpp extension (Ex: OBJTEST.cpp).

Ex:

#include <stdio.h>

int main()
{
    const char* OutStr = "Hello World - Object Testing";
    printf("%s\n", OutStr); // Print through a format string, never the text itself.
    return 0;
}

Test this program

Copy your compiler specification from "DataStage --> Administrator --> Properties --> Environment --> Parallel --> Compiler" and compile the C++ program you created.
Syntax: g++ program.cpp -o program
Ex: g++ OBJTEST.cpp -o OBJTEST

Run the program using the command below.
Syntax: ./program
Ex: ./OBJTEST
Output --> Hello World - Object Testing
If you get the above output, your program executed successfully.

Rewrite the program without main

Ex:

#include <stdio.h>

char* ObjTestOne()
{
    // A static array stays valid after the function returns.
    static char OutStr[] = "Hello World - Object Testing";
    return OutStr;
}
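Instead of deleting main by hand after testing, one option is to keep both versions in a single source file and let the preprocessor decide. The following is a minimal sketch of that idea, not part of the original procedure; the macro name ROUTINE_SELF_TEST is hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Routine body: this is the function DataStage would be pointed at.
char* ObjTestOne()
{
    // A static array stays valid after the function returns.
    static char OutStr[] = "Hello World - Object Testing";
    return OutStr;
}

#ifdef ROUTINE_SELF_TEST // hypothetical macro name, used only in this sketch
// This main is compiled only when ROUTINE_SELF_TEST is defined,
// so the object file built for DataStage contains no main.
int main()
{
    assert(std::strcmp(ObjTestOne(), "Hello World - Object Testing") == 0);
    std::printf("%s\n", ObjTestOne());
    return 0;
}
#endif
```

Under this assumption, g++ -DROUTINE_SELF_TEST OBJTEST.cpp -o OBJTEST would build the standalone test program, while compiling with the -c options and without the define would produce the object file without main.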

2) Compile the program

Get the compiler and compiler options from: DataStage --> Administrator --> Properties --> Environment --> Parallel --> Compiler
Ex:
compiler = g++
compiler options = -O -fPIC -Wno-deprecated -c
Compile command syntax: {compiler} {compiler options} {filename with extension}
Ex: g++ -O -fPIC -Wno-deprecated -c {filename with extension}

Execute the command below:
g++ -O -fPIC -Wno-deprecated -c OBJTEST.cpp

This will make an object file with the .o extension (Ex: OBJTEST.o).

Move this object file to a library path of your preference.
Ex: /datastage/Ascential/DataStage/PXEngine/lib
I usually put it in the "lib" directory. You can locate your "lib" directory from the library path (LD_LIBRARY_PATH).

3) Link

Link the above object (*.o) to a DataStage parallel routine. In the repository palette, right-click and choose "New parallel routine", then make these entries in the General tab:
Routine Name: {Parallel Routine Name} Ex: OBJECTTEST
Type: External Function
Object Type: Object
External subroutine name: {Function Name specified inside your C++ program} Ex: ObjTestOne (Remember? This is the function that replaced main, i.e. char * ObjTestOne)
Library Path: {Specified in Compile section + object (*.o) file name} Ex: /datastage/Ascential/DataStage/PXEngine/lib/OBJTEST.o
Return Type: char*

Note: As we don't have any input parameters to pass, we are not making any entries in the Arguments tab. Now save and close the window.
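For comparison, a routine that does take input would need one entry in the Arguments tab per function parameter. The sketch below is only an illustration: the function name StrLength and its parameter are hypothetical, and the exact type names offered in the Arguments tab depend on your DataStage version.

```cpp
#include <cstring>

// Hypothetical routine with one input parameter.
// Assumed setup in DataStage: one Arguments entry of type char*,
// and a Return Type of int on the General tab.
int StrLength(char* InStr)
{
    if (InStr == NULL) // Guard against a missing input value.
    {
        return 0;
    }
    return static_cast<int>(std::strlen(InStr));
}
```

The External subroutine name would then be StrLength, following the same pattern as ObjTestOne.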

4) Execute

Create a test job and call this parallel routine inside it. Ex: Row Generator --> Transformer --> Sequential File. In the Transformer, call the routine in your output column derivation. Compile and run the job.

Example


#include "stdio.h"
#include "string.h"
#include "stdlib.h"
#include "ctype.h"

// Function with string input and string output.
char* ConvMCT(char *str)
{
    // Allocate one byte per input character plus the terminating '\0'.
    char *result = (char *)malloc(strlen(str) + 1);
    int x = 0, Flag = 1; // Setting Flag to 1 to make the first letter capital.
    char CheckStr[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    if (result == NULL) // Allocation failed, nothing to convert.
    {
        return NULL;
    }

    while (*str)
    {
        unsigned char c = (unsigned char)*str; // ctype functions expect an unsigned char value.

        if (Flag == 1) // Check if the last character was not an alphabet.
        {
            if (isalpha(c) && islower(c)) // Convert to uppercase if it's a lowercase alphabet.
            {
                result[x] = toupper(c);
            }
            else
            {
                result[x] = *str; // No change if it's already in uppercase or not an alphabet.
            }
        }
        else
        {
            if (isalpha(c) && isupper(c))
            {
                result[x] = tolower(c); // Convert to lowercase except the first character.
            }
            else
            {
                result[x] = *str;
            }
        }

        if (!strchr(CheckStr, *str)) // Check if the character is not a-z or A-Z.
        {
            Flag = 1;
        }
        else
        {
            Flag = 0;
        }
        ++x;
        ++str;
    }
    result[x] = '\0'; // Terminate the string.
    return result;    // Return the converted string.
}
||
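When a routine like ConvMCT is tested from a standalone program, the buffer it returns comes from malloc, so the test code owns that buffer and should free it. The sketch below shows one way to do that. UpperFirst is a hypothetical, simplified stand-in with the same contract as ConvMCT (it only uppercases the first character), and CallAndRelease is a hypothetical test helper; neither is part of the original example.

```cpp
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in with the same contract as ConvMCT:
// the result is a malloc'd buffer that the caller must free.
char* UpperFirst(const char* str)
{
    char* result = static_cast<char*>(std::malloc(std::strlen(str) + 1));
    if (result == NULL)
    {
        return NULL;
    }
    std::strcpy(result, str);
    // For an empty string this leaves the terminating '\0' unchanged.
    result[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(result[0])));
    return result;
}

// Hypothetical test helper: copy the result into a std::string,
// then release the malloc'd buffer so the test does not leak.
std::string CallAndRelease(const char* input)
{
    char* raw = UpperFirst(input);
    if (raw == NULL)
    {
        return std::string();
    }
    std::string copy(raw);
    std::free(raw);
    return copy;
}
```

A test main can then compare CallAndRelease("hello world") against the expected text without tracking the raw pointer itself.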