Monday, April 30, 2012

Informatica - 1

HOW TO USE A COBOL FILE FOR TRANSFORMATION
 
Informatica can read data files whose layout is described by a COBOL copybook. These files
mostly come from mainframe-based source systems. Many of the world's leading business
systems, e.g. those of airlines, banks and insurance companies, still run on IBM mainframes,
so these systems are a major source of information for data warehouses,
and thus for our Informatica mappings.
To use a COBOL copybook structure as a source, you have to put that copybook inside an
empty skeleton COBOL program:
IDENTIFICATION DIVISION.
PROGRAM-ID. RAGHAV.

ENVIRONMENT DIVISION.
SELECT FILE-ONE ASSIGN TO "MYFILE".

DATA DIVISION.
FILE SECTION.
FD FILE-ONE.

COPY "RAGHAV_COPYBOOK.CPY".

WORKING-STORAGE SECTION.

PROCEDURE DIVISION.

STOP RUN.

The copybook file can be a plain record structure.
Read more about defining copybooks around here.
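
To make "plain record structure" concrete, here is a minimal Python sketch. It is not
something Informatica itself runs; it only shows how a copybook maps onto a fixed-width
record. The three-field copybook in the comment, the field names and the sample record
are all hypothetical, and the sketch assumes unsigned DISPLAY fields and data in EBCDIC
code page cp037.

```python
# Hypothetical copybook (CUSTOMER.CPY), invented for this illustration:
#
#        01  CUSTOMER-RECORD.
#            05  CUST-ID       PIC 9(5).
#            05  CUST-NAME     PIC X(10).
#            05  CUST-BALANCE  PIC 9(5)V99.

from decimal import Decimal

# (field name, width in bytes, implied decimal places; None means text)
LAYOUT = [
    ("CUST-ID", 5, 0),
    ("CUST-NAME", 10, None),
    ("CUST-BALANCE", 7, 2),
]


def parse_record(raw: bytes, encoding: str = "cp037") -> dict:
    """Slice one fixed-width record into fields according to LAYOUT."""
    expected = sum(width for _, width, _ in LAYOUT)
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    # Assumption: the file is EBCDIC; cp037 is one common code page.
    text = raw.decode(encoding)
    record, offset = {}, 0
    for name, width, scale in LAYOUT:
        chunk = text[offset:offset + width]
        offset += width
        if scale is None:
            record[name] = chunk.rstrip()      # PIC X: drop trailing pad
        elif scale == 0:
            record[name] = int(chunk)          # PIC 9: whole number
        else:
            # PIC 9(n)V99: the decimal point is implied, not stored
            record[name] = Decimal(chunk).scaleb(-scale)
    return record


# Build a sample EBCDIC record and decode it again.
sample = "00042JOHN DOE  0012345".encode("cp037")
print(parse_record(sample))
```

Signed, COMP and COMP-3 (packed decimal) fields need extra decoding that this sketch
leaves out.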

The Need for Scheduling and Commonly Used Schedulers
 
Every data warehousing environment needs some kind of scheduler so that jobs can run at
periodic intervals without human intervention. Another important benefit is repeatability:
a job set up this way runs the same way every time. Without a scheduler, things would
become very ad hoc and thus prone to errors and mess-ups.
Oracle provides a built-in scheduling facility, accessible through its DBMS_SCHEDULER package.
Unix provides basic scheduling through cron. Similarly, Informatica provides
basic scheduling facilities in the Workflow Manager client.
The features these tools provide are fairly limited, often little more than launching
a job at a given time and some basic dependency management.
However, real-world data warehousing solutions need a lot more than what these basic
features offer. Hence the need for full-fledged scheduling tools, e.g. Tivoli Workload
Scheduler, Redwood Cronacle, Control-M, Cisco Tidal etc.
Most of these tools provide sophisticated launch control and dependency management features,
and therefore allow the data warehouse to be instrumented at finer levels.
Some of them, e.g. Tidal for Informatica and Redwood for Oracle, also support the
corresponding tool's API and therefore integrate even better with it.
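
As a rough idea of what dependency management means in practice, here is a minimal Python
sketch. It does not show how any of the schedulers named above work internally; the job
names and the run_order and run_all helpers are hypothetical. It orders jobs so that each
one runs only after its predecessors, and skips anything downstream of a failure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs; each job maps to the set of jobs it must wait for.
DEPENDENCIES = {
    "extract_orders": set(),
    "extract_customers": set(),
    "load_staging": {"extract_orders", "extract_customers"},
    "load_warehouse": {"load_staging"},
    "refresh_reports": {"load_warehouse"},
}


def run_order(dependencies):
    """Return the jobs in an order where each job follows its predecessors."""
    return list(TopologicalSorter(dependencies).static_order())


def run_all(dependencies, runner):
    """Run jobs in dependency order; skip a job if any predecessor did not succeed.

    runner is a callable that takes a job name and returns True on success.
    """
    status = {}
    for job in run_order(dependencies):
        if any(status[dep] != "ok" for dep in dependencies.get(job, ())):
            status[job] = "skipped"
            continue
        status[job] = "ok" if runner(job) else "failed"
    return status


# Simulate a nightly run in which the staging load fails.
result = run_all(DEPENDENCIES, lambda job: job != "load_staging")
for job, state in result.items():
    print(f"{job}: {state}")
```

A full-fledged scheduler layers calendars, retries, alerts and cross-server dependencies
on top of this basic idea.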
