Thursday, March 29, 2012

Informatica Questions from a friend - Part 1

HOW TO USE A COBOL FILE FOR TRANSFORMATION
 
Informatica allows reading data from COBOL copybook formatted data files. These files mostly
come from mainframe-based source systems. Many of the world's leading businesses, e.g. airlines,
banks and insurance companies, still use IBM mainframes as their core computing systems, so these
systems act as a major source of information for data warehouses, and thus for our Informatica mappings.
To use a COBOL copybook structure as a source, you have to put the copybook inside an empty
skeleton COBOL program:
IDENTIFICATION DIVISION.
PROGRAM-ID. RAGHAV.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT FILE-ONE ASSIGN TO "MYFILE".

DATA DIVISION.
FILE SECTION.
FD FILE-ONE.

COPY "RAGHAV_COPYBOOK.CPY".

WORKING-STORAGE SECTION.

PROCEDURE DIVISION.

STOP RUN.

The copybook file itself can just be a plain record structure.
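For illustration, a minimal copybook might look like the following; the record and field names (CUSTOMER-REC and so on) are made-up placeholders, not taken from any particular system:

01 CUSTOMER-REC.
   05 CUST-ID        PIC 9(9).
   05 CUST-NAME      PIC X(30).
   05 CUST-BALANCE   PIC S9(7)V99 COMP-3.
   05 CUST-OPEN-DATE PIC 9(8).

Saved as RAGHAV_COPYBOOK.CPY alongside the skeleton program, importing the program file through the Designer's Source Analyzer should pull in the 01-level record and its fields as the source definition.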
 
 
 

Wednesday, March 28, 2012

Counting columns in a tab delimited file


It sounds simple; however, when you sit down to write it, especially as a single-line expression, it can take a while.

In my experiments I found it rather easy to count columns with other delimiters, compared to the TAB character.
Here is the command for counting columns:

cat <FILENAME> | awk -F'\t' '{print NF}'

cat adds an extra process and can be slow with larger files, so here is an alternative without it:

awk -F'\t' '{print NF}' <FILENAME>
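
If you also want to verify that every row has the same number of columns (a quick sanity check the commands above do not do), something like this tallies how many rows have each column count:

awk -F'\t' '{print NF}' <FILENAME> | sort -n | uniq -c

A clean file prints a single line: the number of rows followed by the common column count.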


 

Tuesday, March 13, 2012

how to find sql id of a long running sql in oracle


SQL for finding the SQL ID and other details of a long-running query. Often useful for sending kill/monitoring instructions to DBA friends.

select distinct t.sql_id, s.inst_id, s.sid, s.serial#, s.osuser, s.program, s.status, t.sql_text
from gv$session s, gv$sqlarea t
where s.username = '<USERNAME>'
and s.sql_id = t.sql_id
and t.sql_text like '<%provide a segment of sql to identify it%>';
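
Once you have the SID, SERIAL# and INST_ID from the query above, the kill instruction the DBA would typically issue looks roughly like the sketch below; the values are placeholders, and on a single-instance database the @<INST_ID> part is left out:

alter system kill session '<SID>,<SERIAL#>,@<INST_ID>' immediate;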

Thursday, March 8, 2012

Partitioning in Informatica

Sourced from the Internet.

All transformations have some basic counters that indicate the number of input rows, output rows, and error rows.
Source Qualifier, Normalizer, and target transformations have additional counters that indicate the efficiency of data moving into and out of buffers. Use these counters to locate performance bottlenecks.
Some transformations have counters specific to their functionality. For example, each Lookup transformation has a counter that indicates the number of rows stored in the lookup cache.
When you view the performance details file, the first column displays the transformation name as it appears in the mapping, the second column contains the counter name, and the third column holds the resulting number or efficiency percentage. If you use a Joiner transformation, the first column shows two instances of the Joiner transformation:

<Joiner transformation> [M]. Displays performance details about the master pipeline of the Joiner transformation.
<Joiner transformation> [D]. Displays performance details about the detail pipeline of the Joiner transformation.

When you create multiple partitions, the Integration Service generates one set of counters for each partition. The following performance counters illustrate two partitions for an Expression transformation:
Transformation Name    Counter Name               Counter Value
EXPTRANS [1]           Expression_input rows      8
EXPTRANS [1]           Expression_output rows     8
EXPTRANS [2]           Expression_input rows      16
EXPTRANS [2]           Expression_output rows     16
Note: When you increase the number of partitions, the number of aggregate or rank input rows may be different from the number of output rows from the previous transformation.
The following table describes the counters that may appear in the Session Performance Details area or in the performance details file:
Aggregator and Rank Transformations
Aggregator/Rank_inputrows - Number of rows passed into the transformation.
Aggregator/Rank_outputrows - Number of rows sent out of the transformation.
Aggregator/Rank_errorrows - Number of rows in which the Integration Service encountered an error.
Aggregator/Rank_readfromcache - Number of times the Integration Service read from the index or data cache.
Aggregator/Rank_writetocache - Number of times the Integration Service wrote to the index or data cache.
Aggregator/Rank_readfromdisk - Number of times the Integration Service read from the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_writetodisk - Number of times the Integration Service wrote to the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_newgroupkey - Number of new groups the Integration Service created.
Aggregator/Rank_oldgroupkey - Number of times the Integration Service used existing groups.

Lookup Transformation
Lookup_inputrows - Number of rows passed into the transformation.
Lookup_outputrows - Number of rows sent out of the transformation.
Lookup_errorrows - Number of rows in which the Integration Service encountered an error.
Lookup_rowsinlookupcache - Number of rows stored in the lookup cache.

Joiner Transformation (Master and Detail)
Joiner_inputMasterRows - Number of rows the master source passed into the transformation.
Joiner_inputDetailRows - Number of rows the detail source passed into the transformation.
Joiner_outputrows - Number of rows sent out of the transformation.
Joiner_errorrows - Number of rows in which the Integration Service encountered an error.
Joiner_readfromcache - Number of times the Integration Service read from the index or data cache.
Joiner_writetocache - Number of times the Integration Service wrote to the index or data cache.
Joiner_readfromdisk* - Number of times the Integration Service read from the index or data files on the local disk, instead of using cached data.
Joiner_writetodisk* - Number of times the Integration Service wrote to the index or data files on the local disk, instead of using cached data.
Joiner_readBlockFromDisk** - Number of times the Integration Service read from the index or data files on the local disk, instead of using cached data.
Joiner_writeBlockToDisk** - Number of times the Integration Service wrote to the index or data files on the local disk, instead of using cached data.
Joiner_seekToBlockInDisk** - Number of times the Integration Service accessed the index or data files on the local disk.
Joiner_insertInDetailCache* - Number of times the Integration Service wrote to the detail cache. The Integration Service generates this counter if you join data from a single source.
Joiner_duplicaterows - Number of duplicate rows the Integration Service found in the master relation.
Joiner_duplicaterowsused - Number of times the Integration Service used the duplicate rows in the master relation.

All Other Transformations
Transformation_inputrows - Number of rows passed into the transformation.
Transformation_outputrows - Number of rows sent out of the transformation.
Transformation_errorrows - Number of rows in which the Integration Service encountered an error.
*The Integration Service generates this counter when you use sorted input for the Joiner transformation.
**The Integration Service generates this counter when you do not use sorted input for the Joiner transformation.
If you have multiple source qualifiers and targets, evaluate them as a whole. For source qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20 percent.