Thursday, March 29, 2012

Informatica Questions from a friend - Part 1

HOW TO USE A COBOL FILE FOR TRANSFORMATION
 
Informatica allows reading data from COBOL copybook formatted data files. These files mostly
come from mainframe-based source systems. Many of the world's leading businesses, e.g. airlines,
banks and insurance companies, still use IBM mainframes as their core computing systems, so these
systems act as a major source of information for data warehouses, and thus for our Informatica mappings.
To use a COBOL copybook structure as a source, you have to put the copybook inside an empty
skeleton COBOL program:
IDENTIFICATION DIVISION.
PROGRAM-ID. RAGHAV.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT FILE-ONE ASSIGN TO "MYFILE".

DATA DIVISION.
FILE SECTION.
FD FILE-ONE.

COPY "RAGHAV_COPYBOOK.CPY".

WORKING-STORAGE SECTION.

PROCEDURE DIVISION.

STOP RUN.

The copybook file itself can just be a plain record structure.
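For illustration, a minimal copybook might look like the following; the record and field names (CUSTOMER-REC and so on) are made-up placeholders, not taken from any particular system:

01 CUSTOMER-REC.
   05 CUST-ID        PIC 9(9).
   05 CUST-NAME      PIC X(30).
   05 CUST-BALANCE   PIC S9(7)V99 COMP-3.
   05 CUST-OPEN-DATE PIC 9(8).

Saved as RAGHAV_COPYBOOK.CPY alongside the skeleton program, importing the program file through the Designer's Source Analyzer should pull in the 01-level record and its fields as the source definition.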
 
 
 

Wednesday, March 28, 2012

Counting columns in a tab delimited file


It sounds simple; however, when you sit down to write it, especially as a single-line expression, it can take a while.

In my experiments I found it rather easy to count columns with other delimiters, compared to the TAB character.
Here is the command for counting columns:

cat <FILENAME> | awk -F'\t' '{print NF}'

cat adds an extra process and can be slow with larger files, so here is an alternative without it:

awk -F'\t' '{print NF}' <FILENAME>
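
If you also want to verify that every row has the same number of columns (a quick sanity check the commands above do not do), something like this tallies how many rows have each column count:

awk -F'\t' '{print NF}' <FILENAME> | sort -n | uniq -c

A clean file prints a single line: the number of rows followed by the common column count.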


 

Tuesday, March 13, 2012

how to find sql id of a long running sql in oracle


SQL for finding the SQL ID and other details of a long-running query. Often useful for sending kill/monitoring instructions to DBA friends.

select distinct t.sql_id, s.inst_id, s.sid, s.serial#, s.osuser, s.program, s.status, t.sql_text
from gv$session s, gv$sqlarea t
where s.username = '<USERNAME>'
and s.sql_id = t.sql_id
and t.sql_text like '<%provide a segment of sql to identify it%>';
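
Once you have the SID, SERIAL# and INST_ID from the query above, the kill instruction the DBA would typically issue looks roughly like the sketch below; the values are placeholders, and on a single-instance database the @<INST_ID> part is left out:

alter system kill session '<SID>,<SERIAL#>,@<INST_ID>' immediate;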

Thursday, March 8, 2012

Partitioning in Informatica

Sourced from the Internet.

All transformations have some basic counters that indicate the number of input rows, output rows, and error rows.
Source Qualifier, Normalizer, and target transformations have additional counters that indicate the efficiency of data moving into and out of buffers. Use these counters to locate performance bottlenecks.
Some transformations have counters specific to their functionality. For example, each Lookup transformation has a counter that indicates the number of rows stored in the lookup cache.
When you view the performance details file, the first column displays the transformation name as it appears in the mapping, the second column contains the counter name, and the third column holds the resulting number or efficiency percentage. If you use a Joiner transformation, the first column shows two instances of the Joiner transformation:

<Joiner transformation> [M]. Displays performance details about the master pipeline of the Joiner transformation.
<Joiner transformation> [D]. Displays performance details about the detail pipeline of the Joiner transformation.

When you create multiple partitions, the Integration Service generates one set of counters for each partition. The following performance counters illustrate two partitions for an Expression transformation:
Transformation Name    Counter Name               Counter Value
EXPTRANS [1]           Expression_input rows      8
EXPTRANS [1]           Expression_output rows     8
EXPTRANS [2]           Expression_input rows      16
EXPTRANS [2]           Expression_output rows     16
Note: When you increase the number of partitions, the number of aggregate or rank input rows may be different from the number of output rows from the previous transformation.
The following table describes the counters that may appear in the Session Performance Details area or in the performance details file:
Aggregator and Rank Transformations
Aggregator/Rank_inputrows - Number of rows passed into the transformation.
Aggregator/Rank_outputrows - Number of rows sent out of the transformation.
Aggregator/Rank_errorrows - Number of rows in which the Integration Service encountered an error.
Aggregator/Rank_readfromcache - Number of times the Integration Service read from the index or data cache.
Aggregator/Rank_writetocache - Number of times the Integration Service wrote to the index or data cache.
Aggregator/Rank_readfromdisk - Number of times the Integration Service read from the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_writetodisk - Number of times the Integration Service wrote to the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_newgroupkey - Number of new groups the Integration Service created.
Aggregator/Rank_oldgroupkey - Number of times the Integration Service used existing groups.

Lookup Transformation
Lookup_inputrows - Number of rows passed into the transformation.
Lookup_outputrows - Number of rows sent out of the transformation.
Lookup_errorrows - Number of rows in which the Integration Service encountered an error.
Lookup_rowsinlookupcache - Number of rows stored in the lookup cache.

Joiner Transformation (Master and Detail)
Joiner_inputMasterRows - Number of rows the master source passed into the transformation.
Joiner_inputDetailRows - Number of rows the detail source passed into the transformation.
Joiner_outputrows - Number of rows sent out of the transformation.
Joiner_errorrows - Number of rows in which the Integration Service encountered an error.
Joiner_readfromcache - Number of times the Integration Service read from the index or data cache.
Joiner_writetocache - Number of times the Integration Service wrote to the index or data cache.
Joiner_readfromdisk* - Number of times the Integration Service read from the index or data files on the local disk, instead of using cached data.
Joiner_writetodisk* - Number of times the Integration Service wrote to the index or data files on the local disk, instead of using cached data.
Joiner_readBlockFromDisk** - Number of times the Integration Service read from the index or data files on the local disk, instead of using cached data.
Joiner_writeBlockToDisk** - Number of times the Integration Service wrote to the index or data files on the local disk, instead of using cached data.
Joiner_seekToBlockInDisk** - Number of times the Integration Service accessed the index or data files on the local disk.
Joiner_insertInDetailCache* - Number of times the Integration Service wrote to the detail cache. The Integration Service generates this counter if you join data from a single source.
Joiner_duplicaterows - Number of duplicate rows the Integration Service found in the master relation.
Joiner_duplicaterowsused - Number of times the Integration Service used the duplicate rows in the master relation.

All Other Transformations
Transformation_inputrows - Number of rows passed into the transformation.
Transformation_outputrows - Number of rows sent out of the transformation.
Transformation_errorrows - Number of rows in which the Integration Service encountered an error.
*The Integration Service generates this counter when you use sorted input for the Joiner transformation.
**The Integration Service generates this counter when you do not use sorted input for the Joiner transformation.
If you have multiple source qualifiers and targets, evaluate them as a whole. For source qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20 percent.