The Need for Scheduling and Commonly Used Schedulers
Every data warehousing environment needs some kind of scheduler so that jobs can run at
periodic intervals without human intervention. Another important benefit is repeatability:
a job that is set up once runs the same way every time. Without a scheduler, operations
become ad hoc and therefore prone to errors.
Oracle provides a built-in scheduling facility, accessible through its dbms_scheduler package.
Unix provides a basic scheduling facility through the cron daemon and its crontab entries.
Similarly, Informatica provides basic scheduling facilities in the Workflow Manager client.
The features provided by these scheduling tools are fairly limited, often amounting to
launching a job at a given time and providing basic dependency management.
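
To make "launching a job at a given time" concrete, here is a minimal Python sketch of the
calculation a basic time-based scheduler has to perform: given the current time and a daily
launch time, work out when the job should start next. The function name next_daily_run and
its parameters are hypothetical and are not taken from cron, dbms_scheduler, or Informatica.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now, run_at):
    """Return the next datetime at which a job scheduled daily at run_at should start.

    now    -- current date and time (datetime)
    run_at -- time of day at which the job is launched (time)
    """
    # Candidate launch moment: today at the configured time of day.
    candidate = datetime.combine(now.date(), run_at)
    # If that moment is already here or has passed, the next launch is tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    # Hypothetical nightly load configured for 02:30.
    print(next_daily_run(datetime.now(), time(2, 30)))
```

Everything beyond this calculation, such as retries, calendars, and conditions across
systems, is where the basic tools tend to run out of features.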
However, in real-world data warehousing solutions, the required functionality is a lot more
sophisticated than what these basic features offer. Hence the need for full-fledged
scheduling tools such as Tivoli Workload Scheduler, Redwood Cronacle, Control-M, and
Cisco Tidal.
Most of these tools provide sophisticated launch control and dependency management features,
and therefore allow the data warehouse to be instrumented at a finer level.
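
As an illustration of what dependency management involves, the following Python sketch
orders a set of jobs so that each one starts only after its prerequisites, and skips any
job whose prerequisite did not succeed. It is an assumption-laden sketch, not how any of
the tools named above is implemented: the job names, launch_order, and run_all are
hypothetical, and the ordering relies on graphlib from the Python standard library
(Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical warehouse jobs: each job maps to the set of jobs it must wait for.
DEPENDENCIES = {
    "extract_orders": set(),
    "extract_customers": set(),
    "load_staging": {"extract_orders", "extract_customers"},
    "build_fact_tables": {"load_staging"},
    "refresh_reports": {"build_fact_tables"},
}


def launch_order(dependencies):
    """Return job names in an order where every job follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


def run_all(dependencies, run_job):
    """Run jobs in dependency order; run_job(name) returns True on success.

    A job is skipped when any of its prerequisites failed or was skipped.
    Returns a dict mapping each job name to "ok", "failed", or "skipped".
    """
    status = {}
    for job in launch_order(dependencies):
        prerequisites = dependencies.get(job, ())
        if any(status[dep] != "ok" for dep in prerequisites):
            status[job] = "skipped"
            continue
        status[job] = "ok" if run_job(job) else "failed"
    return status


if __name__ == "__main__":
    # Simulate a failure in the staging load and observe what gets skipped.
    result = run_all(DEPENDENCIES, lambda name: name != "load_staging")
    for job, state in result.items():
        print(job, state)
```

In this simulated run the two extract jobs succeed, the staging load fails, and the two
downstream jobs are skipped rather than run against incomplete data.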
Some of the tools, e.g. Tidal for Informatica and Redwood for Oracle, also support the
corresponding product's API, and therefore integrate even better with that product.