Module I: Introduction to DataStage
- Brief history of DataStage
- Where does DataStage fit within the data warehousing context?
- What are IBM Information Server 8.0 and WebSphere DataStage?
- Where WebSphere DataStage packs fit within the IBM Information Server architecture
- Various versions available
- Introduction to DataStage server components
- Repository
- DataStage server
- DataStage package installer
- Introduction to DataStage client components
- DataStage Administrator
- DataStage Designer
- DataStage Director
- DataStage Manager (removed in the latest version)
Module II: IBM WebSphere DataStage and QualityStage Designer overview
- How to connect to a project
- The Designer: a quick tour
- IBM Information Server repository
- Developing a job
- Introduction to job properties
- Introduction to job parameters
- Introduction to table definitions
- Importing and exporting from the repository
- Reporting Assistant and documentation tools
- Configuration file editor
Module III: WebSphere DataStage server jobs
Introduction to server jobs
Handling databases in server jobs
- Handling special characters (# and $)
- Loading tables
- Data type conversion: writing to Oracle
- Data type conversion: reading from Oracle
- Looking up an Oracle table
- Updating an Oracle table
- ODBC stage
- UniVerse stage
- Handling files in server jobs
- Sequential file stage
- How to use sequential file stage
- Defining sequential file input data
- Defining sequential file output data
- How the sequential stage behaves
- Folder stages
- Handling processing stages in server jobs
- Transformer stage
- How to use transformer stage
- Transformer editor components
- The DataStage expression editor
- Transformer stage properties
- Overview of transformer functions
- Using the transformer as a lookup stage
- Aggregator stage
- How to use aggregator stage
- Defining the input column sort order
- Aggregating data
- Merge stage
- Sort stage
Parallel processing in DataStage
Infrastructure as a foundation for data warehousing
- Various hardware and operating systems available
- What are the various platform options?
- Client server architecture for data warehouse
- Various server hardware available
- SMP (symmetric multiprocessing)
- Clusters
- MPP ( massively parallel processing)
- ccNUMA or NUMA (cache-coherent non-uniform memory architecture)
Types of parallel processing in DataStage
- Pipeline parallelism
- Partition parallelism
- Combining pipeline and partition parallelism
- Repartitioning data
- Parallel processing environments
- The configuration file
Types of partitioning in DataStage
- Round robin
- Random
- Same
- Entire
- Hash by field
- Modulus
- Range
- DB2
- Auto
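As a rough illustration of three of the partitioning methods listed above, the following Python sketch splits a list of rows into partitions. It is not DataStage code: the function names, the `rows` sample data, and the use of CRC32 as the hash are all assumptions made for this example.

```python
from zlib import crc32


def round_robin(rows, n):
    """Deal rows to partitions in turn: row 0 -> p0, row 1 -> p1, and so on."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_by_field(rows, n, key):
    """Send rows with the same key value to the same partition.

    CRC32 is an assumption for this sketch; any stable hash would do.
    """
    parts = [[] for _ in range(n)]
    for row in rows:
        h = crc32(str(row[key]).encode("utf-8"))
        parts[h % n].append(row)
    return parts


def modulus(rows, n, key):
    """Partition on an integer key: partition number = key value mod n."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


# Hypothetical sample data, invented for this example.
regions = ["east", "west", "east", "north", "west", "east"]
rows = [{"id": i, "region": r} for i, r in enumerate(regions)]

rr_parts = round_robin(rows, 3)
hash_parts = hash_by_field(rows, 3, "region")
mod_parts = modulus(rows, 2, "id")
```

Round robin gives evenly sized partitions regardless of content, while hash and modulus keep related rows together, which matters for stages that work on key groups.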
Types of collecting in DataStage
- Round robin
- Ordered
- Sorted merge
- Auto
- The mechanics of partitioning and collecting
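Collecting is the reverse of partitioning: several partitions are read back into one stream. The Python sketch below imitates the ordered, round robin, and sorted merge methods on plain lists. It is only an approximation of the idea, and the function names and sample `parts` are assumptions for illustration.

```python
import heapq
from itertools import chain, zip_longest

_MISSING = object()  # marks an exhausted partition during round robin


def collect_ordered(parts):
    """Read all of partition 0, then all of partition 1, and so on."""
    return list(chain.from_iterable(parts))


def collect_round_robin(parts):
    """Read one row from each partition in turn, skipping exhausted ones."""
    out = []
    for group in zip_longest(*parts, fillvalue=_MISSING):
        out.extend(row for row in group if row is not _MISSING)
    return out


def collect_sorted_merge(parts, key=None):
    """Merge partitions that are each already sorted into one sorted stream."""
    return list(heapq.merge(*parts, key=key))


# Hypothetical partitions; each one is already sorted.
parts = [[1, 5, 9], [2, 3], [4, 6, 7, 8]]

ordered = collect_ordered(parts)
rr = collect_round_robin(parts)
merged = collect_sorted_merge(parts)
```

Note that sorted merge only yields a sorted result when every input partition is itself sorted on the same key.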
WebSphere DataStage parallel jobs
- Introduction to DataStage parallel jobs
- Difference between a passive stage and an active stage
- Handling metadata in DataStage
- Runtime column propagation (RCP)
- Table definitions
- Schema files and partial schemas
- Data types
- Date and time formats
- Complex data types
Handling the Oracle Enterprise stage in parallel jobs
- Handling special characters (# and $)
- Loading tables
- Type conversions when writing to Oracle
- Updating an Oracle database
- Deleting rows from an Oracle database
- Loading an Oracle database
- Reading an Oracle database
- Performing a direct lookup on an Oracle database table
- Using the SQL builder
Handling transformer stage in parallel jobs
- How it differs from the server transformer stage
- Creating and deleting columns
- Handling null values
- Defining constraints and handling otherwise links
- Specifying link order
- Defining local stage variables
- What is a BASIC transformer stage?
- Transformer functions
Combining data in DataStage parallel jobs
- Horizontal and vertical combining
- Join stage
- Inner
- Left outer
- Right outer
- Full outer
- Lookup stage
- Merge stage
- Comparison between the Join, Merge, and Lookup stages
- Partitioning in reference links
- Aggregator stage
- Funnel stage
- Funnel mode
- Sort funnel mode
- Sequence
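To make the four join types concrete, here is a minimal Python sketch of a key-based join over two small row sets. It assumes each key appears at most once per input, and it simply omits columns that have no match rather than filling them with nulls. The `join` function and the sample rows are hypothetical and do not describe how the Join stage is implemented.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dict rows on `key` (keys assumed unique per input)."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    if how == "inner":
        keys = left_by_key.keys() & right_by_key.keys()
    elif how == "left":
        keys = left_by_key.keys()
    elif how == "right":
        keys = right_by_key.keys()
    elif how == "full":
        keys = left_by_key.keys() | right_by_key.keys()
    else:
        raise ValueError(f"unknown join type: {how}")

    # Merge the matching rows; a side with no match contributes nothing.
    return [
        {**left_by_key.get(k, {}), **right_by_key.get(k, {})}
        for k in sorted(keys)
    ]


# Hypothetical sample data.
customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
orders = [{"id": 2, "amount": 10}, {"id": 3, "amount": 20}]

inner_rows = join(customers, orders, "id", "inner")
left_rows = join(customers, orders, "id", "left")
right_rows = join(customers, orders, "id", "right")
full_rows = join(customers, orders, "id", "full")
```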
Some more useful stages in DataStage parallel jobs
- Sort stage
- Sequential sort
- Parallel sort
- Total sort
- Partitioning requirement
- Remove duplicates stage
- Modify stage
- Dropping and keeping columns
- Changing data type
- Null handling
- Pivot stage
- Limitations in pivot stage
- Copy stage
- Filter stage
- External filter stage
- Switch stage
- Compress stage
- Expand stage
- Encode stage
- Decode stage
- FTP Enterprise stage
- Generic stage
- Surrogate key generator stage
- SAS stage
Capturing changes in DataStage parallel jobs
- Change capture stage
- Change apply stage
- Difference stage
- Compare stage
- Slowly changing dimension stage
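The idea behind change capture can be sketched in a few lines of Python: compare a "before" and an "after" data set on a key and label each difference. The labels, the `capture_changes` function, and the sample rows are assumptions for illustration only, and unchanged rows are dropped here for brevity.

```python
def capture_changes(before, after, key):
    """Return (change_type, row) pairs describing how `after` differs from `before`."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    changes = []
    for k in sorted(before_by_key.keys() | after_by_key.keys()):
        old = before_by_key.get(k)
        new = after_by_key.get(k)
        if old is None:
            changes.append(("insert", new))   # key only in the after set
        elif new is None:
            changes.append(("delete", old))   # key only in the before set
        elif old != new:
            changes.append(("edit", new))     # same key, different values
        # identical rows are skipped in this sketch
    return changes


# Hypothetical sample data.
before = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Delhi"},
    {"id": 3, "city": "Goa"},
]
after = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Mumbai"},
    {"id": 4, "city": "Agra"},
]

changes = capture_changes(before, after, "id")
```

Applying such a change list back to the "before" set is the complementary step that a change apply operation performs.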
Handling development/debug stages in DataStage parallel jobs
- Head stage
- Head stage default behavior
- Skipping data
- Tail stage
- Sample stage
- Peek stage
- Row generator stage
- How to specify data to be generated
- Generating data in parallel
- Column generator stage
Write range map stage
- How to perform a range lookup in DataStage
Handling restructure stages in DataStage parallel jobs
- Column import stage
- Column export stage
- Make subrecord stage
- Split subrecord stage
- Combine records stage
- Promote subrecord stage
- Make vector stage
- Split vector stage
Handling XML files in DataStage parallel jobs
- Introduction to XML files
- Using the XML metadata importer
- Using the XML input stage
- Validating documents and schemas
- Processing namespaces
- Supported XPath expressions
Using the XML output stage
- Processing namespaces
- Supported XPath expressions
- Aggregating input rows on output
- Writing output to your file system
- Processing NULLS and empty values
- How repetition paths work
Using the XML transformer stage
Optimizing performance in server and parallel jobs
WebSphere DataStage jobs and processes
Interpreting performance statistics in server jobs
Improving performance in server jobs
- CPU-limited jobs on single-processor systems
- CPU-limited jobs on multiprocessor systems
- I/O-limited jobs
- Hashed file stages
- Hashed file design
Inter-process stages in server jobs
Link Collector stages in server jobs
Link Partitioner stages in server jobs
Job design tips in parallel jobs
- Processing large volumes of data
- Modular development
- Designing for good performance
Database sparse lookup vs. join
Improving performance in parallel jobs
- Understanding a flow
- Performance monitoring
- Resolving bottlenecks
- Ensuring data is evenly partitioned
Programming in DataStage
Introduction to programming components
Routines
- Transform functions
- Before/after subroutines
- Custom UniVerse functions
- ActiveX (OLE) functions
- Subroutines
- Creating a routine
- Defining custom transforms
Transforms
Macros
Precedence rules
BASIC programming
Built in transforms and routines
Handling web services in DataStage
Introduction to web services technologies
Encoding requests and responses
Using the SOAP framework
Publishing web service operations
Accessing web services
What is the web service pack?
Using the web service metadata importer
Using the web services transformer stage
Using the web services client stage
Creating web service routines
How to expose a DataStage job as a web service
Using the IBM Information Server console
Job scheduling using job sequences in DataStage
Creating a job sequence
- Overview of activity stages
- Triggers
- Expressions
- Job activity properties
- Routine activity properties
- Email notification activity properties
- Wait for file activity properties
- Exception activity properties
- Nested condition activity properties
- Start loop activity properties
- End loop activity properties
- User variables activity properties
- Compiling and restarting the job sequence
Some advanced concepts in DataStage
- Achieving reusability in DataStage using containers
- Types of containers
- Local containers
- Server shared containers
- Parallel shared containers
- Creating a shared container
- Using shared containers in DataStage jobs
- Converting shared containers to local containers
- Deconstruction of shared containers
- Specifying our own parallel stage
- Defining custom stage
- Defining build stage
- Build stage macros
- Defining wrapper stage
- Using the Administrator client in DataStage
- Adding environment variables
- Setting job parameters default values
- Changing license details
- Handling projects
- Buffer settings in DataStage
- Multiple instances of jobs in DataStage
- DataStage job control utility
- Jobs: compiling, running, and checking logs using DataStage tools
- Handling multilingual data in DataStage
- How to enable NLS in DataStage
- Orchestrate architecture and commands
- Orchestrate parallel processing framework in DataStage
- Orchestrate utility in DataStage
- Surrogate key generation using DataStage
- Version control in DataStage