A flat file contains 200 records. I want to load the first 50
records the first time the job runs, the next 50 records the
second time, and so on. How can this job be developed?




Answer / subhash

1st way:
1. Add a 'row number' column in the Sequential File stage, so that
each record has a number associated with it.
2. Add a job parameter that supplies the record number from
which the job should start loading. It can be passed either from
a Sequence Start Loop activity (list-type values:
50,100,150,200) or from a shell script.
3. In the Transformer, use a stage variable to count records and
pass only the 50 records starting from that record number.
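The counting logic of the 1st way can be sketched outside DataStage. This is a minimal Python illustration, not DataStage code: the function name `load_batch`, the parameter `start_row`, and the sample records are all hypothetical, and it assumes the job parameter holds the 1-based row number of the first record to load (1, 51, 101, 151). If the parameter is instead treated as the last row of the batch (50, 100, 150, 200), the comparisons shift accordingly.

```python
def load_batch(records, start_row, batch_size=50):
    """Return the records whose 1-based row number falls in
    [start_row, start_row + batch_size)."""
    batch = []
    # enumerate plays the role of the 'row number' column
    for row_number, record in enumerate(records, start=1):
        if row_number < start_row:
            continue  # before the requested window: skip
        if row_number >= start_row + batch_size:
            break     # past the window: stop reading
        batch.append(record)
    return batch

# Hypothetical input: 200 records named rec1 .. rec200
records = [f"rec{n}" for n in range(1, 201)]

first_run = load_batch(records, start_row=1)    # rows 1-50
second_run = load_batch(records, start_row=51)  # rows 51-100
```

Each run needs a different `start_row`, which is why the answer passes it in from a loop or a shell script rather than hard-coding it.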

2nd way:
Design the job like this:
1. Add a 'row number' column in the Sequential File stage, so that
each record has a number associated with it.
2. Use a Filter stage with these conditions:
a. row number column <= 50 (1st link, which loads the records
into the target file/database)
b. row number column > 50 (2nd link, which writes the records
to a file with the same name as the input file, in
overwrite mode)


So, the first time the job runs, the first 50 records are
loaded into the target and, at the same time, the input file is
overwritten with the remaining records, i.e. 51 to 200.
The second time the job runs, the first 50 records of the file
(i.e. the original 51-100) are loaded into the target and the
input file is overwritten with the remaining records, i.e. 101
to 200.
And so on: each run loads the next 50 records into the target.
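The 2nd way can be simulated with plain files to check that four runs drain the input. This is a rough Python sketch under stated assumptions, not a DataStage job: the function `run_once` and the file names `input.txt` and `target.txt` are hypothetical, and the sketch reads the whole input before overwriting it, so it does not address whether a job can safely overwrite a file it is still reading.

```python
import os
import tempfile

def run_once(input_path, target_path, batch_size=50):
    """One job run: append the first batch_size records to the target,
    then overwrite the input file with the remaining records."""
    with open(input_path) as src:
        lines = src.readlines()
    head, rest = lines[:batch_size], lines[batch_size:]  # row <= 50 / row > 50
    with open(target_path, "a") as target:
        target.writelines(head)
    with open(input_path, "w") as src:  # overwrite mode
        src.writelines(rest)
    return len(head)

with tempfile.TemporaryDirectory() as workdir:
    input_path = os.path.join(workdir, "input.txt")
    target_path = os.path.join(workdir, "target.txt")
    # Hypothetical input: 200 records named rec1 .. rec200
    with open(input_path, "w") as src:
        src.writelines(f"rec{n}\n" for n in range(1, 201))

    # Run the "job" four times; no parameter changes between runs
    loaded_counts = [run_once(input_path, target_path) for _ in range(4)]

    with open(target_path) as target:
        target_lines = target.read().splitlines()
    with open(input_path) as src:
        remaining = src.read().splitlines()
```

Unlike the 1st way, no parameter is needed: the state is carried by the shrinking input file, at the cost of destroying the original input.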

