What is the difference between the Sequential File and File Set stages?
Answer / aparna kanduri
A file set can be stored as multiple Unix flat files.
It consists of a descriptor file and individual raw data
files. The number of raw data files depends on the configuration file.
Some file systems impose a limit on file size, such as 2 GB.
Distributing the data over nodes prevents overrun.
In such cases a file set is more useful than a sequential file.
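The descriptor-plus-data-files idea can be sketched in Python. This is only an illustration under stated assumptions: the JSON descriptor, the `.partN.dat` naming, the round-robin distribution, and the function names `write_toy_file_set` and `read_toy_file_set` are all hypothetical and do not reflect the real DataStage file set format.

```python
import json
import os
import tempfile


def write_toy_file_set(records, num_nodes, directory, name="demo"):
    """Spread records round-robin over one data file per node and write
    a descriptor listing those files (hypothetical format, not DataStage's)."""
    data_paths = [
        os.path.join(directory, f"{name}.part{node}.dat")
        for node in range(num_nodes)
    ]
    handles = [open(path, "w", encoding="utf-8") for path in data_paths]
    try:
        for index, record in enumerate(records):
            # Record i goes to the data file of node i mod num_nodes.
            handles[index % num_nodes].write(record + "\n")
    finally:
        for handle in handles:
            handle.close()

    # The descriptor holds no data, only the locations of the data files.
    descriptor_path = os.path.join(directory, f"{name}.descriptor.json")
    with open(descriptor_path, "w", encoding="utf-8") as descriptor:
        json.dump({"data_files": data_paths}, descriptor)
    return descriptor_path


def read_toy_file_set(descriptor_path):
    """Read every data file named in the descriptor, in descriptor order."""
    with open(descriptor_path, encoding="utf-8") as descriptor:
        data_paths = json.load(descriptor)["data_files"]
    records = []
    for path in data_paths:
        with open(path, encoding="utf-8") as data_file:
            records.extend(line.rstrip("\n") for line in data_file)
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        descriptor = write_toy_file_set(["a", "b", "c", "d", "e"], 2, workdir)
        print(read_toy_file_set(descriptor))
```

Because each data file holds only a share of the records, no single file has to grow past a per-file size limit, which is the point the answer makes.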
Answer / san
Seq. stage:
--------------
By default the Sequential File stage runs in sequential mode, but you can
make it run in parallel mode by setting the option
"number of readers per node" to a value greater than 1.
The Sequential File stage is used to read or write data in flat-file formats
such as .txt and .dat. However, a single sequential file is subject to the
file-size limit of the file system (2 GB on some systems).
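As a rough illustration of what several readers on one file means, the sketch below divides a file of a given size into contiguous byte ranges, one per reader. It is an assumption-laden toy: `split_byte_ranges` is a hypothetical helper, and how DataStage actually assigns portions of a file to readers is not shown here.

```python
def split_byte_ranges(file_size, num_readers):
    """Return one (start, end) byte range per reader, end exclusive.

    The ranges are contiguous and cover the whole file. A real reader
    would also have to align each range to record boundaries, which
    this sketch ignores.
    """
    base, extra = divmod(file_size, num_readers)
    ranges = []
    start = 0
    for reader in range(num_readers):
        # The first `extra` readers each take one additional byte.
        length = base + (1 if reader < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    print(split_byte_ranges(10, 3))
```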
Fileset stage:
---------------
By default the File Set stage runs in parallel mode, and it can store more
than 2 GB of data. A file set holds two types of information:
i) a file descriptor, which points to the metadata and the data location
ii) the data itself, held in multiple files if you are using
a config file with more than one node
Answer / venugopal [patni]
The Sequential File stage is used to read data sequentially.
It can be configured to execute in either parallel or
sequential mode. We cannot perform lookups using a sequential
file.
The File Set stage is used to import the list of exported files.
It executes only in parallel mode.
Its main advantage relates to the 2 GB limit on the size of a
file: distributing the files among the nodes
prevents overruns.