Can you use Apache Spark to analyze and access data stored in Cassandra databases?
How to specify more than one directory as input to the MapReduce Job?
How is data or a file written into Hadoop HDFS?
What is the use of checkpoints in Spark?
Is it possible to run real-time analysis directly on the big data collected by Flume? If yes, explain how.
Explain PluckTuple.
What are the different modes available in Pig?
What is Sqoop Job?
How to write a 'foreach' statement for the tuple datatype in Pig scripts?
What is the use of the "void close()" method?
Since the data is replicated thrice in HDFS, does it mean that any calculation done on one node will also be replicated on the other two?
Define Writable data types in MapReduce?
Why do we need Impala in Hadoop?
What is the use of Combiner?
Differentiate between the physical plan and the logical plan in a Pig script.