Why do we use HDFS for applications with large data sets, but not when there are a lot of small files?
What is the difference between Apache Mahout and Apache Spark's MLlib?
What are snapshots, and how do you create one in Cassandra?
In the MapReduce data flow, when is the Combiner called?
What is an RDD in Spark? Explain with an example.
How do you restrict the number of lines printed in Pig?
What are the management tools in Cassandra?
Can you explain what a worker node is?
How will you implement joins in HBase?
Can you explain the Spark driver?
What is the MapReduce plan in the Pig architecture?
What is the optimum value for the number of reducers?
What are the components of the Cassandra data model?
What is the purpose of sqoop-merge?
What do you understand by Cassandra CQL collections?
In hadoop_pid_dir, what does pid stand for?