Why do we need buckets?
How many daemon processes run on a Hadoop system?
Explain the keys() operation in Apache Spark.
What are the independent extensions that are contributed to the Ambari codebase?
What are the port numbers of the JobTracker?
What problem does Apache Pig solve?
What is a shuffle in Spark?
Is it necessary to write a MapReduce job in Java?
What are Apache Spark, Flume, Lucene, Hama, HCatalog, Mahout, Drill, Crunch and Thrift?
What are the side data distribution techniques?
How is RDD in Spark different from Distributed Storage Management?
What are producer-consumer queues?
How do you specify more than one path for storage in Hadoop?
What does /*streamtable(table_name)*/ do?
What are the main benefits of using Cassandra?