What is Spark SQLContext?
When does an application have high latency (high response time)?
Where is Kafka used?
Why is Spark faster than Hive?
If you run a SELECT * query in Hive, why does it not run MapReduce?
What is the difference between Apache Mahout and Apache Spark's MLlib?
HDFS stores data on commodity hardware, which has a higher chance of failure. So how does HDFS ensure the fault tolerance of the system?
Explain the ALTER TABLE statement in Hive.
Can the NameNode and DataNode run on commodity hardware?
Do we need to install Scala for Spark?
What does the Spark engine do?
Can you use Spark to access and analyse data stored in Cassandra databases?
Describe how HBase uses ZooKeeper.
Ideally, what should the replication factor be in Hadoop?
Can you explain clustering in Mahout?