How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered for deciding on the number of nodes for real-time processing?
In a given Spark program, how will you identify whether a given operation is a transformation or an action?
Does Apache Flume also support third-party plugins?
How is Spark different from Hadoop?
Explain the SMB (sort-merge-bucket) join in Hive.
What is the importance of the eval tool?
Which modes can Hadoop be run in? List a few features of each mode.
How can you use the Producer API?
How is the distance between two nodes defined in Hadoop?
What is the metadata storage service in BookKeeper?
Explain the usage of the Context object.
What is speculative execution?
What is data replication in Cassandra?
What does the AdminClient API do in Kafka?
mapper or reducer?
Define Actions.
What is the MapReduce algorithm?