How would you use MapReduce to split a very large graph into smaller pieces and parallelize the computation of edges when the data changes rapidly?
Write a Hive UDF that returns a sentiment score. For example, if good = 1, bad = -1, and average = 0, and a restaurant review states "Good food, bad service," your score might be 1 - 1 = 0.
Suppose that your data is stored in collections; for instance, binary data, message data, and metadata are all keyed on the same value. Would you use HBase for this?
What is the role of ALTER KEYSPACE in Cassandra?
What is meant by metadata in HDFS? List the files related to metadata.
What are the problems with small files in HDFS?
How many InputSplits are created by the Hadoop framework?
What is Apache Kafka?
What is the driver program in Spark?
What are the various storage systems from which Spark can read data?
Why does Hadoop perform replication even though it results in data redundancy?
What is Databricks, and how does it relate to Spark?
What is the purpose of RecordReader in Hadoop?
When to use Hive?
How is big data analysis helpful in increasing business revenue?
How can we scale Apache Mahout in the cloud?
What kind of data warehouse application is suitable for Hive?
Explain the sequence of execution of all the components of MapReduce, such as map, reduce, RecordReader, split, combiner, partitioner, sort, and shuffle.