What is the difference between DSM and RDD?
Answer / Parvind Pal
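DSM stands for distributed shared memory, a general abstraction in which applications read and write arbitrary locations in a global address space. It is not Spark's Dataset API. An RDD (Resilient Distributed Dataset) is more restricted: it is an immutable, partitioned collection that can only be created through coarse-grained transformations such as map, filter, and join, which apply the same operation to many records. The main differences follow from that restriction. Writes: DSM allows fine-grained writes to individual locations, whereas RDD writes are coarse-grained. Fault recovery: DSM typically relies on checkpointing and rollback, whereas a lost RDD partition is recomputed from its lineage, so only the lost partitions are rebuilt. Consistency: it is trivial for RDDs because they are immutable, while in DSM it is left to the application or runtime. Stragglers: slow RDD tasks can be re-run as backup copies, which is difficult in DSM because two copies of a task would write to the same memory locations.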
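To make the RDD side concrete, here is a minimal single-process sketch of two RDD properties: data is changed only through whole-collection transformations, and a lost partition can be rebuilt from the recorded transformation (its lineage). The `ToyRDD` class and its methods (`map`, `lose_partition`, `partition`, `collect`) are hypothetical names invented for this illustration and are not Spark's API. A plain dictionary stands in for a shared mutable store, where any location can be overwritten in place.

```python
from typing import Any, Callable, List, Optional, Tuple


class ToyRDD:
    """Hypothetical, single-process stand-in for an RDD (not Spark's API)."""

    def __init__(
        self,
        partitions: List[List[Any]],
        parent: Optional["ToyRDD"] = None,
        fn: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        # Partitions are stored as tuples so individual records cannot be mutated.
        self._partitions: List[Optional[Tuple[Any, ...]]] = [tuple(p) for p in partitions]
        # Lineage: the parent collection and the function that produced this one.
        self._parent = parent
        self._fn = fn

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # Coarse-grained transformation: the same function is applied to every
        # record, and the result is a new collection that remembers its lineage.
        new_parts = [[fn(x) for x in self.partition(i)] for i in range(len(self._partitions))]
        return ToyRDD(new_parts, parent=self, fn=fn)

    def lose_partition(self, index: int) -> None:
        # Simulate a node failure by dropping one partition's data.
        self._partitions[index] = None

    def partition(self, index: int) -> Tuple[Any, ...]:
        # Rebuild a lost partition from lineage instead of restoring a checkpoint.
        if self._partitions[index] is None:
            if self._parent is None or self._fn is None:
                raise RuntimeError("source partition lost and no lineage to rebuild it")
            self._partitions[index] = tuple(self._fn(x) for x in self._parent.partition(index))
        return self._partitions[index]

    def collect(self) -> List[Any]:
        return [x for i in range(len(self._partitions)) for x in self.partition(i)]


# RDD style: transform the whole collection, then recover a lost partition.
base = ToyRDD([[1, 2], [3, 4]])
doubled = base.map(lambda x: x * 2)
doubled.lose_partition(1)          # only partition 1 will be recomputed
recovered = doubled.collect()

# Shared mutable store style: fine-grained in-place write to a single location.
shared_store = {0: 1, 1: 2, 2: 3, 3: 4}
shared_store[2] = 99               # the old value is gone unless it was checkpointed
```

In this sketch `recovered` contains the doubled values even though one partition was dropped, because the lost partition is recomputed from `base`. The dictionary has no such record of how its values were produced, which is why recovery for that style of store generally depends on checkpoints.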