Explain DStream with reference to Apache Spark
Answer / Pallavi Saxena
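A DStream (Discretized Stream) is the key abstraction provided by Apache Spark's Streaming API for processing real-time data streams, such as live Twitter feeds or network logs. It represents a continuous stream of data, either received from a source or produced by transforming another DStream. Internally, a DStream is a sequence of RDDs, and each RDD holds the records that arrived during one time interval. Spark therefore processes streaming data in micro-batches whose length is a configurable duration called the batch interval.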
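The micro-batch idea can be illustrated without Spark at all. The following is a minimal sketch in plain Python, not the Spark API: the function names `discretize` and `word_counts`, the sample records, and the 1-second batch interval are all assumptions made up for illustration. It groups timestamped records into fixed-length intervals and then runs the same computation on each batch, which is roughly what happens when a transformation is applied to a DStream.

```python
from collections import defaultdict


def discretize(records, batch_interval):
    """Group (arrival_time_seconds, value) records into micro-batches.

    Batch k holds the values whose arrival time t satisfies
    k * batch_interval <= t < (k + 1) * batch_interval.
    Intervals with no data yield an empty batch.
    """
    batches = defaultdict(list)
    for arrival_time, value in records:
        batches[int(arrival_time // batch_interval)].append(value)
    if not batches:
        return []
    last = max(batches)
    # Emit batches in time order, keeping empty intervals in place.
    return [batches.get(k, []) for k in range(last + 1)]


def word_counts(batch):
    """Count words in one batch (stands in for a per-batch transformation)."""
    counts = defaultdict(int)
    for line in batch:
        for word in line.split():
            counts[word] += 1
    return dict(counts)


# Hypothetical input: (arrival time in seconds, text line)
records = [(0.2, "spark streaming"), (0.9, "spark"), (2.5, "dstream rdd")]

batches = discretize(records, batch_interval=1.0)
per_batch = [word_counts(b) for b in batches]

for k, counts in enumerate(per_batch):
    print(f"batch {k}: {counts}")
```

In real Spark Streaming code the batch interval is passed when the `StreamingContext` is created, and Spark itself performs the grouping into per-interval RDDs; the sketch above only mimics that behaviour on an in-memory list.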