What are the methods to create an RDD in Spark?
Answer / Mukesh Chaurasia
There are multiple ways to create RDDs (Resilient Distributed Datasets) in Spark. Some common methods include:
* parallelize(iterable): Creates an RDD from a local iterable collection.
* textFile(path): Reads text files and converts them into an RDD of lines.
* wholeTextFiles(path): Reads text files as key-value pairs, where keys are file paths and values are the contents of the files.
Is Spark SQL a database?
What is Spark code?
How is Spark SQL different from HQL and SQL?
State the difference between Spark SQL and HQL.
Where does the Spark driver run on YARN?
Define the level of parallelism and its need in Spark Streaming.
How is an RDD in Spark different from Distributed Storage Management?
What is speculative execution in Spark?
Explain Spark SQL caching and uncaching.
What are the advantages of Datasets?
Explain Spark Streaming.
What do you mean by speculative execution in Apache Spark?