What is the difference between coalesce and repartition in spark?

Question Posted / asha rani

Answer / Prem Shankar Jha

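Coalesce and repartition both change the number of partitions of a DataFrame or RDD. repartition(n) performs a full shuffle and redistributes the data across n new partitions, so it can either increase or decrease the partition count and tends to produce partitions of roughly equal size. coalesce(n) merges existing partitions into fewer ones and, by default, avoids a full shuffle, so it is normally used only to reduce the partition count; the merged partitions can end up uneven in size. The main difference is therefore cost: repartition always shuffles the data, while coalesce moves far less of it. On RDDs, repartition(n) is equivalent to coalesce(n, shuffle = true).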
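
To make the difference concrete, here is a small pure-Python toy model. It is only a sketch and does not use Spark: the functions coalesce and repartition below are hypothetical stand-ins written for illustration, partitions are modelled as plain lists, and the round-robin placement used for the shuffle is an assumption of the sketch rather than a statement about Spark's internals. In real Spark code the corresponding calls are df.coalesce(n) and df.repartition(n).

```python
from typing import List

Partition = List[int]


def coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy coalesce: merge whole existing partitions into n groups.

    Records are never redistributed one by one, so uneven input
    partitions give uneven output partitions.
    """
    if n >= len(partitions):
        # Without a shuffle the partition count cannot grow.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Neighbouring input partitions land in the same output partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy repartition: full shuffle, every record is dealt out round-robin."""
    out: List[Partition] = [[] for _ in range(n)]
    i = 0
    for part in partitions:
        for record in part:
            out[i % n].append(record)
            i += 1
    return out


# Four skewed partitions holding ten records in total.
parts = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [10]]

coalesced = coalesce(parts, 2)
repartitioned = repartition(parts, 2)

print([len(p) for p in coalesced])      # sizes after merging partitions
print([len(p) for p in repartitioned])  # sizes after a full shuffle
```

In this toy run the merged partitions keep the original skew (sizes 7 and 3), while the shuffled ones come out balanced (5 and 5). Asking the toy coalesce for more partitions than it was given leaves the count unchanged, whereas the toy repartition can also grow it.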