What is a DataFrame in Spark?
Answer / Parit Agarwal
A DataFrame is a distributed collection of data organized into named columns, conceptually similar to a table in a relational database, that can be queried and transformed with SQL-like operations. It is built on top of RDDs (Resilient Distributed Datasets) and provides a higher-level interface for data manipulation.
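To make the answer above concrete, here is a minimal PySpark sketch. It assumes the pyspark package is installed and uses a local single-threaded session. The sample rows, the column names, and the helper names `names_older_than` and `main` are made up for illustration and are not part of Spark. It runs the same filter twice, once through the DataFrame API and once through SQL, and then looks at the RDD underneath.

```python
# Made-up sample data: each tuple is one row, COLUMNS names the columns.
PEOPLE = [("Alice", 34), ("Bob", 45), ("Cara", 29)]
COLUMNS = ["name", "age"]


def names_older_than(spark, min_age):
    """Return matching names twice: via the DataFrame API and via SQL.

    `spark` is expected to be a pyspark.sql.SparkSession.
    """
    # Build a DataFrame with named columns from plain Python tuples.
    df = spark.createDataFrame(PEOPLE, schema=COLUMNS)

    # SQL-like operations through the DataFrame API.
    api_rows = df.filter(df.age > min_age).select("name").collect()

    # The same query as SQL, after registering the DataFrame as a view.
    df.createOrReplaceTempView("people")
    sql_rows = spark.sql(
        f"SELECT name FROM people WHERE age > {int(min_age)}"
    ).collect()

    return sorted(r.name for r in api_rows), sorted(r.name for r in sql_rows)


def main():
    # Imported here so the helper above can be read without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("dataframe-sketch")  # hypothetical app name
        .getOrCreate()
    )
    try:
        api_names, sql_names = names_older_than(spark, 30)
        print(api_names, sql_names)

        # The DataFrame is backed by an RDD of Row objects.
        df = spark.createDataFrame(PEOPLE, schema=COLUMNS)
        print(df.rdd.getNumPartitions())
    finally:
        spark.stop()
```

Calling `main()` starts a local session, runs both queries, and stops the session. Both queries should return the same names, since the DataFrame API and SQL are two front ends to the same engine.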