Explain bucketing in Hive?
Answer / Arun Yadav
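Bucketing in Hive is a technique for splitting a table (or each partition of a table) into a fixed number of files called buckets. Each row is assigned to a bucket by hashing the bucketing column and taking the result modulo the number of buckets, so rows with the same column value always land in the same bucket. Hive has no default bucket count; you set it in the table definition with `INTO n BUCKETS`. Buckets are not guaranteed to be equal in size, since that depends on how the column's values are distributed. Bucketing can improve performance for sampling and for joins on the bucketing column, because Hive may only need to read the relevant buckets instead of the whole table. Here's an example: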
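```sql
-- Bucketing is declared with CLUSTERED BY and requires an explicit bucket count.
CREATE TABLE table_name (columns...)
CLUSTERED BY (bucket_column) INTO num_buckets BUCKETS;
```
Replace `table_name`, `columns`, `bucket_column`, and `num_buckets` with appropriate values. The bucketing column must be one of the table's columns.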
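To make the idea of bucket assignment concrete, here is a minimal Python sketch of how rows could be mapped to buckets by hashing a key and taking the remainder. It is an illustration only, not Hive's code: it assumes an integer bucketing column whose hash is the value itself, and the function names `bucket_for` and `assign_buckets` are hypothetical. Hive's real hash function depends on the column type and the Hive version.

```python
from collections import defaultdict


def bucket_for(value: int, num_buckets: int) -> int:
    """Return the bucket index for an integer key (illustrative, not Hive's code)."""
    # Assumption: the hash of an integer key is the key itself.
    # Mask to a non-negative 31-bit value, then take the remainder.
    return (value & 0x7FFFFFFF) % num_buckets


def assign_buckets(keys, num_buckets):
    """Group keys by the bucket index they map to."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[bucket_for(key, num_buckets)].append(key)
    return dict(buckets)


if __name__ == "__main__":
    # Keys 1..8 spread over 4 buckets; equal keys always share a bucket.
    print(assign_buckets([1, 2, 3, 4, 5, 6, 7, 8], 4))
```

Because the mapping depends only on the key and the bucket count, a query that filters or joins on the bucketing column can work out which bucket a given value belongs to without scanning the others.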