What is data cleansing?
Answer / Achin Kumar
Data cleansing, also known as data cleaning or data scrubbing, is the process of detecting and then correcting or removing errors, inaccuracies, and inconsistencies in a dataset. The goal is to improve data quality, for example by reducing noise and limiting the effect of outliers, so that later analysis is more reliable.
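As a rough illustration of the detect-then-correct-or-remove idea, here is a minimal sketch in plain Python. The record layout (`name`, `age`), the function name `clean_records`, and the `max_age` threshold are assumptions made up for this example, not part of any particular tool. Real pipelines usually rely on a dedicated library and on rules specific to the data.

```python
def clean_records(records, max_age=120):
    """Return a cleaned copy of `records`, a list of dicts with 'name' and 'age'.

    Hypothetical example: the field names and the max_age rule are assumptions.
    """
    cleaned = []
    seen = set()
    for rec in records:
        # Correct inconsistent formatting: trim whitespace and unify letter case.
        name = (rec.get("name") or "").strip().title()
        if not name:
            # Remove records whose required field is missing.
            continue

        # Correct type errors: parse the age, treating unparseable values as missing.
        try:
            age = int(rec.get("age"))
        except (TypeError, ValueError):
            age = None

        # Limit the effect of outliers: implausible ages are marked as missing.
        if age is not None and not (0 <= age <= max_age):
            age = None

        # Remove duplicates that only differed in formatting.
        key = (name, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned


raw = [
    {"name": "  alice ", "age": "34"},
    {"name": "ALICE", "age": 34},      # duplicate once normalized
    {"name": "bob", "age": "n/a"},     # unparseable age
    {"name": "", "age": 20},           # missing name
    {"name": "carol", "age": 999},     # implausible age
]
print(clean_records(raw))
```

In this sketch, invalid ages are kept as `None` rather than guessed. Whether to drop, flag, or impute such values is a design choice that depends on how the data will be used.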