Hadoop – Hive – Loading Data

Introduction: As a data-warehouse project, Hive does not support the traditional single-record database insert statements. Data has to be inserted through bulk-record modes. One of those avenues is the built-in load statement. Data Model - Logical: I looked online for sample data models that are representative of familiar data entities. And, the …

Hadoop/Cloudera [CDH]/Hive v2 – Installation

Pre-requisites: There are quite a few pre-requisites that should be met prior to installing and running Hive. They include: Have in place a base Hadoop install (HDFS, MapReduce v1 or v2). A working demo is available at https://danieladeniji.wordpress.com/2013/05/12/hadoopcloudera-v4-2-1-installation-on-centos/. Have one of the supported databases installed. This database is needed by the metastore. A MySQL implementation is …

Hadoop/Cloudera [CDH]/MetaStore – MySQL Database

Background: I am laying the groundwork for a Hadoop/Cloudera [CDH]/Hive installation and trying to do my homework. One of the pre-requisites is a Hive MetaStore. The Hive MetaStore stores metadata for Hive tables. It provides the base plumbing for the Hive tables. The Hive database is a bit better positioned by externalizing some of its 'metadata'; as a so-called …

Hadoop/Cloudera (v4.2.1) – Installation on CentOS (32 bit [x32]) – HBase

Introduction: For the Cloudera Distribution of Hadoop, the install is bundled as an RPM, and thus it is a straightforward install. Once installed, there are some post-install configuration steps. Let us see how things work out. Installation - HBASE (Base): Let us review the install binary. The package is eponymously named HBASE. Review Package …

Hadoop/Cloudera (v4.2.1) – Installation on CentOS (32 bit [x32])

Introduction: Here are quick preparation, processing, and validation steps for installing Cloudera Hadoop (v4.2.1) on 32-bit CentOS. Blueprint: I am using Cloudera's fine documentation (http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Quick-Start/cdh4qs_topic_3_2.html) as a basis. It is very good documentation, but I stumble a lot for lack of education and from glossing over important details. And so I chose to write …

Hadoop – Hive – What is the Version # of Hive Service and Clients that you are running?

Introduction: Hadoop is a speeding bullet. You look online, Google for things, try them out, and sometimes you hit, but often you miss. What do I mean by that? Well, this evening I was trying to play with Hive; specifically, using Sqoop to import a table from MS SQL Server into Hive. A bit of …

Technical: Hadoop – ZooKeeper – Client – Cloudera

Introduction (from http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_21.html): ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services, such as naming, configuration management, synchronization, and group services, in a simple interface so you don't have to write them from scratch. You can use it off-the-shelf to implement consensus, group …

Hadoop – Sqoop – Command – Export Data (from HDFS to Microsoft SQL Server)

Introduction: Having fun with Hadoop; specifically, exporting data with Sqoop. Pre-requisites: Hopefully, you have validated that Hadoop and Sqoop are installed and running properly. If not, please read Technical: Hadoop – Sqoop on Cloudera (CDH) – Is Sqoop Set up and Configured for MS SQL Server (https://danieladeniji.wordpress.com/2013/05/03/technical-hadoop-sqoop-on-cloudera-cdh/). Generate Sample Data: Data, data everywhere but none to share without …

Cloudera Hadoop Demo VM on VirtualBox – Installation

All thanks to Thomas Lockney (http://blog.cloudera.com/blog/2009/07/cloudera-training-vm-virtualbox/) for writing this down and making it so beautiful to follow. In some cases, authors quickly do things and move on. But Thomas made the presentation so clear and elegant, and it would have stayed on my mind; unless …