CRUNCH YOUR WAY IN HADOOP
UNDERSTANDING HADOOP BY MAHESH MAHARANA
Welcome back, friends... Sorry, it took me some time to post this blog and to delete a few of my previous posts. But don't worry, I am sure you will like this one, as it will give you a new view of working out MR (MapReduce) programming in Hadoop. In this blog we will learn about Apache Crunch and work out a use case for better understanding. The Apache Crunch Java library provides a framework for writing, testing, and running MapReduce pipelines. Its goal is to make pipelines that are composed of many user-defined functions simple to write, easy to test, and efficient to run. …
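The excerpt cuts off before the use case, so here is a minimal sketch (not the post's code) of what a Crunch word-count pipeline looks like: read text, split it into words with a user-defined DoFn, count, and write the result. The input and output paths are assumptions.

```java
import org.apache.crunch.DoFn;
import org.apache.crunch.Emitter;
import org.apache.crunch.PCollection;
import org.apache.crunch.PTable;
import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.types.writable.Writables;
import org.apache.hadoop.conf.Configuration;

public class CrunchWordCount {
    public static void main(String[] args) {
        Pipeline pipeline = new MRPipeline(CrunchWordCount.class, new Configuration());

        PCollection<String> lines = pipeline.readTextFile("/user/demo/input");

        // A user-defined function: one of the small, testable pieces Crunch composes into a pipeline.
        PCollection<String> words = lines.parallelDo(new DoFn<String, String>() {
            @Override
            public void process(String line, Emitter<String> emitter) {
                for (String word : line.split("\\s+")) {
                    if (!word.isEmpty()) {
                        emitter.emit(word);
                    }
                }
            }
        }, Writables.strings());

        PTable<String, Long> counts = words.count();    // built-in aggregation
        pipeline.writeTextFile(counts, "/user/demo/output");
        pipeline.done();                                // plans and runs the underlying MR job(s)
    }
}
```

Compared with raw MapReduce, the pipeline is expressed as a chain of small functions that Crunch plans into one or more MR jobs only when `done()` is called.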
HIVE INTERVIEW RELATED PREPARATION
Dear friends... I spent a few days preparing for and giving interviews for a job change in Hadoop, and a few Hive questions came up in almost every interview I faced. Some of these questions were also asked by friends of mine, so I decided to walk through them practically. Here, with my practicals, I am showing various Hive concepts. 1. Difference between external and managed tables: Ans: Almost every interviewer asks you to explain external and managed tables. It is a very common question to answer, but what I observed is that it all comes down to how you answer the …
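A minimal sketch of that distinction, run over HiveServer2 JDBC from Java (the connection URL, credentials, table schemas and HDFS location below are assumptions, not taken from the post): dropping a managed table removes both metadata and data, while dropping an external table leaves the files under its LOCATION untouched.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveTableTypesDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {

            // Managed table: Hive owns the data; DROP TABLE deletes both metadata and files.
            stmt.execute("CREATE TABLE IF NOT EXISTS emp_managed (id INT, name STRING) "
                       + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

            // External table: Hive only tracks metadata; DROP TABLE leaves the HDFS files intact.
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS emp_external (id INT, name STRING) "
                       + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
                       + "LOCATION '/user/hive/external/emp'");
        }
    }
}
```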
A USECASE ON TRAVEL APP
Dear friends, welcome back... Day by day I am learning different things, which I like to share with you all. As a great person said, "Learning is a journey, not a destination." I was a little busy over the last few days acquiring different know-how on Hadoop. One of the travel app providers asked me whether I could solve this use case: finding out where the demand for their services is, so that they can decide how to offer discounts to attract customers to their services. In this blog, I am using this use case to solve the problem and help the client take decisions for better …
HIVE ON RESCUE - A HEALTHCARE USE CASE ON STRUCTURED DATA
Dear friends, we know that Hadoop's Hive component is very good for structured data processing. Structured data first depends on creating a data model: a model of the types of business data that will be recorded and how they will be stored, processed, and accessed. This includes defining which fields of data will be stored and how that data will be stored: its data type (numeric, currency, alphabetic, name, date, address). Structured data has the advantage of being easily entered, stored, queried, and analyzed. At one time, because of the high cost and performance limitations of storage, memory …
WAYS TO BULK LOAD DATA IN HBASE
Dear friends, continuing with my posts, this one came from a question asked by one of my friends about HBase, so I am sharing my thoughts and a working procedure for bulk loading data into HBase. HBase is an open-source, NoSQL, distributed, column-oriented data store modelled on Google BigTable. It was developed as part of Apache's Hadoop project and runs on top of HDFS (Hadoop Distributed File System). HBase provides all the features of Google BigTable. We can call HBase a "data store" rather than a "database", as it lacks many of the features available in traditional …
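The excerpt cuts off before the procedure itself, so here is a minimal sketch of the most common bulk-load route, assuming an HBase 1.x-style client API; the table name "demo_table", column family "cf" and CSV layout are assumptions. A MapReduce job writes HFiles through HFileOutputFormat2, and LoadIncrementalHFiles then moves them into the table's regions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HBaseBulkLoad {

    // Turns each CSV line "rowkey,value" into a Put for column family "cf" (assumed schema).
    public static class CsvToPutMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        @Override
        protected void map(LongWritable key, Text line, Context ctx)
                throws java.io.IOException, InterruptedException {
            String[] parts = line.toString().split(",", 2);
            Put put = new Put(Bytes.toBytes(parts[0]));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("val"), Bytes.toBytes(parts[1]));
            ctx.write(new ImmutableBytesWritable(Bytes.toBytes(parts[0])), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hbase-bulk-load");
        job.setJarByClass(HBaseBulkLoad.class);
        job.setMapperClass(CsvToPutMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));       // input CSV in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));     // staging directory for HFiles

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("demo_table"));
             RegionLocator locator = conn.getRegionLocator(TableName.valueOf("demo_table"))) {

            // Sets total-order partitioning and the sort reducer so HFiles line up with table regions.
            HFileOutputFormat2.configureIncrementalLoad(job, table, locator);

            if (job.waitForCompletion(true)) {
                // Moves the generated HFiles into the table's region directories.
                new LoadIncrementalHFiles(conf).doBulkLoad(new Path(args[1]), conn.getAdmin(), table, locator);
            }
        }
    }
}
```

The other common options are single Puts through the client API (slow for large volumes) and the bundled ImportTsv tool, which wraps the same HFile-based path.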
MULTIPLE OUTPUT WITH MULTIPLE INPUT FILE NAME
Dear friends, I was asked how to process several files at a time and store the results under each input file's name. It is a real-world problem: say, for example, you have log files from different places, you have to apply the same logic to all of them, but you have to store the results under a different file name per input. How do you do this? In this blog, I will take you through it using the MultipleOutputs method in a MapReduce program, reusing the word-count logic (a sketch follows below). The problem statement is as follows: 1. N input files will be in HDFS. Each input file contains a list of sentences …
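A minimal sketch of that idea, assuming plain-text inputs and word-count logic (the class and path names are illustrative, not the post's code): the mapper tags every word with the name of the file it came from, and the reducer passes that tag to MultipleOutputs as the base output path, so each input file gets its own result file.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class PerFileWordCount {

    public static class TagMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context ctx) throws IOException, InterruptedException {
            // Name of the input file this split belongs to, e.g. "server1.log".
            String fileName = ((FileSplit) ctx.getInputSplit()).getPath().getName();
            for (String word : value.toString().split("\\s+")) {
                if (!word.isEmpty()) {
                    ctx.write(new Text(fileName + "\t" + word), ONE);
                }
            }
        }
    }

    public static class PerFileReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private MultipleOutputs<Text, IntWritable> mos;

        @Override
        protected void setup(Context ctx) {
            mos = new MultipleOutputs<>(ctx);
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            String[] fileAndWord = key.toString().split("\t", 2);
            // Third argument is the base output path: one result file per input file name.
            mos.write(new Text(fileAndWord[1]), new IntWritable(sum), fileAndWord[0]);
        }

        @Override
        protected void cleanup(Context ctx) throws IOException, InterruptedException {
            mos.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "per-file-wordcount");
        job.setJarByClass(PerFileWordCount.class);
        job.setMapperClass(TagMapper.class);
        job.setReducerClass(PerFileReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // LazyOutputFormat avoids creating empty part-r-* files alongside the per-file outputs.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```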
XML FILE PROCESSING IN HADOOP
Dear friends, welcome back after a long time. One of my friends asked me to explain XML processing in Hadoop. I went through many articles, web links, etc., in search of the answer, and now I am ready to showcase it in this blog. PROBLEM: Working with XML is painful. XML structure is variable by design, which means there is no universal mapping to native Pig data structures. This is the price we pay for such a flexible, robust markup, but as software developers we can't continue to ignore this problem. There's XML data everywhere, just waiting for us to crack it open and extract valuable …
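The excerpt stops before the solution, so here is one common map-side approach as a hedged sketch. It assumes each map() value already contains one complete XML record (in practice an XML-aware InputFormat, such as a Mahout-style XmlInputFormat configured with start and end tags, does that record splitting), and the <id>/<name> fields are illustrative, not the post's schema.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.w3c.dom.Document;

public class XmlRecordMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text xmlRecord, Context ctx) throws IOException, InterruptedException {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xmlRecord.toString().getBytes(StandardCharsets.UTF_8)));
            doc.getDocumentElement().normalize();

            // Pull out two example fields and emit them as a flat key/value pair.
            String id = doc.getElementsByTagName("id").item(0).getTextContent();
            String name = doc.getElementsByTagName("name").item(0).getTextContent();
            ctx.write(new Text(id), new Text(name));
        } catch (Exception e) {
            // Malformed records are counted and skipped rather than failing the whole job.
            ctx.getCounter("xml", "bad_records").increment(1);
        }
    }
}
```

Once the records are flattened to key/value text like this, the output can be loaded into Pig or Hive without any XML handling downstream.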
HADOOP POC ON EXCEL DATA WEATHER REPORT ANALYSIS
Hello friends, I am glad to present this blog, which covers the analysis of a weather report POC whose data is in Excel format. This POC was given to me by one of my friends, who asked me to complete it. Most of the time we get data in Excel format, and we have to adapt our code accordingly. So, in this POC I have modified my previous code to accept the Excel data, to make the concept easier for you all to understand. NOTE: Although this POC is meant to read Excel data, I have not used an Excel reader in my code, yet it still worked. (I have no idea how and why that happened; kindly share if …
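Since the post notes that no Excel reader was actually used, here is a hedged sketch of one common way to handle genuine .xls/.xlsx input: flatten the workbook to delimited text with Apache POI before running the MapReduce job. The file names, sheet layout and tab-separated output below are assumptions, not the post's approach.

```java
import java.io.FileInputStream;
import java.io.PrintWriter;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class ExcelToTsv {
    public static void main(String[] args) throws Exception {
        DataFormatter fmt = new DataFormatter();  // renders each cell as it appears in Excel
        try (Workbook wb = WorkbookFactory.create(new FileInputStream(args[0]));
             PrintWriter out = new PrintWriter(args[1], "UTF-8")) {
            Sheet sheet = wb.getSheetAt(0);       // assume the first sheet holds the readings
            for (Row row : sheet) {
                StringBuilder line = new StringBuilder();
                for (Cell cell : row) {
                    if (line.length() > 0) {
                        line.append('\t');
                    }
                    line.append(fmt.formatCellValue(cell));
                }
                out.println(line);
            }
        }
    }
}
```

The resulting TSV file can then be copied to HDFS and fed to the same weather-analysis MapReduce code unchanged.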
HADOOP (PROOF OF CONCEPTS) WEATHER REPORT ANALYSIS
Hello friends, welcome back... This blog covers the analysis of a weather report POC that was given to me by one of my friends, who asked me to complete it. While researching it I came across a very good website that I just can't wait to share with you all. In this POC I have modified and used both, for the convenience of making you all understand the concept (a sketch follows the problem statement). Problem statement: 1. The system receives temperatures of various cities, captured at regular intervals on each day, in an input file. 2. The weather information of all cities for a week is fed to the system in a single input file …
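A minimal sketch of the usual shape of this job (not the post's full solution): map each reading to a city-plus-date key and reduce to that day's maximum and minimum temperature. The input layout "date<TAB>city<TAB>temperature" is an assumption.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DailyTemperature {

    public static class ReadingMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx) throws IOException, InterruptedException {
            String[] f = value.toString().split("\t");
            if (f.length >= 3) {
                // Key "city,date" so each reduce call sees one city's readings for one day.
                ctx.write(new Text(f[1] + "," + f[0]), new DoubleWritable(Double.parseDouble(f[2])));
            }
        }
    }

    public static class MaxMinReducer extends Reducer<Text, DoubleWritable, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> temps, Context ctx)
                throws IOException, InterruptedException {
            double max = Double.NEGATIVE_INFINITY;
            double min = Double.POSITIVE_INFINITY;
            for (DoubleWritable t : temps) {
                max = Math.max(max, t.get());
                min = Math.min(min, t.get());
            }
            ctx.write(key, new Text("max=" + max + "\tmin=" + min));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "daily-temperature");
        job.setJarByClass(DailyTemperature.class);
        job.setMapperClass(ReadingMapper.class);
        job.setReducerClass(MaxMinReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(DoubleWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```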
HIVE 2.1.1 INSTALLATION IN HADOOP 2.7.3 IN UBUNTU 16
Hello friends, welcome to this blog, where I am going to explain and take you through the installation of Hive 2.1.1 on Hadoop 2.7.3 in Ubuntu 16. The recent release of Hive is quite different from the previous one, and why shouldn't it be? We will go through its working mechanism some other time; for now, let me take you through the installation. HIVE VERSION WE ARE USING: HIVE 2.1.1. STEP 1: Download Hive 2.1.1. You will find Hive 2.1.1 at the link below; download the apache-hive-2.1.1-bin.tar.gz file onto your desktop: http://www-us.apache.or …