· Experience in performing in-memory data processing and real-time streaming analytics using Apache Spark with Scala, Java, and Python (a Spark streaming sketch follows this list).
· Developed applications for distributed environments using Hadoop, MapReduce, and Python.
· Developed MapReduce jobs to automate the transfer of data from HBase.
· Developed and maintained web applications deployed on the Apache Tomcat web server.
· Experience in integrating Hadoop with Ganglia, with a good understanding of Hadoop metrics and their visualization in Ganglia.
· Good experience working with the Hortonworks and Cloudera Hadoop distributions.
· Very good understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
· Experience in data extraction and transformation using MapReduce jobs (a MapReduce sketch follows this list).
· Proficient in working with Hadoop and HDFS and in writing Pig and Sqoop scripts.
· Performed data analysis using Hive and Pig.
· Expert in creating Pig and Hive UDFs in Java to analyze data efficiently (a Hive UDF sketch follows this list).
· Experience in importing and exporting data between relational database systems and HDFS using Sqoop.
· Strong understanding of NoSQL databases such as HBase, MongoDB, and Cassandra.
· Experience working with various Hadoop data access components, including MapReduce, Pig, Hive, HBase, Spark, and Kafka.
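Spark streaming sketch: a minimal Java example of the kind of real-time streaming analytics described above. The application name, Kafka broker address ("broker:9092"), and topic name ("events") are illustrative placeholders rather than details of a specific project, and the Kafka source assumes the spark-sql-kafka connector is on the classpath.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;
    import static org.apache.spark.sql.functions.col;

    public class StreamingEventCount {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("StreamingEventCount") // illustrative app name
                    .getOrCreate();

            // Read a stream of events from Kafka (broker and topic are placeholders).
            Dataset<Row> events = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker:9092")
                    .option("subscribe", "events")
                    .load();

            // Count messages per key as a simple in-memory, real-time aggregation.
            Dataset<Row> counts = events
                    .selectExpr("CAST(key AS STRING) AS key")
                    .groupBy(col("key"))
                    .count();

            // Write the running counts to the console for demonstration.
            StreamingQuery query = counts.writeStream()
                    .outputMode("complete")
                    .format("console")
                    .start();

            query.awaitTermination();
        }
    }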
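MapReduce sketch: a minimal Java extraction-and-transformation job (a word count) illustrating the Mapper/Reducer pattern referenced above. The class names and the HDFS input/output paths passed as arguments are illustrative.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountJob {

        // Mapper: extracts tokens from each input line and emits (token, 1).
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sums the counts emitted for each token.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word-count");
            job.setJarByClass(WordCountJob.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }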
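Hive UDF sketch: a minimal Java UDF of the kind described above. The class name LowerCaseUDF and its lower-casing behavior are illustrative examples, not code from a specific project.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple Hive UDF that normalizes a string column to lower case.
    public class LowerCaseUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().toLowerCase());
        }
    }

Once packaged into a JAR, a UDF like this would typically be registered from HiveQL with ADD JAR followed by CREATE TEMPORARY FUNCTION before being used in queries.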