Job Description:

The Advanced Analytics Platform Administrator is responsible for supporting our analytics strategy and solutions across critical business areas, using technologies in the disciplines of Database, Big Data, Cloud, Data Science, Artificial Intelligence, Machine Learning and Data Visualization. This role will work as a member of a multi-functional team and participate in the administration, design, development, and deployment of analytic solutions.
The candidate must have strong technical, analytical and problem-solving skills. The team member will have exposure to tool administration, installation, configuration, code deployment, security administration, batch monitoring, data analysis, querying and troubleshooting. This position will have access to and ensure protection of confidential data. Relocation is not available for this role.
Responsibilities:
• Administer, maintain and deploy analytical solutions using specialized cluster compute/database/development platforms:
o Hortonworks/Cloudera Hadoop ecosystem (Ambari, Cloudera Manager, HDFS, YARN, MapReduce, Hive, Sqoop, Pig, Oozie, ZooKeeper, Spark, Ranger, Knox etc.).
o Micro Focus Vertica Analytics Platform.
o Microsoft Azure (ADLS, Databricks, Data Factory, Integration Runtime, Power BI etc.).
o Python, R, Anaconda Navigator etc.
• Expertise in product upgrades and migrations (e.g. Hortonworks Data Platform to Cloudera Data Platform (CDP DC Private Cloud)).
• Administer Data Lake Security using Ranger, Knox, Active Directory and Kerberos in a Multifactor Authentication Setup.
• Develop streamlined data pipelines and workflows to support custom workloads while ensuring data quality standards, and automate them using CA Autosys R11, Python, Hadoop (e.g. Sqoop, Hive, Spark), Vertica (e.g. Bulk Loader) and Microsoft Azure (IR, ADF, Databricks, Azure DevOps, MLflow and MS SQL Server).
• Administer and monitor Data Lake batch and near real-time data processing capabilities.
• Integrate the Data Lake with Natural Language Processing and visualization platforms (e.g. Tableau, Power BI) to provide multiple data streams that enable rich visualizations for corporate users.
• Expertise in emerging tools within Advanced Analytics, Big Data and AI technologies including cloud solutions (Azure).
• Working knowledge of Kafka/NiFi, Spark, Storm and other data engineering tools. Experience working with NoSQL databases like HBase and Cassandra.
• Knowledge of statistical, mathematical and predictive modelling techniques.
• Query, retrieve, integrate and prepare data for profiling and analysis, and to support the development of advanced data science models by our Data Scientists.
• Analyze/Profile various data sets to quickly identify sensitive fields and help in formulating rules for data anonymization and tokenization.
• Collecting descriptive statistics like min, max, count and sum. Collecting data types, length and recurring patterns. Tagging data with keywords, descriptions or categories.
• Performing data quality assessment and assessing the risk of performing joins on the data. Discovering metadata and assessing its accuracy.
• Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies, and performing inter-table analysis.
• Build infrastructure to support Advanced Analytics and Big Data driven projects/solutions. Assist in installation, configuration and maintenance of Big Data/Cloud environments (Hortonworks/Cloudera Hadoop, Micro Focus Vertica, Microsoft Azure etc.) and manage efficient configuration of all platforms. Analyze all platform-level changes, monitor their impact and provide appropriate technical solutions to resolve issues efficiently. Administer configuration of all software services in various environments. Provide an efficient interface with various teams and appropriate technical support to all teams working on various platforms.
• Provide Management with regular and timely status reports and benchmarking metrics of projects, issues, system implementations and upgrades.
• Communicate the status of issues to both the project team and the business team in a timely manner.
• Create documentation for project status reports, timelines, project issues and risks, change requests, requirements, design, testing plans and test cases as needed.
• Define and document application best practices, required standards, processes, and guidelines regarding the use and administration of the BI tools.
• Serve as the technical interface with IT Local Landscape teams and Enterprise Data Warehouse teams. Work with IT Local Landscape teams, DIAA Teams, as well as various SMEs and representatives from vendors to identify and resolve issues and remove obstacles in achieving the Advanced Analytics initiatives.
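
For illustration only, the data profiling tasks listed above (descriptive statistics, inferred data types, lengths, recurring patterns and key candidates) might look roughly like the following minimal Python sketch. The function name profile_column and the shape notation (digits shown as 9, letters as A) are assumptions made for this example and are not part of any tool named in this posting.

```python
import re
from collections import Counter


def profile_column(values):
    """Profile one column given as a list of strings (None or "" = missing).

    Hypothetical helper for illustration; not taken from any platform above.
    """
    present = [v for v in values if v is not None and v != ""]

    # Try to read every present value as a number.
    numeric = []
    for v in present:
        try:
            numeric.append(float(v))
        except ValueError:
            pass
    all_numeric = bool(present) and len(numeric) == len(present)

    # Reduce each value to a "shape": digits become 9, letters become A.
    shapes = Counter(
        re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", v)) for v in present
    )

    return {
        "count": len(present),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "inferred_type": "numeric" if all_numeric else "text",
        "min": min(numeric) if all_numeric else None,
        "max": max(numeric) if all_numeric else None,
        "sum": sum(numeric) if all_numeric else None,
        "max_length": max((len(v) for v in present), default=0),
        "top_pattern": shapes.most_common(1)[0][0] if shapes else None,
        # A column with no missing values and no duplicates could be a key.
        "key_candidate": bool(present) and len(set(present)) == len(values),
    }


if __name__ == "__main__":
    print(profile_column(["123-45-6789", "987-65-4321", None]))
```

A dominant shape such as 999-99-9999 is the kind of recurring pattern that could flag a field for review under anonymization or tokenization rules; the decision itself would still rest with the analyst.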

• Analytical Skills - Ability to collect and query data, establish facts and identify trends and variances.
• Ability to query databases and perform data analysis.
• Good understanding of database administration processes and concepts: modeling, querying, security administration, etc.
• Working knowledge of database software (DB2, SQL Server, Rapid SQL, etc.).
• Must have good SQL writing skills.
• Knowledge of information technology systems, infrastructure and operations.
• Expertise in Linux System Administration and management.
• Knowledge of latest big data technologies.
• Knowledge of Hadoop and Vertica ecosystem tools in a Linux environment.
• Strong people skills, including the ability to interact with employees at all levels.
Good to Have:
• Expertise in cloud technologies like Microsoft Azure (HDInsight, Databricks, R Server, SQL Data Warehouse, Cosmos DB, Azure ML etc.), AWS etc.
• Experience with Python, Shell Scripting, Java.