Data Scientist

VUPICO (a Hortonworks system integrator partner) is seeking a senior Hadoop consultant with top-notch technical skills to define, design, and build a multi-tenant Hadoop cluster that meets the needs of self-service analytics as well as scheduled jobs. You will be responsible for defining the self-service analytics layer and the hosting model, and will ensure that the Hadoop infrastructure is implemented following best practices, aligned with the enterprise architecture roadmap.

    Responsibilities

    • Define the self-service analytics layer for Hadoop
    • Define the hosting model to meet scalability, performance, high availability, and disaster recovery requirements
    • Identify a tool for the self-service analytics user interface (HUE or alternatives)
    • Identify tools for query governance (weeding out inefficient queries)
    • Identify a tool for monitoring (servers, Hadoop distribution, job SLAs)
    • Define an enterprise scheduler for scheduling Hadoop jobs
    • Define the security layer for Hive and HBase tables, and encryption for HDFS data
    • Present architecture and design vision and strategy to our customers at all levels, from executives and technical management to individual contributors
    • The architect will take the lead in process design/redesign, solution architecture design, infrastructure design and planning, system upgrades and migrations, acceptance testing, and maintenance strategy development. This role may engage independently with the client to provide architectural advice and oversight, or may participate on the project team as a senior technical lead
    • Ability to see and present the big picture and offer/architect solutions to improve it, specifically around business analytics systems
    • Direct experience working on a large-scale data warehouse or Hadoop platform is a very strong asset
    • Intermediate programming and computation skills, preferably with Perl, Python, Java, Bash/shell scripting, and/or C/C++
    • Strong SQL programming skills
    • System administration of Linux is desired
    • RDBMS-related product experience is strongly preferred (Teradata, SAP, Netezza, Vertica, Oracle, DB2, Postgres, etc.)
    • Understanding of traditional DW/BI components (ETL, Staging, DW, ODS, Data Marts, BI Tools)
       

    Delivery Skills:

    • Minimum 2-3 years of experience deploying and administering multi-petabyte Hadoop clusters on AWS or other cloud solutions
    • Well versed in Hadoop challenges related to scaling and self-service analytics
    • Well versed in the Cloudera and Hortonworks distributions
    • Well versed in Hive, Spark, HBase, and the latest developments in the Hadoop ecosystem
    • Excellent knowledge of Hadoop integration points with enterprise BI and EDW tools
    • Experience in complex environments building and deploying MPP databases, the Hadoop ecosystem, in-memory data grids, Java EE, BI/DW, or equivalent enterprise scenarios
    • This position requires deep, architect-level knowledge of and experience with building and deploying systems leveraging Hadoop and database technologies
    • Knowledge of Java and application development is important
    • Proven track record of successfully leading a team of developers and/or consultants
    • Strong customer-facing and relationship-building skills, including strong listening and question-based knowledge-gathering skills
    • Readiness to travel globally

Apply Now:

* Please upload your resume (max size: 5 MB)