Job Description:
This role is for real-time systems, not after-the-fact analysis as in marketing.

Some hands-on coding experience with Java and Scala.

Requires real-time machine learning and the ability to turn algorithms into code.

Not static-analysis types: a pure Data Scientist.

People who have worked at companies such as Netflix, Amazon, or Intuit.

Overview:

We are a highly motivated group of Big Data engineers, Data Scientists and Applications Engineers, working in small agile groups to solve sophisticated, high-impact problems. We are building systems that ingest, model and analyze massive flows of data from online, social, mobile and offline commerce and user activity to set key business attributes for millions of products in real time. Underneath it all, we use cutting-edge machine learning, data mining and optimization algorithms to analyze this data on top of Hadoop/HBase/Hive. We then build compelling data visualizations and interactive dashboards to showcase our work internally and externally.

Your work will be immediately visible to millions of people and you will have a direct impact on the business goals of the Fortune #1 company. If you talk, speak and think data, we want to talk to you. Come join our small team and be part of this exciting journey.

1. Through your work, have a direct impact on the revenue and bottom line of the entire company.
2. Research, design and implement data models and cutting-edge statistical and optimization models to drive pricing.
3. Collaborate closely with product teams to deliver valuable analyses to our consumers and guide product enhancements.
4. Help answer the thorniest business questions, including ones we didn’t think we had.

Essential Functions:
Search processes billions of queries for millions of products on the company's sites and mobile apps worldwide. We mine structured and semi-structured data from product catalogs, the social web, transactions, query logs, etc. at an unprecedented scale. We work on big data problems and cutting-edge relevance algorithms from information retrieval, machine learning, and ranking to deliver a high-availability, low-latency service that directly impacts business metrics.

A Data Scientist is responsible for analyzing large data sets to develop custom models and algorithms to drive business solutions. Data Scientists work on project teams to provide analytical support to eCommerce projects (for example, email targeting, business optimization, consumer recommendations). Data Scientists are responsible for building large data sets from multiple sources in order to build algorithms for predicting future data characteristics. Those algorithms will be tested, validated, and applied to large data sets. Data Scientists are responsible for training the algorithms so they can be applied to future data sets and provide the appropriate search results. Data Scientists are responsible for researching new trends in the industry and utilizing up-to-date technology (for example, HBase, MapReduce, LAPACK, Gurobi) and analytical skills to support their assigned project.
Build complex data sets from multiple data sources, both internal and external.
Build learning systems to analyze and filter continuous data flows and to support offline data analysis.
Combine data features to determine search models.
Conduct advanced statistical analysis to determine trends and significant data relationships.
Demonstrate up-to-date expertise and apply it to the development, execution, and improvement of action plans.
Develop custom data models to drive innovative business solutions.
Develop models of current state in order to determine needed improvements.
Model compliance with company policies and procedures and support the company mission, values, and standards of ethics and integrity.
Provide and support the implementation of business solutions.
Research new techniques and best practices within the industry.
Scale new algorithms to large data sets.
Train algorithms to apply models to new data sets.
Utilize system tools including MySQL, Hadoop, Weka, R, Matlab, and ILOG.
Validate models and algorithmic techniques.
Work with cross-functional partners across the business.

Qualifications:

PhD in computer science or a similar field, or MS with 2-5 years of related experience.
Deep knowledge of machine learning, information retrieval, data mining, statistics, NLP or a related field.
Good functional coding skills in C++ or Java (Java is highly preferred); candidates must be capable of spending up to 10% of the work day writing production code in C++, Java, Hadoop, or Hive.
Expert-level knowledge of a scripting language such as Python or Perl.
Superior ability to analyze and interpret the results of product experiments.
Proven experience working with statistical languages such as R.
Experience working with large data sets and distributed computing tools (MapReduce, Hadoop, Hive, Spark, etc.) is a plus.
Strong communication skills, both written and verbal.
Willing to learn new technologies.
Self-starter, quick learner, and keen observer with an eye for detail who relishes challenges.