Job Title: Data Scientist

Duration: 6 months+

Location: Remote

This position can be based in the Bay Area or fully remote.

Job Description

We are building a data science team whose mission is to discover the insights hidden in vast amounts of data and to help us make smarter predictive and prescriptive analytical decisions about the business problems at hand. Your primary focus will range from understanding business problems and performing statistical analysis to experimenting with the latest machine learning techniques (from classical data mining to deep learning), with the aim of building high-quality prediction systems integrated with our products and solutions.

Examples of the types of tasks this role may involve:

  • Develop data pipeline stages such as cleaning, validation, and wrangling for a variety of data types, including text, images, and categorical or custom data
  • Design and develop metrics/scoring pipelines using machine learning techniques
  • Develop feature extraction pipelines from raw data stored in a variety of formats
  • Design and develop feature representations backed by a variety of data stores, such as SQL databases, key-value or object storage, and knowledge graphs, for better predictions
  • Work with standard libraries such as scikit-learn, NumPy, and pandas to implement models for classification and regression tasks
  • Work with TensorFlow, Keras, PyTorch, etc. to implement custom and pre-built neural network models such as RNNs and CNNs
  • Develop internal A/B testing and multi-armed bandit or ensemble models and pipelines
  • Work in mixed programming/scripting language environments (e.g., Python, Java, C++) as application requirements dictate
  • Work within state-of-the-art MLOps/CI/CD/DevOps platforms built on Spark, Kubernetes, and Kafka for batch, streaming/real-time, or transactional distributed architectures that host model training, test, and inference pipelines


Responsibilities

  • Select features and build and optimize classifiers using machine learning techniques
  • Perform data mining and experimental analysis using state-of-the-art methods
  • Process, cleanse, and verify the integrity of data used for analysis, training, and inference
  • Collect and understand business requirements of varying degrees of clarity
  • Define and design data science techniques and pipelines that address specific business problems
  • Work with datasets of varying size and complexity, including both structured and unstructured data
  • Develop pipelines to process massive data streams in distributed computing environments such as Spark and Kubernetes/Docker microservices
  • Develop proprietary algorithms to build customized solutions that go beyond standard industry tools and lead to innovative solutions
  • Develop sophisticated visualizations of analysis output for business users
  • Provide controls/analytics for all output produced to monitor and ensure that established indicators/targets are met, both during initial development and on an ongoing basis
  • Identify opportunities for continuous improvement of the algorithms, solutions, and methodologies currently employed
  • Proactively collaborate with business partners to monitor solution health and changing requirements, and develop actionable plans to address them while optimizing for quality, usability, cost, and time-to-market, among other variables
     

Requirements

  • Bachelor's degree in Statistics, Computer Science, Mathematics, Machine Learning, Econometrics, Physics, Biostatistics, or a related quantitative discipline, plus 3 or more years of experience in an enterprise data science organization
  • Graduate degree preferred
  • Must have experience performing exploratory data analysis and visualization using state-of-the-art Python libraries such as pandas, NumPy, Matplotlib, seaborn, Plotly, and Streamlit
  • Must have experience building models/algorithms for training and inference workloads using libraries such as scikit-learn, TensorFlow, and PyTorch
  • Must have advanced expertise in Python as well as JSON and SQL; experience with other programming languages such as R, Java, and C++, and expertise in GraphQL, is preferred
  • Must have experience working with enterprise data warehouses, data marts, databases, data lakes, or other distributed or cloud-based data storage systems
  • Must have experience working in cross-functional teams and the ability to communicate results to non-technical audiences
  • Familiarity with synchronous/event-based system/data/orchestration architectures for batch, streaming/real-time, and/or transactional workloads that employ one or more of the following technologies: message queues, Kafka, RESTful microservices, Spark, Kubernetes/Docker
  • Experience with cloud platforms and SaaS environments and tools such as Azure, AWS, and GCP preferred
  • Familiarity with CI/CD/DevOps tools such as Bitbucket, Bamboo, Jira, and Confluence required
  • Experience with test-driven development, standard logging practices, and debugging techniques required
  • Work experience in Agile (Scrum) development teams required
     
             
