Job Description :
Coding knowledge and experience in several languages: for example, R, Python, SQL, Java, C++, etc.
Knowledge and experience in statistical and data mining techniques: generalized linear model (GLM)/regression, random forest, boosting, trees, text mining, hierarchical clustering, deep learning, convolutional neural network (CNN), recurrent neural network (RNN), T-distributed Stochastic Neighbor Embedding (t-SNE), graph analysis, etc.
A specialization in text analytics, image recognition, graph analysis or other specialized ML techniques such as deep learning, etc., is preferred.
Experience of working across multiple deployment environments including cloud, on-premises and hybrid, multiple operating systems and through containerization techniques such as Docker, Kubernetes, AWS Elastic Container Service, and others.
Experience with distributed data/computing and database tools: MapReduce, Hadoop, Hive, Kafka, MySQL, Postgres, DB2 or Greenplum, etc.
Collaborate with ML operations (MLOps), data engineers, and IT to evaluate and implement ML deployment option
Problem Analysis and Project Management
Data Exploration and Preparation
Data Collection and Integration
Data Exploration and Preparation
Machine Learning and Statistical Modelling
Apply statistical analysis and visualization techniques to various data, such as hierarchical clustering, T-distributed Stochastic Neighbor Embedding (t-SNE), principal components analysis (PCA)