Job Title: PySpark / Google Cloud Platform Developer

Location: Hartford, CT

Duration: 1 year

Job Description:

  • Managing cloud components, including cluster management, Kubernetes or other containerized services, storage, and workspace management
  • Spark tuning or equivalent performance re-engineering on a big data platform (see the tuning sketch after this list)
  • Selection of infrastructure components such as GPUs or CPUs based on the job/compute needs
  • Working with platform teams (Azure, GCP, or AWS) in resolving technical platform issues impacting big data applications
  • Translating analytical problems into structured programs, including PySpark or Scala
  • Designing data models and solutions for analytical and reporting use cases
  • Designing and building a cost model to monitor cloud usage
  • Working as a lead on a large-scale cloud implementation
  • Migrating RDBMS pipelines to Cloud/Hadoop using PySpark or Scala on Spark (see the migration sketch after this list)
  • Complex SQL constructs and code optimization on RDBMS, Teradata preferred
  • Designing and implementing end-to-end solutions and frameworks using machine learning or NLP libraries such as scikit-learn, spaCy, PyTorch, or Spark NLP (see the pipeline sketch after this list)
  • Proficiency with tools to automate CI/CD pipelines, such as Jenkins, Git, or Control-M
  • Data architecture and data modeling, including creating Semantic Layer Data Models, preferably on Teradata
  • "Big data" platforms including Hadoop (preferably Azure or GCP) and technologies including Spark, Airflow, Kafka, Hbase, Pig, NoSQL, etc.
  • Traditional relational data warehouse technologies, such as Oracle, Teradata, or DB2, including analytical SQL functions (see the windowing sketch after this list)
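
To make the Spark tuning bullet concrete, here is a minimal PySpark sketch of the knobs that kind of re-engineering typically touches. Every name and value here (the app name, partition counts, executor sizing) is an illustrative assumption, not a recommendation for any particular workload.

```python
from pyspark.sql import SparkSession

# Illustrative session-level tuning; real values depend on data volume
# and cluster shape.
spark = (
    SparkSession.builder
    .appName("tuning-sketch")                       # hypothetical app name
    .config("spark.sql.shuffle.partitions", "400")  # sized to the data, not the 200 default
    .config("spark.sql.adaptive.enabled", "true")   # let AQE coalesce skewed shuffles
    .config("spark.executor.memory", "8g")          # illustrative executor sizing
    .config("spark.executor.cores", "4")
    .getOrCreate()
)

df = spark.range(1_000_000)
# Repartitioning ahead of a wide operation is a common re-engineering step.
df = df.repartition(200, "id")
print(df.rdd.getNumPartitions())  # 200
```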
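
The RDBMS-to-cloud migration bullet often reduces to a JDBC read landed as partitioned Parquet on cloud storage. The sketch below assumes a Teradata source and a GCS target; the URL, table, credentials, column names, and bucket path are all hypothetical placeholders, and the Teradata JDBC driver jar must be on the classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-sketch").getOrCreate()

# Parallel JDBC read; all connection details below are placeholders.
source = (
    spark.read.format("jdbc")
    .option("driver", "com.teradata.jdbc.TeraDriver")  # driver jar must be on the classpath
    .option("url", "jdbc:teradata://warehouse.example.com/DATABASE=sales")
    .option("dbtable", "sales.orders")
    .option("user", "etl_user")
    .option("password", "***")      # use a secret manager in practice
    .option("numPartitions", 8)     # split the read into 8 parallel queries
    .option("partitionColumn", "order_id")
    .option("lowerBound", 1)
    .option("upperBound", 10_000_000)
    .load()
)

# Land as Parquet on GCS, partitioned for downstream pruning.
(source.write
    .mode("overwrite")
    .partitionBy("order_date")      # assumes the table has this column
    .parquet("gs://example-bucket/landing/orders"))
```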
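
The analytical SQL functions in the warehouse bullet are largely window functions, which port cleanly between Teradata-style warehouses and Spark SQL. A minimal sketch, with a synthetic table and hypothetical column names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("windowing-sketch").getOrCreate()

# Synthetic stand-in for a warehouse table.
spark.createDataFrame(
    [("2024-01-01", "A", 100.0), ("2024-01-02", "A", 150.0),
     ("2024-01-01", "B", 80.0)],
    ["order_date", "region", "amount"],
).createOrReplaceTempView("orders")

# Running total and rank per region: typical analytical SQL constructs.
spark.sql("""
    SELECT order_date,
           region,
           amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY order_date)  AS running_total,
           RANK()      OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
    FROM orders
""").show()
```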
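
As one reading of the end-to-end ML bullet, here is a minimal pipeline in scikit-learn (one of the libraries named above). The data is synthetic and the stages are illustrative; a production framework would add feature engineering, evaluation, and deployment stages around the same structure.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Synthetic labeled text; a stand-in for real NLP training data.
texts = ["claim approved quickly", "claim denied after review",
         "payment approved", "request denied"]
labels = [1, 0, 1, 0]

model = Pipeline([
    ("tfidf", TfidfVectorizer()),   # text -> sparse TF-IDF features
    ("clf", LogisticRegression()),  # simple baseline classifier
])
model.fit(texts, labels)
print(model.predict(["approved today"]))
```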
             
