Use SQL to analyze and categorize errors in code that migrates Hadoop datasets to AWS. Design and develop a Python-based pipeline that transfers categorized Hadoop datasets to the AWS raw zone based on error criteria. Participate in the ongoing execution of system migration, maintenance, and decommissioning of the Hadoop platform. Engineer and automate data pipelines using Python scripting and Control-M orchestration to process, transfer, and ingest data across diverse platforms. Design and implement Python scripts to achieve code coverage targets, enhancing software quality and reliability. Analyze and resolve dataset failures during the Hadoop-to-AWS migration, automating rerun processes with a Python pipeline. Identify and implement automation solutions that optimize workflows, minimize manual effort, and boost operational efficiency. Author Terraform scripts to provision and manage cloud infrastructure for efficient deployment. Work under supervision. Travel and/or relocation to unanticipated client sites throughout the USA is required.
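The Terraform provisioning duty above might look like the following minimal fragment. The bucket name and tags are hypothetical placeholders, not an actual project configuration.

```hcl
# Hypothetical S3 bucket serving as the AWS raw zone for migrated datasets.
resource "aws_s3_bucket" "raw_zone" {
  bucket = "example-hadoop-migration-raw-zone"

  tags = {
    Purpose = "hadoop-migration-raw-zone" # illustrative tag only
  }
}
```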
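The error-categorization and rerun-automation duties above can be sketched as a minimal Python example. All rule names, categories, and the raw-zone key layout here are hypothetical illustrations; an actual pipeline would pull error records via SQL queries, move data with a client such as boto3, and be scheduled through Control-M.

```python
import re

# Hypothetical error categories and matching patterns; in practice these
# rules would be derived from SQL-based analysis of migration error logs.
ERROR_RULES = [
    ("schema_mismatch", re.compile(r"schema|column mismatch", re.I)),
    ("permission", re.compile(r"access denied|permission", re.I)),
    ("transient", re.compile(r"timeout|connection reset", re.I)),
]

def categorize(error_message: str) -> str:
    """Map a migration error message to a category ('unknown' if no rule matches)."""
    for category, pattern in ERROR_RULES:
        if pattern.search(error_message):
            return category
    return "unknown"

def raw_zone_key(dataset: str, category: str, prefix: str = "raw-zone") -> str:
    """Build the S3 key under which a categorized dataset would be staged.

    The prefix/category/dataset layout is an assumption for illustration;
    the actual upload would use something like boto3's s3.upload_file.
    """
    return f"{prefix}/{category}/{dataset}"

def plan_rerun(failures: dict) -> list:
    """Return the datasets whose failures look transient and are safe to rerun."""
    return [ds for ds, msg in failures.items() if categorize(msg) == "transient"]

if __name__ == "__main__":
    failures = {
        "sales_2023": "Connection reset by peer during transfer",
        "hr_master": "Access denied: missing IAM role",
    }
    for ds, msg in failures.items():
        print(ds, "->", raw_zone_key(ds, categorize(msg)))
    print("rerun candidates:", plan_rerun(failures))
```

Keeping the categorization rules in a single table makes the rerun policy easy to audit and extend as new failure modes appear during the migration.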
Six (6) months of experience working with Python and SQL is required. Travel and/or relocation to unanticipated client sites within the USA is required. International travel is not required. The frequency of travel is not currently known, as it depends on client and project requirements that cannot be anticipated in advance. The employer provides information technology services to various clients across the USA, and implementing projects will therefore require such travel.