Site Reliability Engineer with 5+ years of experience managing complex systems in AWS. I primarily use Go and Terraform to build project components with Infrastructure as Code and Serverless methods.
Investigated and upgraded an unknown Go codebase to unblock a portfolio-wide migration; the dependency was discovered late in the migration and had to be upgraded within a few days to meet company timelines
Led the technical migration of an internal tool from inception to delivery to meet the timelines for a vendor transition; this included collaborating with dependent teams and mitigating risks via LaunchDarkly
Mentored fellow engineers on Go best practices, leading to more consistent and readable codebases
Facilitated SLO and Incident training for teams across the US/UK
Provided SRE support via an on-call rotation using PagerDuty
Managed large-scale incidents across multiple teams as the Incident Commander
Site Reliability Engineer
Jan 2020 - Dec 2023
Responsibilities:
Architected a product to inform teams of required AWS Maintenance, providing reminders, due dates, and reports for managers
Developed and maintained an automated system to tag AWS resources using Serverless (Lambda, SQS, SNS), written in Go and Terraform
Created a frontend page to surface AWS infrastructure information for teams, written in TypeScript using React
Created a service to search over past incidents to easily surface incident learnings
Created monitors, alerts, and dashboards for our products using New Relic and Sumo Logic
Wrote Terraform modules to deploy resources across multiple AWS accounts
Site Reliability Intern
May 2019 - Aug 2019
Responsibilities:
Learned the tenets of Serverless computing with a focus on AWS
Developed a system to automatically renew AWS Certificate Manager (ACM) certificates