This position can be remote, but all applicants must reside in the United States.
SafeMoon is a leading innovative blockchain tech company serving over 2.5 million holders. We are part of an alternative financial system that enables trading on peer-to-peer chain networks without the intervention of centralised systems. With multiple successful products and a community-driven, fair-launched DeFi token, our mission is to increase economic freedom around the world.
We are looking for a Site Reliability Engineer to work directly with the Head of Infrastructure and build out our highly available and scalable infrastructure function.
What will you be doing?
- Build software and systems to manage platform infrastructure and applications to increase SafeMoon’s automation across our IT operations
- Ensure that the underlying infrastructure is running smoothly and that all systems and tools are working as expected
- Participate in an on-call rotation, ensuring all issues are responded to and managed in accordance with SLAs
- Operate and build infrastructure in Kubernetes (EKS) and Terraform
- Participate in system design consulting, platform management, and capacity planning
- Debug production issues across a variety of services and products within SafeMoon
- Plan and execute disaster recovery
- Develop CI/CD/CD processes with a GitOps focus to empower boring deployments for our devs
What do you need to have to be successful?
- Demonstrated prior experience working as an SRE is essential
- Strong and demonstrable documentation and communication skills are critical
- Bachelor’s degree in Computer Science
- Rich computer science background, specifically with UNIX-based systems
- Demonstrated hands-on experience with private/public cloud computing, including infrastructure, storage, platforms and data management, is required
- Constant desire to learn and improve within the infra space
- Understanding of networking protocols, specifically TCP/IP, HTTP and DNS
- Demonstrated ability to quickly identify the root cause of critical issues and resolve them by looking across multiple layers (OS, network, virtualization, application/DB and storage)
- A systems-focussed mindset in which edge cases, failure modes, behaviors and specific implementations are always at the forefront of your decision making
- Demonstrated experience with Nginx or Apache, HAProxy, Docker, K8s, Terraform or similar technologies
- Experience with an issue tracking tool like Jira and a documentation tool like Confluence
- Some experience with or exposure to building an LDAP infrastructure is advantageous
- AWS certifications (CCP, CSA) and K8s certifications (CKA, CKAD) would be advantageous
- Experience with modern programming languages such as Go and Python
You may be required to complete a short (30-minute) coding test prior to being short-listed for
Our teams are rewarded for their effort with highly competitive pay and benefits, an excellent team-focussed culture, and a fun and dynamic work environment.
We look forward to hearing from you!