Technical Team Lead - Reliability Platform

Sydney, New South Wales, Australia – Full-time

Canva’s Commitment and Mission

At Canva, we celebrate diversity. We deeply believe that bringing together diversity of thoughts, perspectives and expression is key to building the best product, team and company. We look for many different skills and abilities, as well as how you can enhance Canva and our culture. So, even if you don’t think you quite meet all of the skills listed or tick all the boxes, we’d still love to hear from you! 

Our mission at Canva is to empower the world to design. Since launching in 2013, we have grown exponentially, amassing over 86 million monthly active users across 190 different countries and a team of over 3,000 people… and the best bit is that we’ve only achieved 1% of what we know we’re capable of.

Join us and design your future.

About the Reliability Platform Group


The Reliability Platform Group is responsible for providing the tools and processes to scale reliability across all Canva services. Our teams work together, and with other groups, to deliver preventive and detective tooling, processes, and best practices that uplift Canva’s reliability. We do this by driving operational excellence, reducing the impact of incidents, and providing visibility and accountability across the broader Engineering community. The group encompasses Observability, High Availability & Incident Detection, Incident Response, and Pre-Emption domains and is set to grow rapidly in the near future as we shoot for some ambitious goals.

What you'll do:

  • Drive delivery of reliability tooling projects from ideation to completion within start-up time frames to ensure Canva’s overall reliability and performance
  • Drive technical strategy and provide leadership to your team in a fast-paced innovation-focused environment
  • Be effective at (mainly internal) stakeholder management to ensure we’re working on the most impactful things
  • Develop and grow a team of reliability engineers through effective coaching, engagement and retention strategies, and strategic hiring
  • Provide technical guidance through review of technical designs and code reviews
  • Inspire a culture within the Engineering org that puts reliability first and establishes processes and policies that drive reliability within product engineering teams
  • Promote a safe and healthy culture with a focus on collaboration and open communication
  • Set up and run ongoing feedback sessions and initiatives that drive a healthy code review culture, knowledge sharing, and design showcases, and help improve processes through retrospectives

What you'll bring to the role:

  • Experience with technical and people leadership - having previously led high-performing teams where everyone is able to share their best ideas and be their best selves
  • Commercial experience developing complex, distributed web applications
  • Experience working with a mainstream programming language. However, our services and libraries are primarily written in Java 13, so a willingness to work with Java environments is a must
  • Solid understanding of resiliency techniques and patterns – load balancing, throttling, back pressure, circuit breaking, etc.
  • Disciplined coding practices, experience with code reviews and pull requests, and a creative and conceptual problem-solving approach
  • Strong communication and team collaboration skills, both written and verbal, as you will need to share knowledge, communicate and coordinate changes across multiple teams.
  • The ability to lead by example, promoting Canva’s values, no-blame mentality, and engineering values

Nice to haves:

  • Experience working with microservice architectures in large distributed cloud environments (ideally AWS). We’re hosted on AWS and leverage the tools they provide as much as possible
  • Experience with RPC frameworks such as Finagle, Thrift, or gRPC would be a huge plus, but is not required. Understanding how services communicate with each other is crucial to finding out where a failure can occur
  • Knowledge of networking protocols such as TCP, HTTP/2, WebSockets, etc. would be a big plus. The life of a request doesn’t start inside the backend web server, but rather in the browser of a user
  • Previous experience working as a reliability/chaos engineer and/or strong knowledge of the Google SRE corpus et al.

Working at Canva 

Our culture is unlike anywhere else and we design your #CanvaLife experience to empower you to do the best work of your life.  

Whether you’re in the office, working from home or choosing your own adventure, our benefits for permanent Canvanauts include: 

Equity packages for you to truly be a part of the Canva journey. 

A hybrid work model (in-office & from home), with our offices always open to you, balancing flexibility and connection.

Flexible leave so you can recharge, give back, support others or focus on your own professional development.

Inclusive parental leave policy that supports all parents and carers throughout their parenting and caring journey.

An annual Vibe & Thrive allowance. This is for you to spend on whatever will support your wellbeing and development, because you know what you need to Vibe and Thrive better than anyone.

Virtual and in-office wellness benefits including Canva University, Employee Assistance Programs and Fitness & Meditation Classes.

Canva For Good program matching your not-for-profit donations, Force for Good leave (3 paid volunteering days) and a range of sustainability and ethical initiatives to get involved in.   

We make hiring decisions based on your experience, skills and passion. Please note that interviews are conducted virtually. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.