The Sr Tech Ops Engineer will focus on minimizing the impacts of production incidents while ensuring the organization learns quickly to become more resilient.
The Sr Tech Ops Engineer is accountable for tracking and managing the restoration and communication of incidents promptly. The role is responsible for managing the processes, guidelines and tools related to Major and other high-priority Incidents, performing Root Cause Analysis, and proactively enhancing related tools and processes for monitoring and alerting. The Sr Tech Ops Engineer will establish service level objectives, indicators and agreements (SLOs, SLIs and SLAs) for critical systems and channels, and will champion automated monitoring and data aggregation to streamline the reporting of metrics.
This role will record and manage problems through to resolution by performing root cause analysis, coordinating third-party analysis, and communicating with stakeholders to ensure that workarounds and solutions are suitable. The role supports EQ Bank, Equitable Bank, the direct-to-consumer digital bank channels and core banking applications, as well as any other Information Technology incidents and problems with a high or critical impact on the business.
Determine the severity of an incident that has been raised by the business (per risk definition)
Work in partnership with senior IT subject matter experts and business counterparts to identify opportunities that drive business value and improve effectiveness
Host both technical and managerial conference calls, and facilitate effective incident management throughout the incident lifecycle
Spin up command centres (virtual or physical) based on incident priority, including call-tree escalations to Level 3 experts as needed
Own the creation and distribution of succinct communications on command centre progress and activities, for both front-line technical staff and executive leaders
Gather required information from business and technology teams to enable effective Root Cause Analysis
Produce and maintain executive-level reporting of incidents, including number of customers impacted and restoration timing
Identify the underlying root cause and the triggers of a Problem and initiate the most appropriate and economical Problem solution or temporary workaround
Manage processes to ensure on-call team schedules, systematic call-trees, and distribution lists are accurate and up to date
University degree in technology-related field required, or equivalent work experience
~ Knowledge of Java software delivery or software development, including multiple frameworks and standards such as Hibernate, Spring MVC, Spring Security, SAML, OAuth, OIC
~ 6+ years of experience in a corporate technology SDLC function (developer, support, architect, etc.)
~ 4 years of experience with technical incident resolution or product defect resolution
~ Strong written and verbal communication, with an ability to articulate issues simply, concisely, accurately and clearly for both technical working level teams and executives
~ Analytical mind capable of managing numerous information sources and providing meaningful data analysis reports to senior management
~ Solid working knowledge of Jira, Jira Service Desk and Confluence is required
~ Strong attention to detail, keen problem-solving skills, with experience performing impact and root cause analyses, recommending solutions, and supporting resolution efforts