Implement and maintain comprehensive monitoring solutions for mission-critical and financial applications (transaction processing, funds transfer, direct deposit, web services, mobile apps).
Analyze and troubleshoot complex alerts to identify potential issues and proactively prevent outages.
Lead incident response for high-severity issues, coordinating with internal teams to diagnose and resolve problems quickly.
Utilize communication tools to keep stakeholders informed of issue status and resolution progress.
Conduct thorough Root Cause Analysis (RCA) to identify underlying causes of incidents and implement preventative measures.
Identify recurring issues and implement solutions to prevent future incidents.
Work collaboratively with stakeholders to define and maintain SLAs for client services.
Analyze service delivery data and generate detailed reports for senior management.
Develop and run automation scripts to streamline monitoring, alerting, and incident management tasks.
Perform complex data analysis using databases and logs, ensuring data accuracy and reconciliation.
Manage and maintain various matrices and data sets to support daily operations.
Clearly document incidents, problems, and resolutions for knowledge sharing and future reference.
Effectively communicate with all levels of staff, both technical and non-technical, to ensure clear understanding of issues and resolutions.
We are looking for
Education: Bachelor's degree in Computer Science, Software Engineering, Information Technology, or a similar field (or equivalent experience).
Experience: 4+ years
Skills
Understanding of ITSM frameworks like ITIL for efficient service delivery.
Familiarity with change management processes to ensure smooth implementation of fixes and updates.
Knowledge of software development concepts, algorithms, and programming languages.
Expertise in monitoring tools and methodologies.
Extensive experience with incident management frameworks (ITIL, etc.).
Strong analytical and problem-solving skills.
Proficiency in SQL and database management tools.
Excellent communication, collaboration, and documentation skills.
Experience with Solaris and Linux platforms.
Experience with scripting languages for automation (Python, PowerShell, etc.).
Experience working in a fast-paced, 24/7 environment.