




We are seeking an experienced **Observability SME** with deep expertise in observability architectures and leading monitoring platforms. This role is responsible for designing, implementing, and optimizing end-to-end observability solutions for applications, infrastructure, and networks. The ideal candidate will have extensive hands-on experience with platforms such as **ELK (Elasticsearch, Logstash, Kibana), Dynatrace, BMC TrueSight, and SolarWinds**, ensuring seamless monitoring, alerting, and analytics to enhance IT operations and service reliability.

**Key Responsibilities:**

* **Observability Strategy & Architecture:** Design and implement comprehensive observability solutions to monitor applications, infrastructure, and network performance.
* **Monitoring Tool Implementation & Optimization:** Deploy and fine-tune monitoring solutions using ELK, Dynatrace, BMC TrueSight, and SolarWinds.
* **Log Management & Analysis:** Establish centralized logging, log parsing, and correlation for improved event detection and troubleshooting.
* **Metrics & Performance Monitoring:** Define KPIs, dashboards, and alerts for proactive IT service monitoring.
* **Incident Management & Root Cause Analysis:** Collaborate with IT operations, DevOps, and SRE teams to diagnose and resolve performance issues.
* **Automation & Integration:** Integrate monitoring tools with ITSM platforms, AIOps solutions, and automation frameworks for enhanced efficiency.
* **Capacity Planning & Optimization:** Analyze historical trends and real-time data to optimize resource allocation and performance.
* **Stakeholder Collaboration:** Work closely with developers, network engineers, system administrators, and business units to ensure observability best practices are followed.
* **Continuous Improvement:** Stay updated on emerging observability technologies and recommend improvements to existing processes and tools.

**Qualifications:**

* Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
* **Expertise in Observability & Monitoring Platforms:** 8+ years of hands-on experience with ELK Stack, Dynatrace, BMC TrueSight, SolarWinds, and similar platforms.
* **Strong Knowledge of Infrastructure & Application Monitoring:** Experience monitoring cloud, on-premises, and hybrid environments.
* **Experience with Log & Event Correlation:** Ability to configure and analyze logs for anomaly detection and security insights.
* **Automation & Scripting:** Proficiency in scripting languages such as Python, PowerShell, or Bash for automation.
* **Cloud & DevOps Understanding:** Experience with cloud platforms (AWS, Azure, GCP) and CI/CD pipelines.
* **ITIL & Incident Management Exposure:** Understanding of ITIL processes and IT service management (ITSM) practices.
* **Networking & Security Awareness:** Knowledge of network monitoring, SNMP, and security monitoring practices.
* **Excellent Communication & Documentation Skills:** Ability to present findings, create technical documentation, and train teams on observability best practices.

**Preferred Qualifications:**

* Certifications in **Dynatrace, ELK, BMC TrueSight, or SolarWinds**.
* Experience with **AIOps, Machine Learning for Anomaly Detection, or AI-driven Observability**.
* Background in **Site Reliability Engineering (SRE) or DevOps**.
* Familiarity with Infrastructure as Code (IaC) tools such as Terraform or Ansible.


