Lead AI Monitoring Engineer

Company Details


Job Description


Bachelor’s degree in an appropriate discipline or, in the absence of a bachelor’s degree, 2-8 years of related experience.

Responsibilities:

  • Schedule data pipelines to run at set times or in response to an event.
  • Monitor data pipelines for failures, deadlocks, and long-running tasks.
  • Use the ITSM system to manage and track incidents and problems.
  • Produce monitoring reports that include the time of each run, the end-to-end time taken, and failure reasons.
  • Support IT service owners in providing a monitoring plan.
  • Escalate and assist in service outage investigations.
  • Administer monitoring tools.

• Experience with monitoring tools

• 2+ years of experience in IT monitoring services

• 1-2 years of experience with Linux, Windows, and scripting (PowerShell and Python)

• Knowledge of networking

• ServiceNow

Knowledge of at least one of the following monitoring tools:

• Nagios, Airflow, AWS CloudWatch, Datadog


Experience with Nagios monitoring tools:

• Review business requirements and improve monitoring of the Nagios platform
• Good hands-on experience in Nagios front-end configuration, and in identifying and fixing issues
• Knowledge of scripting to create new templates
• Configure dashboards, alerts, and different types of infrastructure monitoring
• Expertise in defining, monitoring, and reporting on SLAs
• Expertise in detecting and verifying application performance issues and determining their scope

Strong diagnostic and analytical abilities.

Tagged as: airflow, datadog, Nagios, AWS CloudWatch
