Senior Manager - L1 Support
India, IN
About Foundever:
Foundever™ is a global leader in the customer experience (CX) industry. With 170,000 associates across the globe, we are the team behind the best experiences for more than 750 of the world’s leading and digital-first brands. Our innovative CX solutions, technology, and expertise are designed to support operational needs for our clients and deliver a seamless experience to customers in the moments that matter.
Job Summary:
The Senior Manager – L1 Support leads Foundever’s 24×7×365 Global Network Operations Center (NOC), the first line of defense for Network, Telecom (voice / contact center), and Systems Infrastructure across a global Customer Experience footprint. This role owns end-to-end accountability for frontline event detection, alert triage, ticket creation and routing, first-touch remediation, and timely escalation to L2 / L3 engineering pods, ensuring no event goes unactioned and no incident dwells unattended.
The incumbent runs a high-tempo, multi-shift operation built on disciplined ticket hygiene, robust escalation matrices, sharp communication, and continuous improvement of runbooks, KB articles, and event-correlation logic. Working in close partnership with the Senior Manager – Systems Operations, L2 / L3 engineering leads, and the Incident Management function, this leader is the operational heartbeat of Technical Operations — the role that decides whether the next P1 is caught in 60 seconds or 60 minutes.
Key Responsibilities:
Pillar 1 • Network L1 Operations
- Monitor the global enterprise network estate (LAN, WAN, SD-WAN, wireless, firewalls, load balancers, MPLS / Internet circuits, AWS / Azure cloud links) on a 24×7 basis through observability tooling and event-correlation platforms.
- Triage network alerts within defined MTTA windows, perform first-touch diagnostics, execute runbook-driven recovery actions, and route unresolved events to Network L2 / L3 with complete diagnostic context.
- Open, classify, and groom every network ticket with accurate priority, impact assessment, CI mapping, and customer / site identification, ensuring zero orphaned alerts and clean ticket trails for downstream RCA.
- Coordinate carrier and ISP engagements for last-mile outages, MACDs, and circuit events: logging tickets with carriers, tracking ETRs, and chasing SLA-bound restoration.
- Lead initial bridge management for network-impacting Severity-1 / Severity-2 events until L2 / Incident Management ownership is established, ensuring rapid stakeholder communication and tight situational awareness.
Pillar 2 • Telecom & Voice L1 Operations
- Monitor contact center voice platforms (NICE CXone, Genesys Cloud / Engage, SBCs, SIP trunks, recording infrastructure) for availability, voice-quality degradation, recording gaps, and integration health.
- Execute frontline triage on voice-quality events (MOS, jitter, packet loss, latency), call-drop spikes, IVR / ACD anomalies, login failures, and queue-stalled conditions; escalate confirmed issues with structured context.
- Action standard requests (agent provisioning, skill assignments, queue / treatment toggles, recording flag changes) per the change-controlled L1 work catalog and within defined SLAs.
- Coordinate with WFM, Operations, and Client Services on voice-impacting events, providing real-time updates, business-impact translation, and recovery ETAs.
- Engage carrier and OEM support (PSTN providers, NICE / Genesys vendor support) for L1-eligible incidents and track ticket progress to closure.
Pillar 3 • Systems Infrastructure L1 Operations
- Monitor the global server estate (Windows / Linux, physical and virtual), hypervisor clusters, core services (AD, DNS, DHCP), storage, backup jobs, and patching status.
- Execute runbook-driven first response: service restarts, queue clears, disk-space remediation, failed-job re-runs, backup-failure escalation, and password / account hygiene actions per defined L1 scope.
- Govern alert hygiene on systems monitoring: suppress noise, validate thresholds with L2 / L3, and flag chronically noisy or false-positive alerts for engineering remediation.
- Track patching and backup job health on a daily cadence, escalate misses and failures with structured ticketing, and own L1-side closure tracking against SLA.
- Coordinate OEM and MSP engagement for hardware faults, vendor break-fix tickets, and on-site dispatches, managing ticket lifecycle, parts logistics, and engineer access.
Cross-Cutting Responsibilities
- NOC Shift Operations: Govern roster design, shift hand-overs, coverage continuity, leave / shrinkage planning, and follow-the-sun handshakes across geographies for unbroken 24×7×365 service.
- Ticket Hygiene & SLA Governance: Own ITIL-aligned ticket discipline (priority, impact, category, CI, work-notes, customer comms) and operational SLAs on logging, response, and routing across all three towers.
- Escalation Discipline: Maintain accurate, regularly tested escalation matrices to L2 / L3 / Incident Management / Vendor / Executive paths; drive right-first-time escalation routing and time-bound handoffs.
- Knowledge & Runbook Management: Curate the L1 KB and runbook library; ensure every recurring alert is backed by a tested runbook and every new pattern feeds back into the KB within defined SLA.
- Communication Discipline: Drive consistent, audience-appropriate updates during incidents (operations bridges, client comms, executive notifications) with clear ETRs and accountability.
- Tooling & Automation: Partner with engineering and AIOps teams on alert consolidation, event correlation, auto-ticketing, and runbook automation to reduce manual toil and false-positive noise.
- People Leadership: Lead Shift Leads / Team Leaders and a multi-shift L1 engineer population; own hiring, onboarding, certification roadmaps, coaching, performance, attrition control, and succession.
- Process & Compliance: Operate within SOC 2, HIPAA, PCI-DSS, ISO 27001, and GDPR control frameworks; deliver audit evidence on access, change, ticket integrity, and shift logs.
- Continuous Improvement: Run weekly trend reviews, identify top-N noisy alerts, drive eliminate-or-engineer outcomes with L2 / L3, and measurably reduce ticket volume and toil quarter-over-quarter.
Required Qualifications
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related discipline.
- 10+ years of progressive experience in IT Operations / NOC environments, with at least 4 years leading multi-shift L1 teams in a 24×7 setting.
- Demonstrated working knowledge across all three towers: Networking (LAN / WAN / SD-WAN, firewalls, routing / switching basics), Telecom (contact center voice platforms, SIP, SBCs, call-quality fundamentals), and Systems Infrastructure (Windows / Linux server estate, hypervisors, AD / DNS / DHCP).
- Proven command of NOC operating mechanics: shift design, escalation discipline, ticket hygiene, runbook execution, and bridge coordination during Severity-1 events.
- Strong fluency in ITIL v4 (Incident, Event, Request, and Change processes) with hands-on operation of ITSM platforms (ServiceNow, Remedy, or equivalent).
- Active certifications spanning at least two of: ITIL v4 Foundation / Practitioner, Cisco CCNA, Microsoft MCSA / Azure / AWS Associate, NICE / Genesys L1 product training.
- Demonstrated people-leadership experience managing Shift Leads / Team Leaders and 40+ FTE NOC teams in a regulated, customer-facing environment.
Preferred Qualifications
- Direct BPO / CX industry exposure supporting multi-client, multi-LOB contact center operations.
- Hands-on familiarity with observability stacks: Datadog, Dynatrace, SolarWinds, Splunk, ThousandEyes, or equivalent.
- Exposure to runbook automation tooling (Rundeck, ServiceNow Orchestration, Ansible Tower) and basic scripting literacy (PowerShell, Python, Bash).
- Familiarity with WFM disciplines (Erlang-based capacity planning, shrinkage modeling, schedule adherence) applied to NOC staffing.
- Audit-handling experience across SOC 2, PCI-DSS, HIPAA, ISO 27001, or GDPR with ownership of L1 evidence packs.
Behavioral Competencies
- Executive Presence: Communicates with clarity and confidence to C-suite, client executives, and engineering teams; tailors message to audience.
- Crisis Leadership: Decisive, calm, and structured during P1 events; commands MOBs, drives parallelization, and protects the customer outcome.
- Data-Driven Decision Making: Uses telemetry, problem trends, and unit economics to prioritize investment; resists anecdotal management.
- Vendor Negotiation: Balances commercial leverage with relationship stewardship; drives value beyond contractual minimums.
- Engineering Team Development: Builds bench strength, mentors emerging leaders, and grows a culture of ownership, blamelessness, and continuous learning.
- Bias for Automation: Treats toil as a defect; champions engineering solutions over heroics.
Working Conditions
- Anchor role for a 24×7×365 Global NOC; flexibility to participate in rotating shifts during launches, audits, peak seasons, and major incidents.
- Primary on-call escalation for L1 NOC operations and joint accountability with Incident Management for Severity-1 event coordination.
- Global stakeholder interaction across multiple geographies; requirement to flex working hours for cross-region governance forums and shift hand-overs.
- Onsite or hybrid presence at NOC delivery locations; willingness to travel up to 10% for site visits, vendor reviews, and audit engagements.
Job Segment:
Network, QA Engineer, Linux, QA, Quality Assurance, Technology, Engineering, Quality