Location
Bridgeport, CT, United States
Posted on
Jul 01, 2021
Humana
Manager of Cloud Infrastructure Operations in Bridgeport, Connecticut
Description
The Manager of Cloud Infrastructure Operations will lead the Hosting Operations of our Azure, AWS and GCP Cloud offerings. The Manager is an expert in 24/7 operations of high-performing, scalable systems that meet a high degree of uptime, and in all facets of Cloud hosting operations, with the ability to communicate effectively with customers and internal stakeholders. The Manager is also a leader in continuous integration and continuous delivery with automated deployment in an Agile SDLC. The role includes generating recommendations on how Humana should optimize its usage of Cloud services, assessing compliance controls through analysis, and developing automated reporting that enables teams to apply best practices for running efficient Cloud solutions.
If you're passionate about innovation and love working in an environment where you can constantly improve and adopt new technologies to drive business results, then Humana's Cloud Infrastructure Operations team could be the place for you!
Responsibilities
Manage operations plans, staffing, budget and execution.
Lead the escalated Incident Management team and develop maturity plans for continual improvement, reducing MTTD/MTTR.
Improve incident and problem management functions while working to build a world-class incident response function for our customers.
Build out and/or automate required L2 SOPs for the L2 MSP Team.
Participate in the Cloud Well-Architected Framework to support overall Operations initiatives.
Establish and refine automated monitoring tools to track systems' health, uptime and outages.
Ensure compliance with best security practices and continuously assess potential vulnerabilities.
Optimize operations costs across vendors and service providers.
Partner with monitoring team to build maturity around event management.
Partner with Engineering, L3 teams and DevOps on CI/CD and automated deployment.
Collaborate with vendor partners, maintain strategic relationships and identify continuous improvement opportunities.
Identify key procedures that can be automated, and either automate them or work with the platform engineering team to develop the automation.
Establish, report, and improve various metrics associated with the efficiency of operating the Humana foundation environment suite delivering value to our customers.
Adhere to established customer SLAs.
Make data-driven decisions by analyzing operational data and metrics to identify trends and potential problems.
What you bring and what you will do:
People and Leadership
Strong leadership and people management skills.
Ability to make solid business decisions in a dynamic and fast-paced environment.
Ability to work with minimal supervision, making decisions based upon priorities, schedules and an understanding of business initiatives.
Manage and optimize Cloud infrastructure and services.
Solution-oriented leadership and a management-based approach.
The ability to communicate effectively to executives, engineers and customers.
The ability to build effective relationships with internal business stakeholders and external partners.
A creative leadership style that inspires and influences others.
Availability for off-hours work related to 24/7 uptime and availability of the Cloud product suite; willingness to support the team that has on-call coverage expectations.
Provide guidance, objectives, metrics, and oversight to help teams maintain 24/7 uptime and availability of mission-critical, customer-facing production services.
Oversee and refine processes, practices, and tooling that teams will use to meet their service level objectives.
Technical Acumen
Deep understanding of the key concepts and practices of Cloud observability, coupled with experience implementing robust systems that leverage metrics, logs, and traces to provide a holistic view of the state of Cloud operations.
Deep understanding of how to apply best practices around monitoring, alerting, and logging, with implementation experience in one or more monitoring, alerting, and logging systems (Azure Monitor, CloudWatch, AppInsight, Log Analytics, Splunk, Dynatrace, BigPanda, ThousandEyes, SolarWinds, etc.).
Knowledge of corporate IT, data centers, ticketing system implementations, monitoring software implementation, troubleshooting, and continuous improvement approaches.
Serverless computing experience with containers (AKS/EKS) and VM-based workloads, along with a solid understanding of the trade-offs of the different serverless implementations emerging in public Cloud.
Experience with and enthusiasm for operating in an Agile, DevOps-oriented organization and culture.
A technical business acumen that ensures the organization is operating efficiently and effectively in a hybrid environment.
Knowledge of infrastructure monitoring and application performance monitoring systems, including SLAs/KPIs and reporting approaches for multi-Cloud platforms.
Skill and knowledge in ITIL processes related to Incident Management, Service Requests, Event Management, Access Management, Change Management, Knowledge Management and Escalated Incident Management.
Partner with the Engineering and Architecture teams to define an observability strategy, drawing on experience implementing robust systems that leverage metrics, logs, and traces to provide understanding of system state. Advocate for that strategy with engineers, managers, and executives.
Required Qualifications
Bachelor's Degree in Computer Science, Information Technology, or equivalent experience.
2 years of experience managing 24/7 production operations for a high-volume, business-critical Cloud service.
2 years of experience with Azure and/or AWS.
2 years of experience working with a Managed Service Provider and managing IT vendor relationships.
2 years of transformational experience running Cloud at scale.
Must be passionate about contributing to an organization focused on continuously improving consumer experiences.
2 years of management experience.
Desired Qualifications
Azure Cloud certification.
Advanced understanding of Cloud platforms, consoles, and services (Azure, Google and AWS).
Knowledge or experience with Ansible Tower, API queries, and Power BI.
Scripting knowledge using Python, Perl, PowerShell, JavaScript, or similar scripting languages.
Scheduled Weekly Hours
40