Systems Reliability Engineer - SME
Herndon, VA 
Posted 15 days ago
Job Description

HTS (iNovex) was built on the principle that people matter first and foremost. We believe in providing a strong work/life balance by investing in our employees and encouraging professional and personal growth. We do this by offering exceptional benefits, flexible schedules, and the tools necessary to achieve success through paid training, mentoring, and the opportunity to work alongside top-notch technical professionals.

Your effort and expertise are crucial to the execution of this impactful mission, which relies on Systems Engineering, Network Engineering, Systems Integration, and Software Engineering & Development to improve, protect, and defend our Nation's security.

We are looking for an experienced Systems Engineer/Site Reliability Engineer (SRE) to join our technology-based program supporting a key Government customer. The Systems Engineer/SRE provides subject matter expertise and guidance to IT developers during the software development life cycle, oversees the development, testing, and implementation of technical solutions, and determines whether those solutions meet defined requirements. The SRE may also provide Agile DevOps support to mission-critical systems. The Systems Engineer/SRE may have the opportunity to build strong systems, software, and cloud environments and to provide operations and maintenance for critical systems. The candidate will provide technical expertise and support in the design, development, implementation, and testing of customer tools and applications. Working within a DevOps framework, the candidate will participate in and/or direct major project deliverables through all aspects of the software development life cycle, including scope and work estimation, architecture and design, coding, and unit testing.

Responsibilities
The Systems Engineer will support the team in the following activities (including but not limited to):

  • Ensuring reliability, getting systems back to steady-state as quickly as possible
  • Eliminating toil, automating wherever possible
  • Driving better cross-team collaboration
  • Gaining full visibility into IT systems and services for system health
  • Identifying system deficiencies and recommending solutions
  • Developing Service Level Indicators (SLI) for IT systems and services
  • Developing Service Level Objectives (SLO) for IT systems and services
  • Developing Service Level Agreements (SLA) for IT systems and services
  • Maintaining and continuously improving processes, standards, policies, working methods, and tools
  • Ensuring appropriate tools and processes are in place to keep the development/production environment reliable and reproducible
  • Ensuring tool configuration consistency across Development, Testing, Integration, and Production environments
  • Participating in ongoing production support and end-user support
  • Researching, understanding, and developing with new technologies and standards as needed
  • Evaluating the interface between hardware and software, operational requirements, and characteristics of the overall system

Required Education, Experience, & Skills:

  • A minimum of sixteen (16) years of relevant experience with a Bachelor's or Master's degree
  • Knowledgeable in Incident Management, organizing Incident Response Teams, communicating with stakeholders and devising a strategy for resolving incidents
  • Good understanding of how the incident response role is structured and of incident response concepts, in order to automate the complex processes required for rapid, effective incident resolution
  • Knowledgeable about SLOs, to help Operations define, provide, and improve SLOs that set specific reliability targets for IT systems
  • CI/CD implementation expertise
  • Scripting to automate tasks and extract information on both the front end and back end, using languages such as MS PowerShell, Python, JavaScript, Ruby, and PHP
  • Ability to efficiently and appropriately estimate work effort requirements
  • Ability to communicate effectively through written and verbal methods
  • Ability to handle multiple tasks and meet deadlines
  • Ability to work independently and in a team environment
  • Able to adapt to a constantly changing environment
  • Aptitude and willingness to work with a variety of newer, emerging technologies and tools as opportunities demand
  • Ability to deliver enhanced functionality, aid with new implementations, and provide continuous support within the scheduled time while preserving system integrity
  • High degree of initiative, creativity, and technical ability to function on this team
  • Ability to identify issues and implement corrective actions

Preferred Education, Experience, & Skills

  • IT project management experience and familiarity with Scrum, Lean, Agile, and DevOps
  • Experience with Java, Ruby, DevOps and DevSecOps
  • Knowledgeable about IT Operations Management (ITOM) software, meaning the tools needed to manage the provisioning, capacity, performance, and availability of computing, networking, and application resources, as well as the overall quality, efficiency, and experience of their delivery
  • Understanding of Quality Assurance and Test Automation for software pre-deployment
  • Good understanding of DevOps concepts and best practices
  • Issue troubleshooting experience
  • Understanding of Networking concepts
  • Linux/Unix Concepts
  • ServiceNow knowledge, including developing products using JavaScript and other coding tools
  • Database Administration (Oracle or MySQL) experience

We're an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.



Job Summary
Start Date: As soon as possible
Employment Term and Type: Regular, Full Time
Required Education: Bachelor's Degree
Required Experience: 16 years