PeachNet Campus Site Down Escalation Procedure
This document defines the escalation procedure for an unscheduled outage affecting the main campus of a USG institution or a similarly important location. Begin the procedure immediately upon becoming aware of the outage. It applies only to a complete outage or to a malfunction that prevents the institution from effectively using its PeachNet connection. Outages scheduled by the institution, non-service-affecting events, and events at smaller branch sites are exempt from this procedure.
Upon becoming aware of an outage at a USG institution's main campus or a similar location:
- Ensure that the customer has been contacted to let them know that we are aware of the problem and are looking into it, and to learn anything they might know about the cause of the outage.
- Begin preliminary investigation into the cause of the outage, e.g. equipment failure, cable cut, power outage, etc.
- Ensure that the Director of Networking is aware of the outage.
- Issue a NetStatus email indicating that the outage is being looked into. The subject line should include the date, the site name and DOWN.
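The subject-line rule used throughout this procedure (date, site name, then DOWN or UP) can be sketched as a small helper. This is only an illustration: the function name `netstatus_subject`, the ISO date format, the field order, and the sample site name are assumptions, since the procedure does not prescribe an exact layout.

```python
from datetime import date


def netstatus_subject(day: date, site_name: str, status: str) -> str:
    """Build a subject line containing the date, the site name and the status word.

    The layout "YYYY-MM-DD <site> <STATUS>" is an assumption for illustration;
    the procedure only requires that all three elements be present.
    """
    status = status.upper()
    if status not in ("DOWN", "UP"):
        raise ValueError("status must be DOWN or UP")
    return f"{day.isoformat()} {site_name} {status}"


# Hypothetical site name, used only as sample input.
print(netstatus_subject(date(2024, 3, 5), "Example State University", "DOWN"))
```

Keeping the status word restricted to DOWN and UP makes it harder to send an update with an inconsistent subject line.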
Upon determining the cause of the outage, begin the appropriate mitigation processes, e.g.:
- In the event of a circuit failure, contact the carrier’s NOC and open a ticket. If this is a GTA circuit, also immediately open a ticket with the ATT Escalation Manager.
- In the event of a fiber cut, contact the appropriate fiber provider’s NOC and open a ticket with them.
- In the event of equipment failure, determine the equipment needing to be replaced and dispatch on-site with replacement equipment.
- If there is any question as to the cause of the outage, dispatch on-site to be certain.
- Ensure that the site contact has been informed as to the cause of the outage. Ensure that the site contact knows if access to the location will be required in order to effect repairs.
- Issue a NetStatus email indicating what we’ve learned about the cause of the problem and any information regarding an estimated time to repair (ETR). Any information learned should be provided to the customer contact as it is obtained. The subject line should include the date, the site name and DOWN.
If the site remains down after 1 hour during business hours, or after 2 hours outside of business hours (business hours are 0800-1700, Monday through Friday, excluding holidays):
- Ensure that we have been in constant contact with the appropriate carrier, fiber provider or vendor during this interval and have pressed for an update and an ETR at least once during this interval.
- Ensure that the site contact has been kept informed of ongoing efforts, even if that information is that there is no new information. It is important that we maintain contact with the customer at these regular intervals throughout the duration of the outage. Should the customer request to be contacted only when new information arrives, comply with this request. (This will typically only be the case during after-hours incidents.)
- At the end of this initial period, the Director of Networking will ensure that the Executive Director, Enterprise Services/CTO has been informed of the outage, and will escalate the issue to higher level contacts within the carrier or fiber provider. The CTO will, at his discretion, notify the USG CIO.
- Issue a NetStatus email indicating anything learned or simply indicating on-going efforts to get an ETR or a solution. The subject line should include the date, the site name and DOWN.
- This process should remain in place until such time as we have a resolution or an ETR.
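The escalation threshold above (1 hour during business hours, 2 hours otherwise) can be computed mechanically. The following is a minimal sketch under stated assumptions: the threshold is chosen from the time the outage began, 1700 itself counts as outside business hours, and holidays are supplied by the caller as a set of dates. The function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

BUSINESS_START = time(8, 0)   # 0800
BUSINESS_END = time(17, 0)    # 1700


def is_business_hours(moment: datetime, holidays: frozenset = frozenset()) -> bool:
    """Return True for non-holiday Monday through Friday, 0800-1700."""
    if moment.weekday() >= 5:          # 5 = Saturday, 6 = Sunday
        return False
    if moment.date() in holidays:
        return False
    # Assumption: the end boundary (1700 exactly) is treated as after hours.
    return BUSINESS_START <= moment.time() < BUSINESS_END


def escalation_deadline(outage_start: datetime, holidays: frozenset = frozenset()) -> datetime:
    """Return the time at which the outage should be escalated.

    Assumption: the 1-hour or 2-hour window is selected by when the outage began.
    """
    hours = 1 if is_business_hours(outage_start, holidays) else 2
    return outage_start + timedelta(hours=hours)


# Tuesday morning outage: escalate one hour later.
print(escalation_deadline(datetime(2024, 3, 5, 9, 30)))
# Saturday morning outage: escalate two hours later.
print(escalation_deadline(datetime(2024, 3, 9, 9, 30)))
```

An outage that starts shortly before 1700 is handled here with the 1-hour window; if the intent is different, only `escalation_deadline` needs to change.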
Upon receiving an ETR:
- Ensure that the site contact has been informed of the ETR.
- Ensure that the Director of Networking is aware of the ETR. The Director of Networking will ensure that the Executive Director, Enterprise Services/CTO is notified.
- Issue a NetStatus email indicating the ETR. The subject line should include the date, the site name and DOWN.
- Continue to monitor the event to ensure that the ETR is met.
- If the ETR is not met, go back to step 3 until such time as a new ETR is provided or service is restored.
Once service is restored:
- Ensure that the site contact concurs that service is, in fact, restored.
- Ensure that the carrier, fiber provider or vendor is confident that all repairs have been completed and no further outages are expected.
- Request that any tickets opened be held open for another 24 hours in case of recurring problems.
- Notify the Director of Networking. The Director of Networking will ensure that the Executive Director, Enterprise Services/CTO is notified.
- Issue a NetStatus email indicating that service has been restored. The subject line should include the date, the site name and UP.
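The overall flow of this procedure can be summarized as a small state machine. This is a sketch, not part of the procedure itself: the phase and event names are hypothetical, and it assumes that "step 3" refers to the third block of steps (the site remaining down past the initial period), to which a missed ETR returns.

```python
from enum import Enum


class Phase(Enum):
    AWARE = 1             # outage detected, customer contacted
    CAUSE_DETERMINED = 2  # mitigation under way
    STILL_DOWN = 3        # initial period elapsed, escalation in progress
    ETR_RECEIVED = 4      # monitoring to confirm the ETR is met
    RESTORED = 5          # service confirmed restored


def next_phase(phase: Phase, event: str) -> Phase:
    """Return the phase that follows `phase` when `event` occurs."""
    if phase is Phase.RESTORED:
        return phase
    if event == "service_restored":
        return Phase.RESTORED
    if event == "etr_received":
        return Phase.ETR_RECEIVED
    if event == "etr_missed" and phase is Phase.ETR_RECEIVED:
        return Phase.STILL_DOWN   # assumed meaning of "go back to step 3"
    if event == "threshold_elapsed" and phase in (Phase.AWARE, Phase.CAUSE_DETERMINED):
        return Phase.STILL_DOWN
    if event == "cause_determined" and phase is Phase.AWARE:
        return Phase.CAUSE_DETERMINED
    return phase                  # unrecognized or inapplicable event: no change


def netstatus_word(phase: Phase) -> str:
    """Status word for the NetStatus subject line in the given phase."""
    return "UP" if phase is Phase.RESTORED else "DOWN"
```

Every phase except the last uses DOWN in the NetStatus subject line, which matches the steps above.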