After years of helping companies recover from avoidable system crashes, we have found that adding more technology is often not the first step toward better reliability. A simple Change Management process can provide immediate reliability improvements at little or no cost.
724 Engineering has developed a 10-step Change Management (CM) process that we use when making any change to a production system. The same process has been applied successfully to hardware, software, network and many other changes.
Here are the steps:
1) 48 hours in advance of change, send plain text mail to changelog mailing list using CM Template
NOTES:
(a) If change is to be made on an EMERGENCY basis, executive approval is required.
(b) If change has not been tested or cannot be tested, executive approval is required.
2) Objections must be filed within 24 hours
If no objections are filed:
3) If downtime is needed or possible, Marketing posts PM notification on www or support site
4) “Change Window Starting” mail sent to changelog
5) Change is made and tested
If successful:
6) “Change Made Successful” mail sent to changelog
7) Documentation updated
<end of process>
If unsuccessful:
8) Rollback per plan
9) “Change NOT MADE” mail sent to changelog
10) Post-mortem meeting scheduled with rollback engineer
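As a rough illustration, the flow above can be sketched in Python. This is only a sketch under stated assumptions, not tooling we ship: the names ChangeRequest, file_objection, may_proceed and run_change are hypothetical, and so are the callbacks apply_change, rollback and send_mail. The process does not spell out what happens when an objection is filed or whether an approved emergency change may skip the 48-hour notice, so the sketch assumes that any objection blocks the change and that an approved emergency may proceed immediately.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, List

NOTICE_PERIOD = timedelta(hours=48)     # step 1: advance notice
OBJECTION_WINDOW = timedelta(hours=24)  # step 2: time allowed for objections


@dataclass
class ChangeRequest:
    """Hypothetical record of one CM notice."""
    summary: str
    announced_at: datetime
    emergency: bool = False
    tested: bool = True
    executive_approved: bool = False
    downtime_possible: bool = False
    objections: List[str] = field(default_factory=list)


def file_objection(cr: ChangeRequest, text: str, now: datetime) -> bool:
    """Step 2: accept an objection only inside the 24-hour window."""
    if now - cr.announced_at > OBJECTION_WINDOW:
        return False
    cr.objections.append(text)
    return True


def may_proceed(cr: ChangeRequest, now: datetime) -> bool:
    """Check notes (a) and (b), objections, and the notice period."""
    if (cr.emergency or not cr.tested) and not cr.executive_approved:
        return False
    if cr.objections:
        # Assumption: any filed objection blocks the change.
        return False
    if cr.emergency:
        # Assumption: an approved emergency change may skip the notice period.
        return True
    return now - cr.announced_at >= NOTICE_PERIOD


def run_change(
    cr: ChangeRequest,
    now: datetime,
    apply_change: Callable[[], bool],
    rollback: Callable[[], None],
    send_mail: Callable[[str], None],
) -> List[int]:
    """Walk steps 3-10 and return the step numbers that were performed."""
    if not may_proceed(cr, now):
        raise PermissionError("change is not cleared to proceed")
    steps: List[int] = []
    if cr.downtime_possible:
        steps.append(3)  # notification posted on the www or support site
    send_mail("Change Window Starting")
    steps.append(4)
    succeeded = apply_change()  # returns True if the change tested OK
    steps.append(5)
    if succeeded:
        send_mail("Change Made Successful")
        steps.extend([6, 7])  # step 7: documentation updated
    else:
        rollback()
        send_mail("Change NOT MADE")
        steps.extend([8, 9, 10])  # step 10: post-mortem scheduled
    return steps
```

Passing the change, rollback and mail actions in as callbacks keeps the sketch independent of any particular mail system or deployment tool. The returned step numbers make it easy to see which branch of the process a given change followed.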
Following this simple process ensures that everyone who might be impacted knows a change is coming and can be prepared. In addition, preparing the CM notice and rollback plan ensures that the person making the change has thought it through fully.
We have used the 10-step process above for thousands of changes. The result is invariably a measurable improvement in system reliability.
Next in this series: Preparing a CM notice and rollback plan.