Monday 9 September 2013

Steps towards software operability

Wikipedia's generic definition of operability is: “Operability is the ability to keep an equipment, a system or a whole industrial installation in a safe and reliable functioning condition, according to pre-defined operational requirements”. The definition applies equally well to deployments of critical application software. The clause “pre-defined operational requirements” can be read either as vagueness or as room to tailor the definition to each project.

An operable system serves not only the end user's business needs but also the team that runs it. For example, keeping an eCommerce site running 24x7 requires considerable operability effort, from design through to post-deployment.

The projects I have worked on over the past couple of years demanded high operability, which made us rethink how we design and plan. The cost of not having operability practices in place is very high: it can land your company on the “Wall of shame”.

Tackling the operability issues of your project requires the following acts:

Sell it first: Operability is inevitable; how far you take it depends on the project. Try to get a deeper understanding of the operability requirements from your previous projects or from similar projects. Prepare well in this phase, because tough questions await you. The what, how and when should be detailed enough to evaluate the need, extent, benefit and cost involved. Operability costs money, so sell your project's operability needs to management.
If it is an existing project, you might stop here and continue executing the project the way you already do. Operability is nothing new; it has always existed, and you might already have things in place. But please read phase two, “Get the act together”, to make sure that you didn’t miss anything while planning.

For management, too, this is a critical phase for understanding and justifying the cost of operability, especially in agile projects.

Get the Act together: Now that we are past phase 1, you are ready to “Get the act together”. This does not happen as part of the business requirements; it needs extra effort in every phase of development. It is strongly recommended to have a separate team that enforces the principles of operability and builds or customizes tools and frameworks to suit your deployment. This allows focus, reuse and policing of operability. Having said that, separating functional requirements from operability can be fatal to the project.
The operability team should improve the process and framework through continuous feedback. The functional team adopts operability by building it into the business requirements.
Operability should be part of every phase of the Software Development Life Cycle (SDLC). Let us look at each phase from an operability perspective.
Design: Define clear milestone goals for operability. Operability goals can cover the following areas:
1. Performance
2. Scalability
3. Security
4. Recoverability
5. Backward compatibility
6. Deployment
7. Rollback
8. Monitoring
9. Reporting system health
10. Troubleshooting
11. Testing of points 1-10
All of the above should be present in one form or another in all your critical projects. If not, remember the “Wall of shame”.

Development and Testing: In practice this is the longest phase in most projects. As far as operability is concerned, it should be the shortest one. If operability tasks are taking longer, go back to the operability team. They need to fix it.

The main intention of this phase is to follow the instructions and guidelines and make sure that the next phases don’t require human intervention. You should never miss a chance to automate.
Make sure all applications log in a standardized format, as appropriate, to enable monitoring and troubleshooting in the next phase.
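
A standardized log format can be as simple as one JSON object per line with the same fields in every application. The sketch below uses Python's standard logging module; the field names and the JsonFormatter and get_logger helpers are my own assumptions for illustration, not a prescribed format.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object per line."""

    def __init__(self, app_name: str) -> None:
        super().__init__()
        self.app_name = app_name  # hypothetical field identifying the application

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "app": self.app_name,
            "module": record.name,       # lets troubleshooting pinpoint the failing module
            "level": record.levelname,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def get_logger(app_name: str, module: str) -> logging.Logger:
    """Return a logger whose output follows the shared format."""
    logger = logging.getLogger(f"{app_name}.{module}")
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(app_name))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


if __name__ == "__main__":
    log = get_logger("shop", "checkout")
    log.error("payment gateway timed out")
```

Because every application emits the same fields, the monitoring tools in the next phase can parse all logs with one rule set.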

Testing for scalability, performance, backward compatibility and rollback has to be executed as planned in the previous phases. This is your last chance to avoid catastrophic failures. Collect as much information as possible and analyze the variations from the baseline during performance/stress tests.
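
To make the baseline analysis concrete, here is a minimal sketch that compares the mean of each metric in a test run against a stored baseline. The metric names and the 10% tolerance are assumptions picked for illustration; a real project would choose its own metrics and limits.

```python
from statistics import mean


def compare_to_baseline(baseline, current, tolerance=0.10):
    """Return metrics whose mean deviates from the baseline mean by more than tolerance.

    baseline and current map a metric name to a list of samples.
    The result maps each deviating metric to its relative change.
    """
    deviations = {}
    for metric, base_samples in baseline.items():
        if metric not in current:
            continue  # nothing to compare against
        base_avg = mean(base_samples)
        if base_avg == 0:
            continue  # relative change is undefined for a zero baseline
        change = (mean(current[metric]) - base_avg) / base_avg
        if abs(change) > tolerance:
            deviations[metric] = round(change, 3)
    return deviations


if __name__ == "__main__":
    baseline = {"latency_ms": [100, 100], "cpu_percent": [50, 50]}
    stress_run = {"latency_ms": [130, 130], "cpu_percent": [52, 52]}
    print(compare_to_baseline(baseline, stress_run))
```

In this example only latency is reported, since it moved by 30% while CPU stayed within the tolerance.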

Deployment: Deployment should be automated, with a rollback plan in place. For high availability you might also consider a Business Continuity Plan (BCP). Auditing the deployment and testing in production will also help to improve overall operability.

For complex deployments, an auto-provisioning system can be used. It is generally a good idea to do a dry run before the actual deployment to make sure that all environment-specific attributes are taken care of.
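
The idea of an automated deployment with a rollback plan and a dry run can be sketched as a list of steps, each paired with its undo action. This is an illustrative toy, not a real deployment tool: the deploy function and the step names are hypothetical, and a real dry run would also validate the environment-specific settings rather than only list the steps.

```python
from typing import Callable, List, Tuple

# A step is (name, apply, undo). All three are supplied by the caller.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def deploy(steps: List[Step], dry_run: bool = False) -> List[str]:
    """Run steps in order; on failure, undo the completed ones in reverse order.

    The returned list doubles as a simple audit trail of what happened.
    """
    audit: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, apply, _undo = step
        if dry_run:
            audit.append(f"would run: {name}")
            continue
        try:
            apply()
            done.append(step)
            audit.append(f"ok: {name}")
        except Exception as exc:
            audit.append(f"failed: {name} ({exc})")
            for done_name, _apply, undo in reversed(done):
                undo()
                audit.append(f"rolled back: {done_name}")
            break
    return audit


if __name__ == "__main__":
    def broken_migration() -> None:
        raise RuntimeError("schema mismatch")

    steps = [
        ("copy artifacts", lambda: None, lambda: None),
        ("migrate database", broken_migration, lambda: None),
    ]
    for line in deploy(steps):
        print(line)
```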

Watch Out!
I would say this is the most critical stage in operability. In this phase the application is monitored and feedback is given to the system.
The tools required for this phase can be built in-house, or off-the-shelf products can be used. The logs collected should be stored for analysis and reporting.

Monitoring: This phase covers monitoring of trending errors and of the performance, application and system parameters critical to the project. Servers, load balancers, storage, network and switches are possible candidates for monitoring. Use dashboards and alerts to monitor the application.
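
At its core, alerting is a comparison of sampled parameters against agreed limits. The sketch below shows that idea only; the Threshold type, the metric names and the limits are assumptions for illustration, and an off-the-shelf monitoring product would normally do this job.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Threshold:
    metric: str     # name of the monitored parameter
    limit: float    # alert when the sampled value exceeds this
    severity: str   # e.g. "warning" or "critical"


def evaluate(samples: Dict[str, float], thresholds: List[Threshold]) -> List[str]:
    """Return one alert message per threshold that the current samples exceed."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        if value is not None and value > t.limit:
            alerts.append(f"{t.severity}: {t.metric}={value} exceeds {t.limit}")
    return alerts


if __name__ == "__main__":
    thresholds = [
        Threshold("error_rate", 0.05, "critical"),
        Threshold("cpu_percent", 90.0, "warning"),
    ]
    samples = {"error_rate": 0.07, "cpu_percent": 40.0}
    for alert in evaluate(samples, thresholds):
        print(alert)
```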

Troubleshooting: How well you did in the “Get the act together” phase is usually defined by how fast you are able to troubleshoot production. Design the system so that the exact module that failed is identified immediately.

Classify the issues identified in this phase and formulate strict guidelines for acting on them. Feedback should also be given to phase 1 and phase 2.
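
One way to keep the classification and the guidelines strict is to write them down as code or configuration, so that the same issue always gets the same response. The classes, criteria and actions below are purely hypothetical examples; each project has to define its own.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


# Hypothetical guideline table: one agreed action per issue class.
ACTIONS = {
    Severity.CRITICAL: "alert the on-call engineer and consider rollback",
    Severity.MAJOR: "fix in the next release and report to the operability team",
    Severity.MINOR: "log for periodic review",
}


def classify(user_facing: bool, data_loss: bool) -> Severity:
    """Classify an issue using two illustrative criteria."""
    if data_loss:
        return Severity.CRITICAL
    if user_facing:
        return Severity.MAJOR
    return Severity.MINOR


def action_for(user_facing: bool, data_loss: bool) -> str:
    """Look up the agreed action for an issue."""
    return ACTIONS[classify(user_facing, data_loss)]


if __name__ == "__main__":
    print(action_for(user_facing=True, data_loss=False))
```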

Cloud Computing
Does the above apply to cloud computing? Absolutely.
At least in SaaS, I would expect the provider to have a great operability infrastructure.

Take away
Consider operability requirements a critical deliverable. The key takeaways while designing and architecting the system are:
  • Evaluate: The extent of operability
  • Specialize: Form a specialized team for operability
  • Automate: Remove manual steps
  • Feedback: Give actionable feedback to the system
