12 Elements of Effective Reliability Management
by Drew Troyer, CMRP, CRE, MBA
I’m often asked what effective plant reliability management looks like and how to recognize it when you see it. I’ve boiled it down to the following 12 elements. There is much more detail behind each one, but I’ve described the key elements here in abstract form. This article is short enough to keep the attention of even the highest-level managers in your company, so if you agree with the philosophies, pass it along to them. I’ve written the article in MBA language, not reliability engineering-speak, for that very purpose.
The 12 elements of effective reliability management are:
1. Strong leadership focus and business-aligned plant reliability mission, vision and strategic plan
Your leaders, both at the corporate and plant levels, must keenly understand the impact reliability has on the bottom-line performance of the organization, including the share price. The valuation of an equipment asset-dependent organization is significantly affected by the effectiveness with which that equipment is managed. Your leadership must understand that reliability management is not just doing maintenance better. Without knowledgeable and truly engaged senior leaders who are willing to make plant reliability management a matter of corporate policy, it’s not likely that you’ll gain traction and achieve lasting improvement.
2. Effective interfunctional and interplant communications
Unfortunately, when things go wrong, the typical modus operandi in most plants is to begin the process of assigning blame. Operations blames maintenance, maintenance blames operations and the design engineering group, everybody blames suppliers, and so on. Rarely does this process yield productive results. Poor communication between co-dependent functional groups almost guarantees poor reliability performance. Moreover, organizations that have multiple plants often fail to take advantage of the knowledge and economy of scale afforded to them, either because of a culture of internal competition or the lack of effective interplant communication systems through which to share knowledge and experience.
3. Focus on design for reliability, operability, maintainability, safety and inspectability (ROMSI)
Most organizations have attempted to improve reliability strictly from the maintenance department. It simply doesn’t work. Poor overall reliability is the result of poor basic “design for reliability,” given the required operating context; improper operation, which may be the result of poor “design for operability”; and ineffective maintenance, which may be the result of poor “design for maintainability and inspectability.” Some studies suggest that half of all failures are directly attributable to poor design. Designing reliable equipment and plants requires risk assessment, clear knowledge of the operating context, involvement from operations and maintenance domain experts, and a leadership focus on minimizing life-cycle cost.
4. Reliability-focused operations
Equipment that is started, stopped and/or operated incorrectly, or beyond its operating limits, will simply experience a higher failure rate. A reliability-focused operations team follows and enforces well-conceived standard operating procedures. They also understand that, in some instances, producing more can result in profit erosion. The mentality extends beyond plant operations to the sales and marketing department. An enlightened sales and marketing team understands that the profitability of sales contracts and the reputation of the firm depend upon the reliability of the machines or plant, especially when transactions carry penalties for late or non-delivery – in some cases, total-loss penalties. They factor projected reliability into their pro forma estimates of contract profitability. The reliability-focused operations organization works closely with the maintenance team, particularly to provide inspection and operating health feedback on a regular basis, and supplies design engineers, procurement specialists and strategic suppliers with the information they need to improve equipment operability.
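To make the pro forma point concrete, here is a minimal sketch of how projected reliability might be folded into a contract estimate. It assumes a deliberately simple two-outcome model (the order ships on time or it ships late and incurs a fixed penalty). The function name, the parameters and every figure are hypothetical illustrations, not numbers from any real contract.

```python
def expected_contract_profit(revenue, cost, p_on_time, late_penalty):
    """Probability-weighted profit for one contract.

    Simplifying assumption: only two outcomes exist, on-time delivery
    or late delivery with a fixed penalty. p_on_time is the projected
    probability that the plant is reliable enough to deliver on time.
    """
    if not 0.0 <= p_on_time <= 1.0:
        raise ValueError("p_on_time must be between 0 and 1")
    profit_on_time = revenue - cost
    profit_late = revenue - cost - late_penalty
    return p_on_time * profit_on_time + (1.0 - p_on_time) * profit_late


# Hypothetical contract: $1,000,000 revenue, $800,000 cost,
# $500,000 late-delivery penalty.
for p in (1.0, 0.9, 0.7):
    profit = expected_contract_profit(1_000_000, 800_000, p, 500_000)
    print(f"on-time probability {p:.0%}: expected profit ${profit:,.0f}")
```

Under these made-up numbers, the contract looks profitable on paper but loses much of its expected margin as the on-time probability falls, which is the kind of trade-off a reliability-aware sales team would want to see before signing.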
5. Reliability-focused maintenance
While maintenance can’t improve the reliability of equipment, it can ensure that the equipment’s inherent reliability, based upon design and operating context, is maximized. A reliability-focused maintenance organization doesn’t just employ modern techniques like Reliability-Centered Maintenance (RCM), condition-based maintenance (CBM) and precision maintenance. It works hard to optimize maintenance activities, with a focus on running time activities. It also works closely with operations to ensure that the equipment is available to produce as much product as required, meet quality goals and, most importantly, satisfy customer demands. And, a reliability-focused organization works closely with design engineers, procurement specialists and strategic suppliers to improve design for reliability and maintainability, and to avoid purchasing the same problems over and over again.
6. Effective talent management
The success or failure of your reliability management program will ultimately come down to the people who are involved. Effective plant reliability leaders recognize that talent management goes beyond just hiring people with the right skills for the job; performance is also a function of behaviors. Skills can be taught; behaviors can’t. In fact, it is quite difficult to significantly modify an individual’s behaviors for more than a temporary period. Effective plant reliability management requires that you identify the behavioral characteristics required to succeed in your organization and in the job, the skills and knowledge required for the job, a method for assessing both, and tools and techniques for managing and retaining your talent. In the tough talent market projected for the future, talent management may differentiate the winners from the losers.
7. Strategic customer and supplier relationships
Suppliers and customers alike are critical to the success of your reliability program. A major component of the Toyota Production System (commonly referred to as lean manufacturing) is negotiating production and delivery time with the customer, both internal and external, for the purpose of load-leveling. At times, delivery deadlines aren’t deadlines at all; they are just dates that are selected. Understanding which delivery deadlines are real and which ones are arbitrary can help you openly discuss matters with your customers. This helps you create a pull-based production system while avoiding the added stress on the equipment and organization to meet arbitrary deadlines. A similar strategic relationship must exist with your suppliers. A vendor is a machine that dispenses a soft drink or snack in exchange for money. You need strategic supplier partners, both for process and MRO materials. Strategic partners bring important knowledge and experience to the table, which enables you to plan more effectively; improve design, operations and maintenance; and more effectively assess problems and shortcomings.
8. Reliability data collection and analysis systems
Reliability management and improvement require data. Surprisingly few organizations collect, analyze and manage reliability data effectively. From a technical perspective, reliability starts with a failure modes and effects analysis (FMEA), the reliability blueprint of an operational service or machine. Often, the FMEA is completed by drawing on limited data and experience, and that is the end of the process. Instead, the FMEA worksheets should serve as a reliability growth management tool. Each time you learn something new, you modify the FMEA and its associated risk priority number (RPN), a 1 to 1,000 rating of the risks associated with a failure based upon severity, likelihood and detectability. This means the plant must be committed to collecting operational and maintenance-related data – both when things are going well and when things go wrong. Mathematical reliability engineering methods and associated tools (such as root cause analysis) enable you to implement information-based improvement initiatives. Performance monitoring and detailed failure data collection techniques are an absolute must.
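For readers who want to see the arithmetic behind the RPN, here is a minimal sketch. It assumes the common convention that severity, likelihood and detectability are each scored from 1 to 10 and multiplied together, which is what produces the 1 to 1,000 range. The class name, the failure modes and all of the scores are hypothetical worksheet entries made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FailureMode:
    """One FMEA worksheet row (hypothetical structure)."""
    name: str
    severity: int       # 1 = negligible effect, 10 = catastrophic
    likelihood: int     # 1 = very unlikely, 10 = almost certain
    detectability: int  # 1 = easily detected, 10 = hard to detect

    def __post_init__(self):
        # Reject scores outside the assumed 1-10 scale.
        for score in (self.severity, self.likelihood, self.detectability):
            if not 1 <= score <= 10:
                raise ValueError("each score must be between 1 and 10")

    def rpn(self) -> int:
        """Risk priority number: product of the three scores."""
        return self.severity * self.likelihood * self.detectability


# Hypothetical worksheet for a single pump.
worksheet = [
    FailureMode("bearing seizure", severity=8, likelihood=4, detectability=6),
    FailureMode("seal leak", severity=5, likelihood=6, detectability=3),
    FailureMode("coupling misalignment", severity=6, likelihood=3, detectability=7),
]

# Rank failure modes so the highest risk gets attention first.
ranked = sorted(worksheet, key=lambda fm: fm.rpn(), reverse=True)
for fm in ranked:
    print(f"{fm.name}: RPN {fm.rpn()}")

# Learning something new changes the worksheet. Suppose condition
# monitoring makes bearing seizure much easier to detect.
updated = replace(worksheet[0], detectability=2)
print(f"{updated.name} after monitoring: RPN {updated.rpn()}")
```

The update at the end is the point of treating the FMEA as a living document: when new data changes a score, the RPN and the resulting priorities change with it.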
9. Procedure, document and knowledge management support systems
In applications where failure risk is potentially deadly, such as the commercial aviation industry, managers long ago stopped relying on the “skill of the operator” or the “skill of the maintenance craft.” Standard operating and maintenance procedures combined with checklists define expectations. Procedures and checklists are necessary to ensure consistency of practice among different people and over time. And, when staff changes do occur, procedures are required to assure continuity. Too many plants have too much of their intellectual property residing in the heads of staff members who could resign, retire or fall ill at any time. Procedures also clearly define skill requirements for a particular job or activity. The modern, reliability-focused organization employs clearly defined operating, design and maintenance procedures, enforces them, and incorporates easy-to-use systems for executing work and managing change.
10. Targeted leading and lagging metrics
Leading metrics reveal performance on causal factors that, when effectively managed, yield desirable performance on lagging indicators – the effect. For example, it’s common for Japanese plants to track the number of small group meetings they have related to Total Productive Maintenance (TPM). The premise is that more small group meetings result in better communication between functional groups (the cause), which results in fewer mistakes, better plant performance, and improved efficiency and effectiveness in responding to problems and opportunities (the effect). To be effective, your metrics, both leading and lagging, must accurately reflect reliability goals that align with the organization’s mission. The management team must understand that cause and effect probably won’t be time-synchronous, hence the term “lagging.” Also, left unchecked, metrics can take control of your organization. Don’t let the organization focus so much on the metric that it loses sight of the mission. To paraphrase W. Edwards Deming, don’t allow metrics to replace leadership, judgment and common sense.
11. Vision-centric team reward system
What gets rewarded gets done. Despite this obvious fact, we’ve got a long history of rewarding failure in industrial manufacturing plants, both extrinsically and intrinsically. For example, when a machine fails over the weekend without warning and technicians are called in to address the event, they are extrinsically rewarded with overtime pay. For many, failure-induced overtime pay is so common that the technicians have adjusted their lifestyle to reflect it. Moreover, when the plant manager or maintenance manager returns to the plant, he or she rightly seeks out the technicians responsible for restoring operations and intrinsically rewards them with thanks and praise for their efforts. In both cases, the rewards are appropriate; but in both cases, they create an incentive for unreliability. Reliability-focused organizations reward reliability, not failure. The reward structure must be modified to create an incentive for the desired behavior.
12. Reliability culture management
Arguably, the most difficult aspect of plant reliability management is creating a reliability-focused culture. People and organizations like to hang on to past practices, resisting change. “We’ve always done it this way” is commonly heard around the plant. The desire to anchor to what has always been done is a phenomenon referred to as “psychological inertia.” Reliability-focused organizations continuously question current practices and look for ways to improve. This pattern of behavior takes time to establish and much work to perpetuate. It’s imperative to create a plan for achieving behavioral change, starting with lead users and early adopters and gradually making your way to those people who are slower to adopt or who actively oppose the change. There is a point at which a sufficient percentage of the organization will come around. That is when you’ll achieve critical mass. But getting to that point requires leadership, patience and tenacity. Once you’re there, you have to keep the pressure on until the new practice becomes a “new business as usual” to replace the old one, or the organization will gradually slip back into its old comfort zone.
Be sure your organization understands these important elements and the impact they have on your organization’s performance – starting at the very top. Without that leadership focus, nothing else matters. Again, pass the article along and get more of your organization on the reliability management bandwagon.