Right Understanding, Right Analysis and Right Reporting – Part 5 of the 10 Components of a Successful Vibration Program – by Alan Friedman
Uptime magazine August / September 2016 – PDF version
Right understanding is about knowing the equipment and understanding how it fails. If you do not understand how the equipment fails then you cannot come up with an appropriate strategy to maintain it. When it comes to maintenance strategies, you generally have four options:
- Redesign the asset to remove the failure mode.
- Condition based maintenance (CBM) – If the machine gives you an indication that it is entering a failure mode, you can monitor that indicator.
- Preventive or time based maintenance (PM) – If the component fails in a known amount of time, you can replace it before that time.
- Run to failure maintenance – If the consequences of failure are very low, you can simply let the component fail.
Of these four options, the first is by far the best, but it is not always practical. It is always better to remove the root cause of a problem than to continuously fight the symptoms. Just think of the airline industry: every time a plane crashes, investigators determine the root cause and take steps to ensure it never happens again. As creatures of habit, however, we are more inclined to keep tripping over the same crease in the carpet than we are to bend down and straighten it out.
For condition based maintenance, one needs to consider the different failure modes and how they present themselves. Condition monitoring is based on the idea that machines tell you when they begin to fail. They can tell you this in a variety of ways such as by vibrating differently, making different sounds, changing temperature, changing how electricity flows through them, changing pressure etc. These are called “indicators” of a change in condition. It is necessary to understand the variety of indicators that a machine presents for different failure modes in addition to understanding the failure modes themselves. The monitoring technology you choose (Right tools) and the tests you will perform (Right data collection) will be based on the indicator(s) you wish to measure.
You need to know how quickly the failure modes progress in order to know how frequently to take measurements. A turbine with a large journal bearing can go from perfect operation to catastrophic failure in a matter of seconds; therefore a continuous monitoring protection system is required. A centrifugal pump operating in a clean environment will give the first signs of bearing wear up to a year or more before the bearing actually fails, so monthly or quarterly tests are adequate.
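The link between how fast a fault progresses and how often to test can be sketched as simple arithmetic. The function below is only an illustration: the name `max_test_interval_days` and the idea of requiring a minimum number of readings inside the warning period (`min_readings`) are assumptions made for this example, not a rule from the article.

```python
def max_test_interval_days(warning_days: float, min_readings: int = 4) -> float:
    """Longest test interval that still yields `min_readings` tests between
    the first detectable sign of a fault and the failure itself.

    `min_readings` is an assumed planning choice, not a standard value.
    """
    if warning_days <= 0:
        raise ValueError("warning_days must be positive")
    if min_readings < 1:
        raise ValueError("min_readings must be at least 1")
    return warning_days / min_readings


# Pump bearing that gives about a year of warning (figure from the text).
print(max_test_interval_days(365, min_readings=4))   # 91.25, roughly quarterly
print(max_test_interval_days(365, min_readings=12))  # about 30.4, roughly monthly

# Turbine journal bearing: only seconds of warning (10 s is an illustrative value),
# so the required interval is far shorter than any walk-around route can achieve.
ten_seconds_in_days = 10 / 86_400
print(max_test_interval_days(ten_seconds_in_days) < 1)  # True
```

When the computed interval is shorter than anything a person can realistically collect by hand, that is the signal that continuous monitoring is the only workable option.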
Different indicators will appear at different times. A bearing will emit high frequency vibration at its earlier stages of failure and lower frequency vibration later; much closer to failure it might make audible sounds or get hot. This also needs to be considered when choosing a monitoring technology.
Different machine fault conditions generate different patterns and frequencies of vibration and can appear at different test points and in different axes. Before taking a vibration test it is therefore important to understand the machine, its internal components and the faults it is likely to experience. This helps you be sure that you are testing the machine in the correct way. In order to do this you will need to know the shaft rotation rates, the numbers of gear teeth, pump vanes, fan blades etc. Because this information might not be readily available you will need to document the information you have and remember to track down the information you need.
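As a rough sketch of why those counts matter: many of the frequencies you look for in a spectrum are simply the shaft rate multiplied by the number of elements on that shaft (vane pass, blade pass, gear mesh). The helper below computes the ones you can and lists the ones you still need to track down. The function name, the dictionary layout and the example machine are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def forcing_frequencies(
    shaft_rpm: float, element_counts: Dict[str, Optional[int]]
) -> Tuple[Dict[str, float], List[str]]:
    """Return (known frequencies in Hz, names of counts still to be found).

    `element_counts` maps a component name to the number of elements on the
    shaft (vanes, blades, teeth), or None if the count is not yet known.
    """
    shaft_hz = shaft_rpm / 60.0  # revolutions per minute -> cycles per second
    known = {"shaft rate": shaft_hz}
    to_track_down = []
    for name, count in element_counts.items():
        if count is None:
            to_track_down.append(name)
        else:
            known[f"{name} x shaft rate"] = count * shaft_hz
    return known, to_track_down


# Hypothetical direct-coupled pump: vane count documented, fan blade count unknown.
known, todo = forcing_frequencies(1800, {"pump vanes": 6, "motor fan blades": None})
print(known)  # {'shaft rate': 30.0, 'pump vanes x shaft rate': 180.0}
print(todo)   # ['motor fan blades']
```

This assumes every listed component turns at the same shaft speed. A machine with a gearbox or belt drive needs one call per shaft, each with its own rotation rate.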
“Right analysis” boils down to creating baselines and looking for changes in these indicators over time. Most people seem to do all of this backwards. They start with a tool or monitoring technology, then they look for things to test, then they look at the data like so many tea leaves, trying to figure out what it means. A better way is to begin with the asset and its failure modes, determine the indicators that the machine produces when it begins to fail, select the appropriate technology and test configurations to monitor for those indicators, and then analyze the data to look for changes. If you have good software and take the time to set alarm limits on these specific indicators, then your software can do the majority of the analysis work for you.
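The baseline-and-alarm idea can be illustrated in a few lines. This is a minimal sketch, assuming a statistical limit of the baseline mean plus three standard deviations. That rule, the function names and the sample readings are all assumptions chosen for the example; your software may set limits quite differently.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def alarm_limit(baseline: Sequence[float], n_sigma: float = 3.0) -> float:
    """Alarm limit for one indicator: baseline mean plus n_sigma standard deviations.

    The n_sigma rule is an assumed convention, not the article's method.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    return mean(baseline) + n_sigma * stdev(baseline)


def readings_in_alarm(
    baseline: Sequence[float], new_readings: Sequence[float], n_sigma: float = 3.0
) -> List[Tuple[int, float]]:
    """Return (index, value) for each new reading above the alarm limit."""
    limit = alarm_limit(baseline, n_sigma)
    return [(i, value) for i, value in enumerate(new_readings) if value > limit]


# Made-up amplitudes of one indicator, e.g. a bearing tone on one test point.
baseline = [1.0, 1.2, 0.8, 1.0]
print(round(alarm_limit(baseline), 2))                # 1.49
print(readings_in_alarm(baseline, [1.1, 1.4, 1.6]))   # [(2, 1.6)]
```

The point of the sketch is the order of operations: the indicator is chosen first, the baseline is built for that indicator, and only then is new data compared against it.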
Alarms are different from reports. For a report to be useful, it should contain what is referred to as “actionable information.” In other words, the person who receives the report should understand what the problem is and what to do about it. Just saying a machine is in “alarm” does not provide this information. It does not describe what the problem is or how it should be resolved. A typical format for a report might include a diagnosis such as “moderate motor bearing wear” and a recommendation: “monitor for changes.”
Because vibration and other CM technologies aim to diagnose problems very far in advance it is not always necessary to act on the diagnosis right away. Reports should therefore contain priority or severity levels. Definitions of the severity levels should be agreed upon by all parties so the people receiving the reports know what action to take. Here is a typical severity scheme:
Level 1: Slight fault: No recommendation
Level 2: Moderate fault: Monitor for changes. Consider risks of failure, availability of spare parts, upcoming shutdowns etc. Begin to plan the repair.
Level 3: Serious fault: Plan repair for the near future
Level 4: Extreme fault: Shut down machine
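One way to keep the severity definitions consistent between analysts and planners is to store them in a single place and build every report line from them. The sketch below encodes the four levels above. The class names, the report wording and the machine name "Pump 3" are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SLIGHT = 1
    MODERATE = 2
    SERIOUS = 3
    EXTREME = 4


# Recommendations shortened from the severity scheme in the text.
RECOMMENDATION = {
    Severity.SLIGHT: "No recommendation",
    Severity.MODERATE: "Monitor for changes; begin to plan the repair",
    Severity.SERIOUS: "Plan repair for the near future",
    Severity.EXTREME: "Shut down machine",
}


@dataclass
class Finding:
    machine: str
    fault: str
    severity: Severity

    def report_line(self) -> str:
        """Diagnosis plus recommendation: the 'actionable information'."""
        label = self.severity.name.capitalize()
        return (
            f"{self.machine}: {label} {self.fault} "
            f"(level {self.severity.value}). "
            f"Recommendation: {RECOMMENDATION[self.severity]}."
        )


finding = Finding("Pump 3", "motor bearing wear", Severity.MODERATE)
print(finding.report_line())
# Pump 3: Moderate motor bearing wear (level 2). Recommendation: Monitor for changes; begin to plan the repair.
```

Because the recommendation is looked up rather than typed by hand, a level 2 finding always carries the same agreed meaning, whoever writes the report.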
Many analysts prefer to wait until a problem is really bad before they report it. This is because they want to be absolutely sure the problem exists and they want to make sure the machine is not repaired earlier than necessary. This behavior is contrary to the goal of providing early warning to planners so they can plan better. On the other hand, some planners will receive a report with a low priority and will schedule the repair right away because they have not been trained to understand the meaning of the severity levels. Optimally, everyone should have access to the same information and everyone should understand how to interpret the severity levels. In other words, report early with a low severity and train the people receiving the reports in how to interpret them.
The amount of detail in the report will depend on who is receiving it. If an outside service provider is providing reports to the maintenance department, the report might not only have a diagnosis such as “moderate bearing wear” but also the evidence that suggested that fault. This might include appropriate plots or trends and a description of why the conclusion was made. However, you don’t want to give too much detail to people who are not interested in it or who cannot understand it. The thicker the report or the harder it is to find the important information, the more likely it is to be ignored. One problem facing all of us in this information age is information overload, so make sure each report contains only what is absolutely necessary to the person receiving it, and understand that you might need to create different reports for different individuals.
It is also important to consider the “how” and “when” of reporting. How is the report transmitted to the person? When do they get it and how does this align with the goals of the program? When it comes to the “how” it is important to ask if the report is passive or active. Dropping a paper report on someone’s desk is “passive” because the person may or may not get around to reading it. If the report arrives by way of email or via a software package that requires an acknowledgement then you will know your message has been received. As for the “when”, it depends somewhat on the severity of the problem and the rate at which it can progress. A very serious problem cannot wait for an end-of-month review. On the other hand, it makes sense to coordinate reporting or review with other planning activities.
Reports are also helpful to the analyst. In most cases you are trending faults as they progress over time, so don’t look at your data every month as if it were the first time you have seen it. Instead, start your analysis by looking at your last report. Your software should have a convenient method of displaying the prior report alongside the new data.
Report procedures should be audited. It is a good idea to occasionally sit down with all of the stakeholders and make sure that everyone is on the same page regarding the issues just raised. It is also important to find out if the reports are valued or not. Too often, people in a plant do things because it is their job, and that job might be presenting vibration reports to managers or planners. But if the people receiving the reports do not actually act on them or find them valuable, then resources are being wasted. Either the reports need to be presented differently or the people receiving them need to be educated about their usefulness.
Lastly, reports should be audited for accuracy. What types of problems are being reported? How much misalignment vs unbalance vs bearing wear? What are the severities of the problems being reported? Are defects being discovered at an early enough stage? What percent of the diagnoses are correct? How many failures were missed entirely? All of these are important questions that should be answered in a formal way and as part of an ongoing process. Right follow up and review also needs to be an integral part of the program.
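The audit questions above can be answered from a simple log of past diagnoses. The following sketch assumes each diagnosis is recorded as a fault type, a severity level and whether the call was later confirmed. The record layout, the function name and the sample history are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# (fault type, severity level 1-4, confirmed?) where confirmed is True or False
# once a repair or inspection verified the call, or None if not yet verified.
Diagnosis = Tuple[str, int, Optional[bool]]


def audit_summary(diagnoses: List[Diagnosis], missed_failures: int) -> Dict[str, object]:
    """Summarize what is being reported and how often it turns out to be right."""
    fault_mix = Counter(fault for fault, _, _ in diagnoses)
    severity_mix = Counter(level for _, level, _ in diagnoses)
    verified = [ok for _, _, ok in diagnoses if ok is not None]
    percent_correct = 100.0 * sum(verified) / len(verified) if verified else None
    return {
        "fault_mix": dict(fault_mix),
        "severity_mix": dict(severity_mix),
        "percent_correct": percent_correct,
        "missed_failures": missed_failures,
    }


# Made-up history for illustration only.
history = [
    ("misalignment", 2, True),
    ("unbalance", 1, None),
    ("bearing wear", 2, True),
    ("bearing wear", 3, False),
    ("bearing wear", 2, True),
]
summary = audit_summary(history, missed_failures=1)
print(summary["fault_mix"])        # {'misalignment': 1, 'unbalance': 1, 'bearing wear': 3}
print(summary["percent_correct"])  # 75.0
```

Unverified diagnoses are left out of the accuracy figure rather than counted as right or wrong, which is itself a choice worth agreeing on with the stakeholders.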
But right understanding, analysis and reporting are only a part of the puzzle. In order to have a successful program, one needs to have all ten components in place, including right goals, right people, right leadership, right tools, right data collection, right follow-up and review, and right processes and procedures. You can read about all of these and more in Alan Friedman’s Audit it. Improve it! Getting the Most from Your Vibration Monitoring Program.