The Expert Talk: Predictive Analytics
Predictive Analytics in the area of Equipment Reliability has been a key focus area within the Enterprise Asset Management (EAM) practice at Infosys. Thought papers, frameworks and real-time project experience have reinforced our knowledge of the subject, but they have not limited our pursuit of deeper expertise in this domain. Here is one of many such pursuits: our interaction with Yuri Gogolitsyn. Read on as Yuri takes us through some untraversed areas within the domain of Predictive Analytics.
About The Expert
Yuri Gogolitsyn is an experienced EAI Technical Architect and Consultant who has worked on numerous multinational projects and has substantial hands-on experience with many leading integration technologies, with exposure to real-time assignments involving predictive analytics. He is based in the UK and, before moving into professional IT, did brain research involving statistical processing of the brain's electrical signals.
Welcome to the first edition of Expert Talks; it is good to have you with us today. Could you please share with us your tryst with Predictive Analytics?
Thank you! First of all, I think that Predictive Analytics is still much more a research area than a set of tools or products capable of providing quick and immediate solutions to emerging requirements in various industries.
Quite a while ago, before moving into professional IT, I was doing scientific research in the area of brain science. My main interests were, to state it very briefly, in using statistical methods to detect and evaluate the brain's electrical responses to various stimuli (pictures, words etc.). As a rule, the responses were tiny and buried in the unavoidable background variations and noise. The goal was to obtain statistical proof that the response actually exists at all and to provide some estimate of the extent to which it is consistently repeated when you present the same stimulus under the same conditions. This is just one example from the more general area of Pattern Recognition. I believe that Predictive Analytics belongs to the same general area. This area is really huge, both in terms of the classes of problems it deals with and the methods used.
Brain science and its relation to statistics seems fascinating! Do you think we can relate this concept to an Equipment response as well?
Yes, in fact the concept finds its application in a wider scope. Look around and you will see the science of correlations in all possible aspects of things. For example, when analyzing large amounts of data on the contents of supermarket baskets, researchers found that when a person buys diapers, there is a good probability that this person will also buy some beer. It was a bit of a surprise! An immediate pragmatic recommendation from the study would be to keep both items close together on the shelves to make it more convenient for the customers. However, the researchers did some more digging and found an explanation for this unusual effect. It turned out to be due to young fathers whose wives ask them to buy diapers for the baby on the way home from work. It often implies that the husband expects to spend the evening at home with his family, so he also buys beer for himself.
An example from a completely different area - the width of the annual growth rings on the tree stumps strongly correlates with the annual number of fatalities from heart attacks. However, there is no causal relationship between the two observed variables in this case. The actual driving mechanism underlying this effect is the annual variations in the solar activity.
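The idea of two variables moving together only because a third one drives both can be illustrated with a small simulation. The sketch below uses purely synthetic numbers, not real measurements of tree rings, heart attacks or solar activity; the variable names and noise levels are assumptions chosen only for illustration. It shows a strong raw correlation that largely disappears once the common driver is controlled for with a partial correlation.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic common driver (stands in for "solar activity").
driver = [rng.gauss(0, 1) for _ in range(500)]
# Two observed variables that each depend on the driver plus independent noise.
ring_width = [d + 0.5 * rng.gauss(0, 1) for d in driver]
fatalities = [d + 0.5 * rng.gauss(0, 1) for d in driver]

raw = pearson(ring_width, fatalities)
controlled = partial_corr(ring_width, fatalities, driver)

print(f"raw correlation:        {raw:.2f}")
print(f"controlling for driver: {controlled:.2f}")
```

In this toy setup the raw correlation is high while the controlled one is close to zero, which is the signature of a shared driver rather than a causal link between the two observed variables.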
In the context of equipment, an example would be to record the response against parameters such as load, pressure, rotations per minute etc. and try to correlate it with the failure pattern. With the objective of optimizing equipment performance, one can study these specific parameters and try to steer them towards a safer zone. This way, we essentially work on need-based maintenance: we know whether a failure is imminent and can avoid the pitfall of overdoing the maintenance activity.
How would you describe Predictive Analytics to a novice in this field?
The logic underlying Predictive Analytics could be outlined as follows. A combination of parameters is repeatedly measured for a system under observation. At some moment in time an important event occurs due to an unknown reason - the system noticeably changes its behavior in some way (e.g., breaks or stops functioning). Over a substantial period of observation a large volume of data on the values of parameters that precede the important event's occurrence has been accumulated. The question to answer is to what extent it is possible to predict that the important event is imminent by looking at the current values of the measured parameters.
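A minimal sketch of this logic, under stated assumptions: the monitored parameter (a vibration reading), the threshold of 4.5 and all the values in `history` are hypothetical and invented purely for illustration. Each observation records a parameter value and whether the important event followed within some chosen horizon; the code then compares how often the event followed low readings versus high readings.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    vibration_mm_s: float  # hypothetical monitored parameter
    failed_soon: bool      # did the important event follow within the horizon?

def failure_rate(observations, low, high):
    """Fraction of observations with low <= reading < high that preceded a failure.

    Returns None when no observation falls in the band.
    """
    in_band = [o for o in observations if low <= o.vibration_mm_s < high]
    if not in_band:
        return None
    return sum(o.failed_soon for o in in_band) / len(in_band)

# Hypothetical accumulated history (invented values, not real equipment data).
history = [
    Observation(2.1, False), Observation(2.4, False), Observation(3.0, False),
    Observation(3.2, False), Observation(2.8, False), Observation(4.8, False),
    Observation(5.1, True), Observation(5.5, True), Observation(6.0, True),
    Observation(5.9, False),
]

low_band = failure_rate(history, 0.0, 4.5)    # readings below the assumed threshold
high_band = failure_rate(history, 4.5, 10.0)  # readings at or above it

print(f"failure rate below 4.5: {low_band:.2f}")
print(f"failure rate from 4.5:  {high_band:.2f}")
```

A clear gap between the two rates would suggest the parameter carries some predictive power; with only a handful of events, as here, such a gap proves very little, which is exactly the repeatability concern.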
One fundamental aspect should be stressed here - the repeatability of the important event. It is impossible to predict events that are unique or occur very rarely indeed - statistical methods just do not work under such scenarios. On a lighter note, this is nicely illustrated by the following joke.
A University professor is conducting a seminar on telekinesis. He explains to his students that telekinesis is an ability to move objects using just one's will power and says:
- Let's now all close our eyes, concentrate for one minute and try moving ourselves outside this room into the corridor.
A minute later they open their eyes and are very surprised to see that one person is missing! The professor, no less stunned than his students, asks them for comments. One of the students is doing a course in statistics. He says:
- I am not sure you would be able to prove that this effect is significant using statistical methods...
Which industry, in your view, has the greatest need for Predictive Analytics?
Everyone would like to know the future! The quality of prediction benefits from careful statistical analysis of the available data. Unfortunately, it is often the case that even very large volumes of data do not allow prediction with any usable degree of confidence, because we do not know whether the parameters we are monitoring actually have the required predictive power. Success is not guaranteed when you take on a prediction task. A very good example in this respect is the long history of attempts to predict earthquakes and volcanic eruptions. We are still very far from where we would ideally like to be in this area. You really do not need to put this in an industrial perspective.
Based on your experience, could you please tell us about the tools and software widely used in the field of Predictive Analytics? Is there a best-of-breed solution available?
There is a huge number of packages available for statistical data analysis. You can do a lot in Excel, for example, regression models. You can try machine learning algorithms or even neural networks. In addition to this, there are online courses available on the latest analytics tools such as DataStream, Hadoop etc., which can be tried as well. However, I believe that the tools should be chosen after considering the nature of the problem in detail. You should decide on the approach first, and then pick the right tool. Also, working in Predictive Analytics requires a very good understanding of statistical methods and models.
You mentioned models; could you please elaborate on this? Is there a best-of-breed model that one can pursue in this regard? In your view, what are the key factors that ensure the accuracy of the analysis?
To make a prediction you need a model. A model here is a very general concept. Depending upon the approach and techniques you use, the model could be explicitly presented as a formula (e.g., regression models) or, as in neural networks, not be directly visible but embedded in the structure of connections between the neurons in the network. The outline of the general approach used in Pattern Recognition is as follows. Use some part of the data to build a model. Then test the validity of your model by feeding it the data from the other part. The second step shows how good your model is.
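The build-then-test outline can be sketched in a few lines. This is a minimal illustration, assuming a simple straight-line regression model and a small invented data set; the pairs in `data`, the interleaved split and the choice of mean absolute error as the quality measure are all assumptions made for the example, not a prescribed method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and observed values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical (input, response) pairs, invented for illustration.
data = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.1),
        (6, 12.9), (7, 15.2), (8, 16.8), (9, 19.0), (10, 21.1)]

# Step 1: use one part of the data to build the model.
train = data[::2]
# Step 2: keep the other part aside to test it.
test = data[1::2]

model = fit_line([x for x, _ in train], [y for _, y in train])
train_mae = mean_abs_error(model, [x for x, _ in train], [y for _, y in train])
test_mae = mean_abs_error(model, [x for x, _ in test], [y for _, y in test])

print(f"slope={model[0]:.2f} intercept={model[1]:.2f}")
print(f"error on data used to build the model: {train_mae:.3f}")
print(f"error on held-out data:                {test_mae:.3f}")
```

The error on the held-out part is the number that matters: a model that looks good on the data it was built from but poor on unseen data has not captured anything usable for prediction.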
In addition to this, the data to be analyzed needs to represent an actual behavioral pattern or trend that can be analyzed using a statistical model, forming a basis for drawing meaningful conclusions. It is therefore essential to gather data from a real scenario.
The data gathering aspect is becoming more promising as we move towards the Internet of Things. Utility companies have now started offering home hubs that enable their domestic customers to monitor energy consumption and control home appliances remotely from smartphones, say, switching on the heating some time before arriving home. In fact, Infosys has already been involved in the integration aspects of one such project.
Everyone is talking about the transition to Strategic Maintenance and Prescriptive Maintenance practices lately. What are your thoughts on this?
If we are talking about Prescriptive Maintenance of some expensive equipment in utilities etc., I think that the organizations that should look in this direction are the companies that actually make the equipment. They are in the best position in terms of being able to collect vast amounts of data from many installed pieces of this equipment. They should also have a better understanding of what needs to be monitored. This increases the chances of success.
I am a bit skeptical about quick success in scenarios like "It costs me a lot to maintain my three expensive gadgets/widgets, and one of them failed recently causing me a lot of problems. How nice would it be to use the Predictive Analytics to warn me when one of my gadgets is close to failure? Those guys need to tell me what exactly I should start monitoring. I am sure there are some best practices somewhere".
So quick results are a challenge. What other challenges do you think one may face while approaching a Predictive Analytics solution?
From just a task it may develop into a serious research project that starts consuming all your time. Do not expect readily available best practices and universal recipes. You will need to understand a lot about the target process. It takes time and many iterations until (with a substantial degree of luck) you arrive at something usable. Furthermore, the most common pitfall I would suggest any analyst be wary of is generalizing an asset class; generalizing an asset class across domains is not advisable either. Another common problem I have seen companies struggle with is having a huge set of data and no clue what to do with it. A predictive analytic model cannot be generic; it differs case by case. For predictive analysis in Asset Management, each asset's information needs to be viewed individually and the asset-specific predictive factors determined accordingly.
What are your thoughts on the heavy investments that this area entails, something Predictive Analytics is infamous for?
I need to make it clear that I am on the side of the skeptics in relation to Predictive Analytics, those who tend to believe that the number of scenarios where it is potentially possible to provide a prediction with a reasonable degree of confidence is rather small, definitely much smaller than the number of scenarios where it is not possible. The best negative example we all know about is the prediction of share prices.
The investments in this area should probably be considered as spending on research and development. The usual considerations are valid here: the investments are heavy indeed, and in no way do they guarantee the desired solution. However, I think that the beneficial side of heavy investments in research is clear: it may lead to better technologies, algorithms etc. that would have much wider usage and substantial benefits.
Besides investments in research, do you think there are other areas of higher spending that an organization should watch out for?
Certainly the investment costs are higher, and early adopters of predictive analytics will certainly face challenges in justifying the cost. The investments could range from acquiring instrumentation controls and analytic tools to training the company personnel who need to use the technology, decipher the results and act on them. However, looking at the advantages in terms of catching a failure before it causes irreparable damage, the investments seem promising.
What would be your advice to organizations attempting to go the Predictive Analytics route?
Be ready for a trial-and-error approach, albeit at a smaller scale, and have some experts who are well qualified in statistical computation. I have often seen companies providing research grants to universities; there is a cost advantage to this. Collaborating with equipment manufacturers also helps, as they bring consolidated data that covers the full expanse of operational scenarios, which is a must in predictive analytics. Manufacturers of equipment and of critical components (e.g. bearings, bushes) play a key role and should be partnered with on the Predictive Analytics journey. Every piece of equipment is different and unique, so developing a predictive analytical model can turn out to be very time-consuming and costly at times.
Above all, you need immense patience to succeed in this domain. Never expect to master the art, and do not expect radical results. Taking things one at a time would help, and yes, all the best!