The Infosys Utilities Blog seeks to discuss and answer the industry’s burning Smart Grid questions through the commentary of the industry’s leading Smart Grid and Sustainability experts. This blogging community offers a rich source of fresh new ideas on the planning, design and implementation of solutions for the utility industry of tomorrow.

Data Quality: The difference between OMS/DMS success and failure

When implementing an Outage Management System (OMS) or Distribution Management System (DMS), high quality data is critical for the product to do its job correctly.  GIS and CIS data are the backbone of the system.  The data represents the customers and the devices those customers are connected to.  It also provides the network connectivity and the general information dispatchers use to make decisions.  If the quality of the data is low, the ability of the OMS and DMS to be productive is low.  Visually, the OMS user must be able to clearly see the correct network information and the annotations.  Also important is mapping each customer to the correct premise and meter, which enables correct outage prediction and correct counts of customers predicted out.  In addition to CIS, the OMS and DMS depend on the GIS data model in many ways.

Without sufficient data quality, many different defects will be visible throughout the system, but they will all have the same root cause: data.  Bad data undermines OMS users' confidence in the system.  If the system shows results or information incorrectly, users will think the system is flawed when the system isn't the real problem.  Their negative feelings become a new challenge to the project's success.  Possible problems include outages predicting to the wrong location (or not at all); deenergized areas, loops, and parallels; missing critical information such as critical device attributes; incorrect voltage values (resulting in incorrect overload and violation warnings); a model that looks incorrect in the viewer; and device names that appear incorrectly in tools.

Data problems are like the proverbial onion: there are many layers.  When one data issue is "peeled away", the next layer of data problems can be seen.  To have a successful project, it is critical to start reviewing the data early and to perform reviews over multiple iterations.  It isn't necessary for all configurations to be complete in order to review data quality, so don't wait for the system to be 100% configured to start.  Data review needs to begin early in the project so that the system looks reasonable when users begin working with it.  The data must also be frozen early enough to leave time to create test scripts for System Integration Testing, because changing or updating the data model can and will affect how outages predict.  Having good data allows testing to expose functional and usability issues instead of leaving them hidden behind poor data.
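
As a rough illustration of the customer mapping check described above, here is a minimal Python sketch.  The record layout (account, premise_id, meter_id), the premise-to-device lookup, and the function name are all hypothetical; a real CIS/GIS extract will look different.  It flags customers that cannot be tied to a device and counts customers per device, which is the number an outage prediction would report.

```python
from collections import Counter

def check_customer_mapping(customers, premise_to_device):
    """Return (unmapped_accounts, customers_per_device).

    customers: list of dicts with hypothetical keys account, premise_id, meter_id.
    premise_to_device: hypothetical GIS lookup from premise id to serving device.
    """
    unmapped = []
    counts = Counter()
    for cust in customers:
        device = premise_to_device.get(cust["premise_id"])
        # A customer needs both a known premise and a meter to be counted.
        if device is None or not cust.get("meter_id"):
            unmapped.append(cust["account"])
        else:
            counts[device] += 1
    return unmapped, counts

# Made-up sample data for illustration only.
customers = [
    {"account": "A1", "premise_id": "P1", "meter_id": "M1"},
    {"account": "A2", "premise_id": "P2", "meter_id": "M2"},
    {"account": "A3", "premise_id": "P9", "meter_id": "M3"},  # premise not in GIS
    {"account": "A4", "premise_id": "P1", "meter_id": ""},    # no meter on record
]
premise_to_device = {"P1": "XFMR-100", "P2": "XFMR-100"}

unmapped, counts = check_customer_mapping(customers, premise_to_device)
print(unmapped)       # accounts that would be missing from outage counts
print(dict(counts))   # customers predicted out if each device loses power
```

Every account in the unmapped list is a customer whose call may not predict to a device and who would be left out of the customer count.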

Poor data quality shows up in several ways, and every one of them is important.  The simplest is visual appearance.  All devices must appear as expected: at the correct size, with reasonable placement, showing the proper symbol for the device, and with the correct annotations near the device.  The electronic representation of the feeder should look similar to the paper copies printed from GIS.


In addition to appearance, OMS feeder connectivity must be correct.  This is one of the most important data factors because improper energization leads to unexpected deenergization, looping, or paralleling of the network.  Deenergized areas cause outages not to predict or calls not to group together.  Power looping back around confuses the prediction engine, which then does not know where to predict an outage.  The logic used to connect the segments of a feeder can itself be bad: either extra connections are made where they are undesirable (resulting in loops/parallels) or connections fail to occur (resulting in breaks in the energization).  Another important check is to make sure correct phasing is present.  For example, a B phase only conductor should not have an A phase transformer, because the B phase would never bring power to the A phase transformer.  One of the common problems I've seen is multiphase transformers modeled on single phase conductors.  When the transformer is restored, only one phase reaches it, so the transformer and the outage don't get fully restored.  Just because data looks OK in the GIS does not mean it will in the OMS/DMS.  An OMS/DMS has greater requirements because it is doing more than showing a visual representation of the model.
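
The connectivity and phasing checks above can be sketched in a few lines of Python.  This is a simplified illustration, not how any particular OMS/DMS product traces its model: it assumes the feeder is a plain list of node pairs with every switch closed, and the function names and sample data are made up.  A breadth-first trace from the source finds what is energized, anything not reached is deenergized, and any connection that closes a second path is reported as a loop/parallel.

```python
from collections import deque

def trace_feeder(source, edges):
    """Trace from the source; return (energized_nodes, loop_edges)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    parent = {source: None}
    loop_edges = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
            elif parent[node] != neighbor:
                # Already energized by another path: a loop or parallel.
                loop_edges.add(frozenset((node, neighbor)))
    return set(parent), loop_edges

def phase_mismatches(transformers, conductor_phases):
    """Names of transformers needing a phase their conductor does not carry."""
    return [
        name
        for name, (conductor, phases) in transformers.items()
        if not set(phases) <= set(conductor_phases.get(conductor, ""))
    ]

# Made-up sample feeder: N1-N2-N3 form a loop, N4-N5 are never connected.
edges = [("SUB", "N1"), ("N1", "N2"), ("N2", "N3"), ("N3", "N1"), ("N4", "N5")]
energized, loops = trace_feeder("SUB", edges)
all_nodes = {node for edge in edges for node in edge}
print(sorted(all_nodes - energized))     # ['N4', 'N5']
print([sorted(edge) for edge in loops])  # [['N2', 'N3']]

conductor_phases = {"C1": "ABC", "C2": "B"}
transformers = {"T1": ("C1", "ABC"), "T2": ("C2", "A"), "T3": ("C2", "ABC")}
print(phase_mismatches(transformers, conductor_phases))  # ['T2', 'T3']
```

T2 is the A phase transformer on a B phase only conductor, and T3 is the multiphase transformer on a single phase conductor.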


Dispatchers are the primary OMS users.  To do their jobs, dispatchers depend on the GIS and CIS data built into the OMS/DMS.  The system needs correct data to enable quick and correct decisions.  Data reviews need to occur before dispatchers get too hands-on with the system; that way bad first impressions are avoided.  The visual representation of the network must be high quality so they can trust what it tells them.  When they need more information, such as the attributes of a device, that information must be clear and correct.  When they need to know how many customers are affected by an action, the count must be correct so that the impact of the action can be properly weighed.  When they look at their current outages dashboard, a map viewer, or other tools, they want to see correct information such as the feeder name and the name of the affected device.  Proper logic behind the scenes must be put in place and the results verified.  If the data is missing because the logic that populates the attribute information was incorrect, or because it was never present in the GIS, the dispatcher can't get the information they need.
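
One way to verify attribute results is a simple completeness report run against the built model.  The sketch below is only an illustration: the device classes, attribute names, and the REQUIRED_ATTRIBUTES table are assumptions, and each project would define its own list of what dispatchers need to see.

```python
# Hypothetical list of attributes dispatchers need, per device class.
REQUIRED_ATTRIBUTES = {
    "transformer": ["name", "feeder", "kva", "phases"],
    "switch": ["name", "feeder", "normal_state"],
}
DEFAULT_REQUIRED = ["name", "feeder"]

def missing_attributes(devices):
    """Map device id -> list of required attributes that are absent or blank."""
    problems = {}
    for dev in devices:
        required = REQUIRED_ATTRIBUTES.get(dev["class"], DEFAULT_REQUIRED)
        missing = [attr for attr in required if dev.get(attr) in (None, "")]
        if missing:
            problems[dev["id"]] = missing
    return problems

# Made-up sample devices for illustration only.
devices = [
    {"id": 1, "class": "transformer", "name": "T-1", "feeder": "F12",
     "kva": 50, "phases": "B"},
    {"id": 2, "class": "switch", "name": "", "feeder": "F12"},
]
print(missing_attributes(devices))  # {2: ['name', 'normal_state']}
```

A report like this does not say whether the build logic or the GIS source is at fault, but it tells the reviewer where to look before a dispatcher finds the gap.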


To mitigate the data risks, I recommend a structured approach to reviewing OMS/DMS data.  In my experience with prior projects, doing a series of data reviews with two different (but similar) sets of data increases data quality.  For the first review, use a small data model that contains at least one of every object to be modeled.  The point is to make sure the build can create each object class (whether electric or not) and to eliminate systemic problems, such as devices that build incorrectly or improper device naming.  Looking for localized issues can wait until the next review.  If a systemic problem is left in place, it will appear every time that device appears.  Fixing it gets rid of many errors all at once and makes it easier to recognize when a data issue is a one-off.  Many devices of the same class are built throughout the model, and if a device builds successfully once, the rest of its class will build the same way.  Having a small model for reviews is important because the intent is to rebuild the model after making fixes to verify that the fixes worked.  This will occur 3 times or until the remaining issues are small in number.  To do this in a reasonable time period, the model must be small.  Large, complete models take too much time to rebuild efficiently.  To achieve these goals, it is perfectly acceptable to fictionalize data and break real world rules to ensure every device, including those that appear infrequently, is present for the first dataset review.  For the purpose of this dataset, go ahead and put an automatic throw-over device in a place where one doesn't naturally appear.  The important thing is to make sure it builds correctly.
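
Before starting the first review, it helps to confirm that the small dataset really does contain at least one of every object class.  Here is a minimal sketch of that check, assuming a hypothetical list of expected class names and a list of built objects that each carry a "class" field; neither comes from a specific OMS/DMS product.

```python
from collections import Counter

def coverage_report(expected_classes, built_objects):
    """Return (count_per_class, classes_with_no_built_object)."""
    built = Counter(obj["class"] for obj in built_objects)
    missing = sorted(cls for cls in expected_classes if built[cls] == 0)
    return built, missing

# Made-up class list and build output for illustration only.
expected_classes = ["breaker", "fuse", "recloser", "transformer",
                    "automatic_throw_over"]
built_objects = [
    {"class": "breaker", "name": "BKR-1"},
    {"class": "fuse", "name": "FU-7"},
    {"class": "transformer", "name": "T-1"},
    {"class": "transformer", "name": "T-2"},
]

built, missing = coverage_report(expected_classes, built_objects)
print(missing)  # ['automatic_throw_over', 'recloser']
```

Any class in the missing list either failed to build or was never placed in the dataset, so it needs a fictional instance added or a build fix before the review can sign off on it.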


Once the first dataset review is complete, it's time to move on to the second dataset.  This dataset is also small, but this time only real data is used.  Using a small dataset is still important because, as with the first dataset, the model is repeatedly rebuilt after fixes are made.  The dataset should be big enough to use for functional and integration testing, but no bigger.  This review assumes the first set of reviews was completed and devices now build correctly.  Now we are looking for connectivity issues, individual data errors, and other problems that go beyond how a device builds.  Pick a representative area of the model, about 2-4 substations in size.  Build it, and begin the review.  This is where problems in how connections are made and other topology problems are found.  These fixes are also made in iterations.  While this is not the full model, fixes here often apply to the whole model.  Seeing the kinds of problems found here also provides insight into the potential problems to look for in the rest of the model.


Since data is the backbone of any OMS, having the highest quality data possible is critical to success.  In my experience, the methodology described above leads efficiently to high quality data.  If there are other data modeling tips you'd like to share, I'd enjoy hearing them.