From Graveyards to Goldmines: Leveraging the Compliance Data Challenge

The amount of data wastewater treatment plants generate is increasing exponentially. Here's how utilities can turn it into an asset.


Over the past 30 years, the amount of data wastewater treatment plants typically generate has increased exponentially. Systems for storing and organizing that data have struggled to keep up.

In many cases, water utilities have come to see the sheer amount of data on file as a liability rather than an asset: something that needs to be constantly monitored, corralled, and inevitably pushed to the side.

However, as some researchers have noted, that massive pile of data overwhelming your water utility has the potential to become a goldmine.

When data is organized and shared effectively, it becomes a powerful tool for upgrading the efficiency and performance of your wastewater treatment system, anticipating disturbances before they happen, and adapting to stricter standards of compliance.

The Explosive Growth of Water Treatment Data

One study showed that a single large wastewater treatment plant, serving 800,000 to 3 million people, can generate up to 30,000 data points. These include everything from sampling data essential for reporting compliance and meeting environmental regulations, to GPS coordinates, call logs, field notes, and more.

Such a large volume of data has an impact on individual personnel. For example, a single employee at a large water utility is often responsible for overseeing more than 40,000 backflow prevention devices, each of which generates annual inspection data. The system used to organize such data has an enormous impact on that employee’s day-to-day job, affecting their ability to share information, file reports, and ensure compliance.

On its own, the overarching project of compliance—particularly tracking permits—represents an enormous task. One water utility uses Klir to manage over 3,000 permits, a task that would be daunting without Klir’s fully configurable data management systems.

As Lluís Corominas, a researcher at the Catalan Institute for Water Research, writes:

“Plant operators have an overwhelming stream of data at their hands, which is very difficult to process and analyze in a timely enough fashion to allow for better understanding or proper decision-making.”

The earliest tremors of this explosion of data generation can be traced back to the 1970s, when one of the hottest topics at international wastewater treatment conferences was data collection from sensors.

The sensors being used were adapted from other industries and ill-fitted for use in wastewater treatment systems, but attendees were already discussing the best ways to automate the collection and management of data in their plants.

The same report lists four primary reasons why managing water treatment data (a field referred to as instrumentation, control, and automation [ICA]) has since become such an enormous task:

  • Effluent quality standards, which became more demanding and complex
  • Economic factors, which encouraged water utilities to develop automated, money-saving compliance management tools that generated more data than prior solutions
  • Plant complexity, one of the most important driving factors, which increased as methods of water treatment advanced
  • Improved tools, such as advanced remote sensors, which generated more data for water utilities to manage

With such a large amount of information to deal with, one of the most important tools at a water utility’s disposal is data centralization.

The Importance of Data Centralization

Utilities are increasingly data-rich but information-poor. As Corominas notes, a large number of utilities have become host to “data graveyards,” massive stores of data that cannot be easily navigated or accessed.

The data graveyard is a sort of invisible weight burdening a water utility, demanding resources to be maintained, causing a constant drain on time and money, but rarely producing outright catastrophic effects.

Individuals may be forced to enter the graveyard on a regular basis to dredge up information for renewing permits, for instance, or to confirm the status of different backflow devices. But each of these is simply a slow, laborious task, one that creates drag on standard processes without ever pushing them to their breaking point.

The cumulative effect of the data graveyard may be huge, but it’s difficult to see. That’s especially the case when pieces of it are owned by different individuals and teams, or scattered across multiple disconnected databases.

If the cumulative effect of a data graveyard is difficult to grasp, its potential for good may be even more elusive. Your water utility could have a huge amount of data on hand that might be leveraged to speed up and improve processes, anticipate problems, and plan for the future. But so long as it’s a fragmentary mess and a headache to access, its potential is impossible to realize.

The first step in converting your data graveyard to a goldmine is centralizing it. Bringing all your data together in one place, under one administrative dashboard, lets you assess its potential.
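To make the idea concrete, here is a minimal sketch of what centralizing can look like at its simplest. It assumes two hypothetical exports (a permit spreadsheet and a field inspection log) that share an asset ID; the record names, field names, and sample values are all illustrative, not taken from any real utility or from Klir's product.

```python
from collections import defaultdict

# Hypothetical rows standing in for exports from two disconnected systems.
permit_rows = [
    {"asset_id": "BF-001", "permit": "P-17", "expires": "2025-03-01"},
    {"asset_id": "BF-002", "permit": "P-22", "expires": "2024-11-15"},
]
inspection_rows = [
    {"asset_id": "BF-001", "inspected": "2024-02-10", "result": "pass"},
    {"asset_id": "BF-003", "inspected": "2023-12-05", "result": "fail"},
]


def centralize(**sources):
    """Merge rows from several named sources into one index keyed by asset_id.

    If a source holds more than one row for an asset, the last row wins.
    """
    index = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            fields = {k: v for k, v in row.items() if k != "asset_id"}
            index[row["asset_id"]][source_name] = fields
    return dict(index)


def missing_from(index, source_name):
    """List assets that have no record from the given source."""
    return sorted(a for a, rec in index.items() if source_name not in rec)


central = centralize(permits=permit_rows, inspections=inspection_rows)
print(missing_from(central, "inspections"))  # assets with a permit but no inspection
print(missing_from(central, "permits"))      # assets inspected but with no permit on file
```

Even in this toy version, gaps that were invisible while the records sat in separate systems show up as soon as everything is keyed to one index.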

The best tool for the job is a comprehensive software as a service (SaaS) solution. Learn more about why SaaS makes sense for water.

Once your data is centralized and easier to navigate, it’s ready to be mined.

Gold Mining for Data

To push the metaphor to the breaking point, once you’ve converted your data graveyard into a goldmine, it’s time to start mining for gold.

“Mining for gold,” in this sense, means converting raw data into information—becoming both data-rich and information-rich. The biggest opportunities for leveraging data into information fall under three categories: machine learning, improvement of remote and real-time monitoring, and increased collaboration.

The Increasing Promise of Machine Learning

Increasingly, machine learning shows potential to have a huge impact on how water utilities leverage their data to improve operations.

Machine learning is, in brief, the process of using computers to analyze large amounts of data, discover patterns, and use those patterns to make predictions, solve problems, and answer questions.

Already, machine learning has been applied to water utility data in order to track the spread of COVID-19, reduce energy usage, and detect compliance violations.

Machine Learning and Wastewater

By testing wastewater samples, infectious disease experts are already able to predict upsurges in COVID-19 infections three to seven days before standard swab testing does the same.

That makes wastewater a window into COVID infection rates among particular populations—provided you have the tools to examine the data accurately.

While current systems for monitoring COVID via wastewater suffer some gaps in information—partly due to reduced detectability in people who have been vaccinated—machine learning has shown promise when it comes to predicting upsurges and tracking COVID’s spread.

What’s more, similar techniques can be used to track other viruses, such as norovirus and polio. You can learn more from our article on wastewater-based epidemiology.

Improving Remote and Real-Time Monitoring Capabilities with Water Data

COVID-19 lockdowns around the world fast-forwarded a general trend, across many industries, towards remote-first work policies. The lockdowns also drove home just how important it is for organizations to be able to access and manage their data remotely.

In this sense, water utilities were ahead of the curve: Many utilities already remotely manage thousands of infrastructure assets using sensors, controllers, and transmitters.

That remote capability is wasted, however, if data is fragmented: stored locally on a variety of media (hard drives, thumb drives, backup devices, etc.) and accessible only to particular teams or individuals.

Even utilities that stored data in a centralized fashion on their own local servers faced problems when moving to remote working arrangements, as personnel encountered technical barriers to accessing the organization’s intranet from offsite computers.

A cloud-based software-as-a-service (SaaS) platform is the best solution for utilities that want to make their data available to all relevant personnel, regardless of their locations, at all times.

With the help of such a system, a water utility can:

  • Cut down on work-related travel and site visits
  • Put in place more accurate and effective alert and notification systems
  • Shorten response times when issues arise
  • Scale new operations quickly across the organization
  • Respond nimbly to staffing shortages or future lockdown situations

How Water Utilities Can Use Machine Learning to Reduce Their Electric Bill

In Singapore, the Ulu Pandan Water Reclamation Plant used machine learning to analyze its operational data and was able to reduce aeration energy usage by 15%.

Instead of using reactive control mechanisms, which adjust wastewater treatment processes in reaction to changing nutrient levels, flow rates, etc., the machine learning algorithm in use at Ulu Pandan creates predictive models, making fine adjustments to the system earlier than a reactive controller would.

Effectively, the automated systems at the treatment plant spend less energy playing catch-up with changing conditions—opting, instead, to literally “go with the flow.”
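The difference between reacting and predicting can be illustrated with a toy example. This is only a sketch under stated assumptions, not the algorithm used at Ulu Pandan: the load values are made up, the "predictive" controller is a naive trend extrapolation standing in for a trained model, and the gap between the aeration setpoint and the actual load is used as a rough proxy for wasted effort.

```python
def reactive_setpoints(load):
    """Set aeration for each step from the load observed in the previous step."""
    return [load[0]] + load[:-1]


def predictive_setpoints(load):
    """Set aeration from a naive forecast that extends the last observed trend."""
    setpoints = [load[0], load[0]]  # no trend available yet, so behave reactively
    for t in range(2, len(load)):
        forecast = load[t - 1] + (load[t - 1] - load[t - 2])
        setpoints.append(forecast)
    return setpoints[:len(load)]


def total_mismatch(setpoints, load):
    """Sum of the gaps between what was supplied and what was actually needed."""
    return sum(abs(s - l) for s, l in zip(setpoints, load))


# Hypothetical ammonia load (arbitrary units) that rises steadily, then falls.
load = [10, 12, 14, 16, 18, 16, 14, 12]

reactive_gap = total_mismatch(reactive_setpoints(load), load)
predictive_gap = total_mismatch(predictive_setpoints(load), load)
print(reactive_gap, predictive_gap)
```

On this made-up series the reactive controller is always one step behind, while the forecasting controller only misses at the turning point. Real plant data is far noisier, which is where a learned model earns its keep over simple extrapolation.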

Detecting Violations of the Clean Water Act (CWA) With Machine Learning

In theory, some water treatment facilities are more likely to violate the CWA than others—there’s just no way to know which ones. Unless you apply machine learning to the task, that is.

In 2018, researchers from Stanford demonstrated that machine learning could be used to predict the likelihood of particular water treatment facilities violating the CWA. In theory, with that information, inspectors could be sent to the facilities most likely to be in violation of the CWA, rather than to facilities with a very low likelihood of being found non-compliant.

As their paper in Nature Sustainability demonstrates, using such a system can double the number of violators caught, while allocating inspection resources more effectively.
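The targeting step itself is simple once risk scores exist. The sketch below is not the Stanford researchers' model; it assumes a model has already produced a violation probability for each facility, and the facility names, scores, and inspection budget are all hypothetical.

```python
# Hypothetical facilities with a model-estimated probability of violation.
facilities = {
    "Plant A": 0.05,
    "Plant B": 0.62,
    "Plant C": 0.11,
    "Plant D": 0.48,
    "Plant E": 0.03,
    "Plant F": 0.77,
}


def prioritize(risk_scores, budget):
    """Return the `budget` facilities with the highest estimated risk."""
    ranked = sorted(risk_scores, key=risk_scores.get, reverse=True)
    return ranked[:budget]


def expected_violations(risk_scores, chosen):
    """Expected number of violations found, if the scores were accurate."""
    return sum(risk_scores[name] for name in chosen)


targeted = prioritize(facilities, budget=2)
untargeted = ["Plant A", "Plant B"]  # an arbitrary pick, for comparison

print(targeted, expected_violations(facilities, targeted))
print(untargeted, expected_violations(facilities, untargeted))
```

With the same two inspections, the risk-ranked pick is expected to turn up roughly twice as many violations as the arbitrary one in this toy data. How well that holds in practice depends entirely on how accurate the underlying scores are.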

There’s also an element of deterrence at work: the researchers theorize that, if water treatment facilities know their data is being monitored and that a machine learning algorithm will be able to anticipate any future violations, they will be more diligent, working harder to ensure violations never occur at all.

Improving Collaboration with Centralized Water Data

Machine learning and the rise of the distributed workforce are both exciting aspects of water utility data management. In fact, they could have a major impact on the future of how water utilities operate.

But organizing and centralizing data has the most immediate impact upon a water utility’s most valuable resource: its people.

When data is accessible to all personnel, across all teams, collaboration becomes more fluid, easy, and intuitive. It’s easier for engineers, compliance professionals, operations management, and other stakeholders to take advantage of the utility’s vast store of data, and use it to everyone’s benefit.

Ready to Turn Your Compliance Data Into an Asset?

Klir’s compliance tracking tools help utilities get more out of their data while cutting down on administration and record-keeping work, create new opportunities for collaboration, and provide a level of system-wide visibility unmatched by other water data management systems. Learn more and book a demo today.
