Data Rich, Insight Poor: Why More EHS Data Isn't Giving You Better Answers (DMA Series Part 4)

The difference between having data and having answers

Most EHS teams we talk to are not short on data. They have incident logs, inspection records, observation counts, training completions, audit findings, near-miss reports, JHAs, corrective actions, risk assessments, OSHA 300 logs, workers' comp claims, emergency drill records, chemical inventories, equipment maintenance logs, wearable sensor outputs, and dozens more data sources feeding into various systems every single day.

Some organizations have years of it: structured records in databases and spreadsheets, unstructured data in photos, videos, voice memos, PDFs, scanned forms, and email chains, and semi-structured data in apps, sensor feeds, and IoT platforms. All spread across EHS software, HRIS systems, operational technology, shared drives, paper filing cabinets, and the institutional, non-codified memory of people who have been there long enough to remember how things used to work.

Yet if someone asks a question that actually matters, everyone looks to their right and left:

  • "Which sites are showing early warning patterns that suggest we are heading toward a serious event?"
  • "What are the three biggest risk drivers across our operations right now?"
  • "Are we getting better or worse at closing corrective actions before the risk materializes?"
  • "Where should we be focusing capital investment to reduce our most significant exposures?"

When an inspector shows up unannounced and requests documentation for a specific incident, training record, or corrective action, the team ends up scrambling across three systems and two shared drives to produce a defensible record.

These are the questions and situations that data is supposed to be able to answer. But for most companies we start working with, answering them can take days or weeks of manual compilation, cross-referencing between systems, chasing down site leads for context, and trying to shape it all into a story that actually matters and drives real change.

This is the paradox at the center of the Data and Analytics digital maturity dimension: most organizations are somewhere between data they cannot fully trust and data that, even when trustworthy, does not move anyone to act. The gap is not just about data quality. It is about whether the data, once reliable, is framed in a way that leads to a decision.


The Compilation Tax

Before we get into what good can look like, it's worth naming the cost of the current state, because that cost has been quietly normalized across the domain.

Every month, EHS professionals across the world spend hours pulling data from one system, reformatting it, merging it with data from another system, cleaning up inconsistencies, and building a slide deck or spreadsheet that approximates reality. This is the compilation tax, that frustrating, invisible labor that sits between raw data and a usable answer.

It shows up in monthly reporting taking days instead of minutes because data has to be manually reconciled across platforms. Leadership reviews get delayed because the team is still cleaning data the morning of the meeting. Trend analysis is unreliable because definitions changed mid-year and no one documented the shift. Year-over-year comparisons require footnotes and caveats that undermine the conclusions.

It also lives in the questions that sometimes don't even get asked, because everyone already knows the answer would take too long to produce and the executive team has already moved on to the next agenda item. This is where leadership buy-in should happen, but instead it becomes a slow loss of strategic ground, buried under day-to-day friction.


Why More Dashboards Will Not Fix This

The instinctive response to this problem is to invest in better visualization. Build more dashboards, buy a BI tool, create automated reports.

Dashboards are a presentation layer and they visualize whatever is underneath them. If the underlying data has definition drift, inconsistent classification, missing context, or gaps in evidence, the dashboard will just render all of that beautifully and confidently. It will show you a trend line that looks clean and authoritative while the story behind it is full of holes. And even when the numbers are accurate, the output is often surface-level: this went up by 12 percent, that went down by 8 percent. It reads like evidence, but it does not tell anyone in the room what to actually do about it. Reporting data is not the same as producing insight.

This is not a small-company problem. We have had this exact conversation with EHS leaders at Fortune companies who have massive technology budgets, dedicated analytics teams, and years of data in their systems. And when safety data gets presented to leadership, it still comes down to percentages. This went up, that went down. The team frames it as evidence-based decision-making, but the evidence is really just a restatement of what already happened. No narrative, no direction, and no one walks away from the table knowing what to act on next. If that is the experience at organizations with that level of investment, the gap is not about spending more; it is about designing for insight from the start.

This is how organizations end up in a worse position than before. Leadership sees a polished dashboard, assumes the data is trustworthy, and starts making resource allocation decisions, setting targets, and comparing sites based on numbers that were never designed to hold that weight. And when someone finally pulls the thread and asks "where did this number come from," the answer involves three exports, a VLOOKUP, and a person who left the company six months ago.

Dashboards display whatever you put into them. And even when what goes in is accurate, insight still requires someone who can connect the numbers to a decision. Most organizations have not built the foundation for either.


The Four Conditions for Decision-Ready Data

Data becomes decision-ready when a few conditions are met:

1. Definitions are consistent and governed. What is the threshold for "high severity"? When is an investigation considered "closed"? What qualifies as a critical control versus a general safeguard? How is "serious injury potential" defined when classifying a near miss? If these definitions vary too widely by site, region, or business unit, then aggregated data is comparing things that are not the same. Your rates might be trending down, but you cannot say with confidence whether incidents are actually decreasing or whether one region just started classifying differently.

The reasonable pushback is that operations are genuinely different from one site to the next, and forcing identical data structures across all of them can feel disconnected from the work. But consistency does not require total uniformity. The goal is shared language at the level where data needs to be compared and rolled up, with enough flexibility for sites to capture what is specific to their environment. The problem arises when the differences are undocumented, unintentional, and invisible until someone tries to aggregate and realizes the numbers are not telling a coherent story.
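
One way to picture "shared language with local flexibility" is to treat definitions as governed configuration rather than per-site habit: the classification rule lives in one place, under change control, while site-specific detail lives in its own fields. The sketch below is purely illustrative; the field names, thresholds, and rule structure are assumptions, not a prescribed standard.

```python
# Illustrative only: field names and thresholds are assumptions, not a standard.
# The point is that "high severity" is decided by one governed rule, not by each site.
from dataclasses import dataclass

# Organization-wide definitions, versioned and change-controlled in one place
SEVERITY_RULES = {
    "high":   {"min_days_away": 1},   # any days away from work
    "medium": {"min_first_aid": 1},   # at least one first-aid treatment
}

@dataclass
class IncidentRecord:
    site: str
    days_away: int
    recordable: bool
    first_aid_treatments: int
    site_notes: str = ""  # local flexibility lives here, not in the definition itself

def classify_severity(incident: IncidentRecord) -> str:
    """Apply the shared definition so every site's 'high severity' means the same thing."""
    # Recordable cases are treated as high severity in this illustrative rule set
    if incident.recordable or incident.days_away >= SEVERITY_RULES["high"]["min_days_away"]:
        return "high"
    if incident.first_aid_treatments >= SEVERITY_RULES["medium"]["min_first_aid"]:
        return "medium"
    return "low"

print(classify_severity(IncidentRecord("Plant A", days_away=2, recordable=False, first_aid_treatments=0)))
# -> "high", regardless of which site entered the record
```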

2. Data is captured at the point of work. The further data gets from the moment it was created, the less reliable it becomes. When incident details are entered days after the event, when field-level inspections are completed on paper and transcribed later, when behavioral observations are batched at the end of the week, when a supervisor fills in a JHA from memory instead of at the job site, the record becomes a memory exercise instead of a reflection of what actually happened. Timeliness matters because it preserves the accuracy and context that make the data trustworthy enough to act on. The value of a record depends heavily on how close to the actual moment of work it was captured, and the closer that capture happens, the less the compilation tax has to make up for later.

3. Context travels with the record. Knowing that a site had 12 incidents last quarter means very little until you also know their exposure hours, the types of work being performed, the risk profile of active projects, whether staffing levels changed, whether a new contractor was onboarded, what controls were in place, and which corrective actions from prior events had actually been closed. If that information is trapped in systems that do not talk to each other, context has to be manually stitched together every time someone asks a question.
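
A small example of why context changes the reading of the same number: the minimal sketch below joins incident counts with exposure hours to produce a normalized rate. The table layout, column names, and figures are invented for illustration; in practice the exposure data would come from whatever system actually holds it.

```python
# Illustrative sketch: table and column names are assumptions. The point is that a raw
# incident count only becomes meaningful once context (here, exposure hours) travels with it.
import pandas as pd

incidents = pd.DataFrame({
    "site": ["Plant A", "Plant B"],
    "incidents_q3": [12, 5],
})

# Context that usually lives in a different system (HRIS, timekeeping, project data)
exposure = pd.DataFrame({
    "site": ["Plant A", "Plant B"],
    "hours_worked_q3": [480_000, 90_000],
})

combined = incidents.merge(exposure, on="site")
# Common normalization: incidents per 200,000 hours worked
combined["rate_per_200k_hours"] = combined["incidents_q3"] * 200_000 / combined["hours_worked_q3"]

print(combined)
# Plant A: 12 incidents, but about 5.0 per 200,000 hours
# Plant B:  5 incidents, but about 11.1 per 200,000 hours -- the smaller count is the bigger signal
```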

4. The data lifecycle is owned, not inherited. Someone needs to own how data enters the system, how it is validated, how definitions are maintained, how risk matrices are updated, and how changes to classification schemes are governed and communicated across the organization. This is change control applied to the data itself, and in most EHS teams this ownership is implicit at best. The system was configured years ago during the original implementation by the vendor, the person who set it up has moved on, and the current team inherited a structure they do not fully understand and are hesitant to change because they aren't sure what will break. Without deliberate ownership of the data lifecycle, quality will drift slowly in ways that aren't obvious until the data fails to hold up under scrutiny.


From Data to Story

Getting the foundation right is necessary, but it is not the finish line. Consistent definitions, timely capture, contextual records, and owned lifecycles give you data you can trust. But trusted data that sits flat on a slide still does not move anyone to act.

The difference between reporting what happened and explaining why it matters is a consistent gap across our domain. Most EHS teams can produce a monthly report that is technically accurate, well formatted, and completely forgettable. The numbers are all there, but they are not connected to anything the audience can act on. The report becomes a record of what already happened rather than a signal about what needs to happen next.

The problem here is framing. A rising average time to close corrective actions is a statistic. That same trend tied to a spike in repeat findings at two sites that both onboarded new contractors in the same quarter is a story that points directly to a resourcing and onboarding decision. The data did not change between those two versions, just the way it was presented.

Data storytelling in EHS is not about making slides prettier or adding narrative flair to a monthly report. It's about designing the output so the person receiving it can answer three questions:

  • What is actually happening?
  • Why should I care?
  • What should we do about it?

The person presenting safety data to leadership has to understand both the numbers and the operational reality behind them well enough to connect a leading indicator shift to a specific operational change and explain what it means for resource allocation, staffing, or capital investment. That translation work requires someone who has been close enough to the work to know what the data is actually saying, and who can frame it in a way that makes the risk real to people who are not on-site every day.

The organizations that do this well treat data presentation as a strategic activity, not an administrative one. And the difference between a team that gets ignored and a team that gets funded often comes down to whether they showed leadership a spreadsheet or built a case that made the risk impossible to look away from.


A Quick Diagnostic

Here is a way to pressure-test where your organization sits on this dimension and surface the specific constraints that are keeping data from becoming insight.

Can you answer a strategic question in under an hour? Pick a question leadership has asked in the last 90 days. How long did it take to produce a defensible answer? If the answer required manual compilation across multiple sources, the constraint is almost certainly somewhere in the data pipeline.

Do your definitions hold up across your business? Pull the same metric from three different sites or business units. Are they classifying and counting the same way? If the numbers look inconsistent, check whether the variance is real or definitional. More often than most teams expect, it turns out to be the latter.
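
If you want to make that check a little more systematic, one simple approach is to compare how each site distributes records across the same classification buckets before trusting a cross-site trend. The sketch below is illustrative only; the column names and figures are assumptions.

```python
# Illustrative sketch: data and column names are invented. The idea is to surface
# the classification mix by site so definitional drift is visible before aggregation.
import pandas as pd

records = pd.DataFrame({
    "site":     ["A"] * 10 + ["B"] * 10,
    "severity": ["high"] * 1 + ["medium"] * 3 + ["low"] * 6    # Site A
              + ["high"] * 4 + ["medium"] * 4 + ["low"] * 2,   # Site B
})

# Share of records in each severity bucket, per site
mix = (records.groupby("site")["severity"]
              .value_counts(normalize=True)
              .unstack(fill_value=0)
              .round(2))
print(mix)
# If one site codes 40% of events "high" and another codes 10%, ask whether the work is
# really that different or whether "high" is being applied differently.
```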

Is your team spending more time preparing data or analyzing it? Track it for one reporting cycle. If the ratio is more than 60/40 in favor of preparation, the infrastructure is not doing its job. The goal is to flip that ratio so teams spend the majority of their time on interpretation and action, not cleanup. And even as AI tools rapidly accelerate data preparation, the question shifts from how long the pull takes to whether the data underneath is trustworthy enough for the answers to even mean anything.

When was the last time someone asked a new question? If the team only answers the same recurring questions month after month, it is likely because the cost of answering anything new is too high. That is a signal that the data environment is too rigid or too fragmented to support curiosity.

Does pulling the data make you nervous? If the person responsible for reporting feels a knot in their stomach every time they run an export because they are not confident in what is going to come back, that is a data trust problem. If they second-guess the numbers before presenting them, or spend more time mentally preparing caveats than actually analyzing the results, the foundation is not where it needs to be.


Where to Start

The instinct is usually to fix everything at once. Build the dashboard, clean the data, standardize the definitions, and automate the reports, all in one project. That approach almost always stalls because it tries to solve a systemic problem with a single initiative.

Pick the question that leadership keeps asking and the team keeps struggling to produce a defensible response to. Make it your proving ground and then trace the data supply chain for that specific question: Where does the data originate? How does it enter the system? What transformations happen along the way? Where does context get lost?

You'll reveal constraints as you follow the data from the source to the report instead of working backward from a visualization and hoping the numbers hold up.

Once you can see where the chain breaks down, fix the constraint closest to the source. If definitions are inconsistent, standardize them. If capture is delayed, redesign the workflow closer to the point of work. If context is missing, identify the upstream system or process and build the connection. If the data is solid but leadership still is not acting on it, redesign how the data is presented and who is presenting it.

The key is working from the source outward rather than from the report backward, because each layer you skip leaves another opportunity for the data to quietly degrade.


Want to understand where you are?

If you are not sure whether your constraint is in definitions, capture, context, or ownership, a quick maturity assessment can help surface the answer.

Syncra's EHS Digital Maturity Assessment takes about 10 to 15 minutes. It provides a readout on where the biggest gaps are across all five dimensions, including Data and Analytics.

Take the assessment here
