IEEE 754 Floating-Point Edge Cases in Industrial Data Pipelines: A Practical Guide [2026]

March 3, 2026 · 12 min read

If you've ever seen a temperature reading of 3.4028235 × 10³⁸ flash across your monitoring dashboard at 2 AM, you've met IEEE 754's ugly side. Floating-point representation is the lingua franca of analog process data in industrial automation — and it's riddled with traps that can silently corrupt your data pipeline if you don't handle them at the edge.

This guide covers the real-world edge cases that matter when reading float registers from PLCs over Modbus, EtherNet/IP, and other industrial protocols — and how to catch them before they poison your analytics, trigger false alarms, or crash your trending charts.

Why Floating-Point Matters More in Industrial IoT

In enterprise software, a floating-point rounding error means your bank balance is off by a fraction of a cent. In industrial IoT, a misinterpreted float register can mean:

A temperature sensor reading infinity instead of 450°F, triggering an emergency shutdown
An OEE calculation returning NaN, breaking every downstream dashboard
A pressure reading of -0.0 confusing threshold comparison logic
Two 16-bit registers assembled in the wrong word order, turning 72.5 PSI into roughly 2.4 × 10⁻⁴¹

These aren't theoretical problems. They happen on real factory floors, every day, because the gap between PLC register formats and cloud-native data types is wider than most engineers realize.

The Anatomy of a PLC Float

Most modern PLCs store floating-point values as IEEE 754 single-precision (32-bit) numbers. The 32 bits break down as:

┌─────┬──────────┬───────────────────────┐
│Sign │ Exponent │      Mantissa         │
│1 bit│  8 bits  │      23 bits          │
└─────┴──────────┴───────────────────────┘
 Bit 31  Bits 30-23      Bits 22-0

This gives you a range of roughly ±1.18 × 10⁻³⁸ to ±3.40 × 10³⁸, with about 7 decimal digits of precision. That's plenty for most process variables — but the encoding introduces special values and edge cases that PLC programmers rarely think about.
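As a rough illustration of the layout above, the sketch below splits a value into its sign, exponent, and mantissa fields with Python's standard `struct` module. The helper name `float32_fields` is made up for this example.

```python
import struct


def float32_fields(value: float) -> tuple[int, int, int]:
    """Round value to float32 and return (sign, biased exponent, mantissa)."""
    raw = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = raw >> 31                 # bit 31
    exponent = (raw >> 23) & 0xFF    # bits 30-23, biased by 127
    mantissa = raw & 0x7FFFFF        # bits 22-0
    return sign, exponent, mantissa


print(float32_fields(72.5))  # (0, 133, 1114112)
```

For 72.5 the biased exponent 133 means 2⁶, and the mantissa contributes 1 + 1114112 / 2²³ = 1.1328125, so the value is 1.1328125 × 64 = 72.5.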

The Five Dangerous Values

Pattern	Value	What Causes It
`0x7F800000`	+Infinity	Division by zero, sensor overflow
`0xFF800000`	-Infinity	Negative value divided by zero, negative overflow
`0x7FC00000`	Quiet NaN	Uninitialized register, invalid operation
`0x7FA00000`	Signaling NaN	Hardware fault flags in some PLCs
`0x00000000` / `0x80000000`	+0.0 / -0.0	Legitimate zero, but -0.0 can trip comparisons
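A minimal sketch of how these patterns could be recognized directly from the 32-bit raw value, before any float conversion. The function name `classify_raw` and its label strings are illustrative, not part of any library. Working on the integer is a deliberate choice here: some platforms may quietly turn a signaling NaN into a quiet one during float conversion, and the bit pattern is the only place the difference is guaranteed to survive.

```python
def classify_raw(raw: int) -> str:
    """Classify a 32-bit IEEE 754 single-precision bit pattern."""
    negative = bool((raw >> 31) & 1)
    exponent = (raw >> 23) & 0xFF
    mantissa = raw & 0x7FFFFF

    if exponent == 0xFF:
        if mantissa == 0:
            return "-inf" if negative else "+inf"
        # Bit 22 is the quiet bit: set for quiet NaN, clear for signaling NaN
        return "qnan" if mantissa & 0x400000 else "snan"
    if exponent == 0:
        if mantissa == 0:
            return "-0.0" if negative else "+0.0"
        return "subnormal"
    return "normal"


for pattern in (0x7F800000, 0xFF800000, 0x7FC00000, 0x7FA00000, 0x80000000):
    print(f"0x{pattern:08X} -> {classify_raw(pattern)}")
```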

Why PLCs Generate These Values

PLC ladder logic and structured text don't always guard against special float values. Common scenarios include:

Uninitialized registers: When a PLC program is downloaded but a tag hasn't been written to yet, many PLCs leave the register at 0x00000000 (zero) — but some leave it at 0xFFFFFFFF (NaN). There's no universal standard here.

Sensor faults: When an analog input card detects a broken wire or over-range condition, some PLCs write a sentinel value (often max positive float or NaN) to the associated tag. Others set a separate status bit and leave the value register frozen at the last good reading.

Division by zero: If your PLC program calculates a rate (e.g., throughput per hour) and the divisor drops to zero during a machine stop, you get infinity. Not every PLC programmer wraps division in a zero-check.

Scaling arithmetic: Converting raw 12-bit ADC counts (0–4095) to engineering units involves multiplication and offset. If the scaling coefficients are misconfigured, you can get results outside the normal range that are still technically valid IEEE 754 floats.

The Byte-Ordering Minefield

Here's where industrial protocols diverge from IT conventions in ways that cause the most data corruption.

Modbus Register Ordering

Modbus transmits data in 16-bit registers. A 32-bit float occupies two consecutive registers. The question is: which register holds the high word?

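The Modbus specification defines big-endian byte order within each 16-bit register, but it says nothing about how a 32-bit value is split across two registers. High word first is the common convention, yet many PLC vendors do it differently: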

Standard Modbus (Big-Endian / "ABCD"):
  Register N   = High word (bytes A, B)
  Register N+1 = Low word  (bytes C, D)

Word-Swapped ("CDAB"):
  Register N   = Low word  (bytes C, D)
  Register N+1 = High word (bytes A, B)

Byte-Swapped ("BADC"):
  Register N   = Byte-swapped high word (B, A)
  Register N+1 = Byte-swapped low word  (D, C)

Full Reverse (Little-Endian / "DCBA"):
  Register N   = Byte-swapped low word  (D, C)
  Register N+1 = Byte-swapped high word (B, A)

Real-world example: A process temperature of 72.5°F is 0x42910000 in IEEE 754. Here's what each device ordering puts in the two registers, and what a decoder that assumes ABCD makes of it:

Order	Register N	Register N+1	Value if Decoded as ABCD
ABCD	`0x4291`	`0x0000`	72.5 ✅
CDAB	`0x0000`	`0x4291`	2.388 × 10⁻⁴¹ ❌
BADC	`0x9142`	`0x0000`	-1.530 × 10⁻²⁸ ❌
DCBA	`0x0000`	`0x9142`	5.211 × 10⁻⁴¹ ❌

The only reliable way to determine byte ordering is to read a known value from the PLC — like a setpoint you can verify — and compare the decoded result against all four orderings.
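The known-value check described above can be sketched as follows. The helper names `decode_all_orders` and `matching_orders` and the tolerance default are assumptions made for this example. The two arguments are registers N and N+1 exactly as the device returned them.

```python
import math
import struct


def decode_all_orders(reg_n: int, reg_n1: int) -> dict[str, float]:
    """Decode two registers under each of the four common byte orders."""
    b0, b1 = reg_n >> 8, reg_n & 0xFF      # bytes of register N, in wire order
    b2, b3 = reg_n1 >> 8, reg_n1 & 0xFF    # bytes of register N+1
    layouts = {
        "ABCD": bytes([b0, b1, b2, b3]),
        "CDAB": bytes([b2, b3, b0, b1]),
        "BADC": bytes([b1, b0, b3, b2]),
        "DCBA": bytes([b3, b2, b1, b0]),
    }
    return {name: struct.unpack(">f", raw)[0] for name, raw in layouts.items()}


def matching_orders(reg_n: int, reg_n1: int, expected: float,
                    tolerance: float = 1e-3) -> list[str]:
    """Return every byte order whose decoded value is close to the known value."""
    return [name for name, value in decode_all_orders(reg_n, reg_n1).items()
            if math.isfinite(value) and abs(value - expected) <= tolerance]


# A device that word-swaps 72.5 returns 0x0000, 0x4291
print(matching_orders(0x0000, 0x4291, 72.5))  # ['CDAB']
```

Pick a reference value whose four bytes are all different where possible. A setpoint of 0.0 decodes identically under every ordering and tells you nothing.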

EtherNet/IP Tag Ordering

EtherNet/IP (CIP) is generally more predictable because it transmits structured data with typed access. When you read a REAL tag from an Allen-Bradley Micro800 or CompactLogix, the CIP layer handles byte ordering transparently. The value arrives in the host's native format through the client library.

However, watch out for array access. When reading a float array starting at a specific index, the start index and element count must match the PLC's memory layout exactly. Requesting tag_name[1] with elem_count=6 reads elements 1 through 6 — the zero-indexed first element is skipped. Getting this wrong doesn't produce an error; it silently gives you shifted values.

Practical Validation Strategies

Layer 1: Raw Register Validation

Before you even try to decode a float, validate the raw bytes:

import struct
import math

def validate_float_register(reg_n: int, reg_n1: int,
                            byte_order: str = "ABCD") -> tuple[float, str]:
    """
    Decode and validate a 32-bit float from two Modbus registers.

    reg_n and reg_n1 are registers N and N+1 exactly as read from the device.
    Returns (value, status) where status is 'ok', 'nan', 'inf', or 'denorm'.
    """
    # Reassemble the four bytes into canonical A, B, C, D order
    if byte_order == "ABCD":
        raw = struct.pack('>HH', reg_n, reg_n1)
    elif byte_order == "CDAB":
        raw = struct.pack('>HH', reg_n1, reg_n)
    elif byte_order == "BADC":
        raw = struct.pack('>HH',
                          ((reg_n & 0xFF) << 8) | (reg_n >> 8),
                          ((reg_n1 & 0xFF) << 8) | (reg_n1 >> 8))
    elif byte_order == "DCBA":
        # Reverse both the word order and the bytes within each word
        raw = struct.pack('<HH', reg_n1, reg_n)
    else:
        raise ValueError(f"Unknown byte order: {byte_order}")

    value = struct.unpack('>f', raw)[0]

    # Check special values
    if math.isnan(value):
        return value, "nan"
    if math.isinf(value):
        return value, "inf"

    # Check denormalized (subnormal) — often indicates garbage data
    raw_int = struct.unpack('>I', raw)[0]
    exponent = (raw_int >> 23) & 0xFF
    if exponent == 0 and (raw_int & 0x7FFFFF) != 0:
        return value, "denorm"

    return value, "ok"

Layer 2: Engineering-Range Clamping

Every process variable has a physically meaningful range. A mold temperature can't be -40,000°F. A flow rate can't be 10 billion GPM. Enforce these ranges at the edge:

RANGE_LIMITS = {
    "mold_temperature_f":   (-50.0, 900.0),
    "barrel_pressure_psi":  (0.0, 40000.0),
    "screw_rpm":            (0.0, 500.0),
    "coolant_flow_gpm":     (0.0, 200.0),
}

def clamp_to_range(tag_name: str, value: float) -> tuple[float, bool]:
    """Clamp a value to its engineering range. Returns (clamped_value, was_clamped)."""
    if tag_name not in RANGE_LIMITS:
        return value, False
    low, high = RANGE_LIMITS[tag_name]
    if value < low:
        return low, True
    if value > high:
        return high, True
    return value, False
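One caveat with a comparison-based clamp, shown as a sketch: every comparison against NaN is false, so a NaN falls through both branches unchanged, while an infinity is quietly clamped to the range limit and ends up looking like a real reading. Rejecting non-finite values before the range check avoids both. The helper name `check_range` and its status strings are hypothetical.

```python
import math

nan = float("nan")
print(nan < -50.0, nan > 900.0)  # False False


def check_range(value: float, low: float, high: float) -> str:
    """Classify a reading, rejecting NaN and infinity before comparing."""
    if not math.isfinite(value):
        return "non_finite"
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "ok"


print(check_range(nan, -50.0, 900.0))           # non_finite
print(check_range(float("inf"), -50.0, 900.0))  # non_finite
print(check_range(450.0, -50.0, 900.0))         # ok
```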

Layer 3: Rate-of-Change Filtering

A legitimate temperature can't jump from 200°F to 800°F in one polling cycle (typically 1–60 seconds). Rate-of-change filtering catches sensor glitches and transient read errors:

MAX_RATE_OF_CHANGE = {
    "mold_temperature_f":  50.0,    # Max °F per polling cycle
    "barrel_pressure_psi": 2000.0,  # Max PSI per cycle
    "screw_rpm":           100.0,   # Max RPM per cycle
}

def rate_check(tag_name: str, new_value: float,
               last_value: float) -> bool:
    """Returns True if the change rate is within acceptable limits."""
    if tag_name not in MAX_RATE_OF_CHANGE:
        return True
    max_delta = MAX_RATE_OF_CHANGE[tag_name]
    return abs(new_value - last_value) <= max_delta

The 32-Bit Float Reassembly Problem

When your edge gateway reads two 16-bit Modbus registers and needs to assemble them into a 32-bit float, the implementation must handle several non-obvious cases.

Two-Register Float Assembly

The most common approach reads two registers and combines them. But there's a critical subtlety: the function code determines how you interpret the raw words.

For holding registers (function code 3) and input registers (function code 4), each register is a 16-bit unsigned integer. To assemble a float:

Step 1: Read register N → uint16 word_high
Step 2: Read register N+1 → uint16 word_low
Step 3: Combine → uint32 raw = (word_high << 16) | word_low
Step 4: Reinterpret raw as IEEE 754 float

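But here's the trap: most Modbus libraries already convert each register from big-endian wire order into a host-order uint16 before handing it to you. Shifting and OR-ing those values, as in Step 3, is safe on any host. Copying the register array straight into a float (with memcpy or a pointer cast) is not, because the result then depends on the host's own endianness.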

A robust implementation uses the library's native float-extraction function (like modbus_get_float() in libmodbus) rather than manual assembly when possible. When you must assemble manually, test against a known value first.
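The four assembly steps listed earlier in this section can be sketched in Python as follows, assuming the device uses high-word-first (ABCD) ordering and that each register arrives as a plain 16-bit integer. The function name `assemble_float` is illustrative.

```python
import struct


def assemble_float(word_high: int, word_low: int) -> float:
    """Combine two 16-bit register values into an IEEE 754 float32."""
    # Step 3: combine into one 32-bit unsigned integer
    raw = ((word_high & 0xFFFF) << 16) | (word_low & 0xFFFF)
    # Step 4: reinterpret the same 32 bits as a big-endian float
    return struct.unpack(">f", raw.to_bytes(4, "big"))[0]


print(assemble_float(0x4291, 0x0000))  # 72.5
```

Because the shift operates on integer values rather than on memory, this version behaves the same on big-endian and little-endian hosts.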

Handling Mixed-Endian Devices

In real factories, you'll often have devices from multiple vendors on the same Modbus network — each with their own byte-ordering conventions. Your edge gateway must support per-device (or even per-register) byte-order configuration:

devices:
  - name: "Injection_Molding_Press_1"
    protocol: modbus-tcp
    address: "192.168.1.10"
    byte_order: ABCD
    tags:
      - name: barrel_temp_zone1
        register: 40001
        type: float32
        # Inherits device byte_order

  - name: "Chiller_Unit_3"
    protocol: modbus-tcp
    address: "192.168.1.20"
    byte_order: CDAB    # This vendor swaps words
    tags:
      - name: coolant_supply_temp
        register: 30000
        type: float32
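A minimal sketch of how a gateway might resolve the effective byte order for each tag from a configuration like this one. The configuration is mirrored as plain Python dictionaries to keep the example free of third-party parsers. The per-tag `byte_order` override key and the fallback to ABCD are assumptions for illustration; the YAML above only shows the device-level setting.

```python
VALID_ORDERS = {"ABCD", "CDAB", "BADC", "DCBA"}

DEVICES = [
    {"name": "Injection_Molding_Press_1", "byte_order": "ABCD",
     "tags": [{"name": "barrel_temp_zone1", "register": 40001, "type": "float32"}]},
    {"name": "Chiller_Unit_3", "byte_order": "CDAB",
     "tags": [{"name": "coolant_supply_temp", "register": 30000, "type": "float32"}]},
]


def resolve_byte_orders(devices: list[dict]) -> dict[tuple[str, str], str]:
    """Map (device name, tag name) to the byte order that applies to that tag."""
    resolved = {}
    for device in devices:
        device_order = device.get("byte_order", "ABCD")  # assumed default
        for tag in device["tags"]:
            order = tag.get("byte_order", device_order)  # hypothetical per-tag override
            if order not in VALID_ORDERS:
                raise ValueError(f"{device['name']}/{tag['name']}: bad byte order {order!r}")
            resolved[(device["name"], tag["name"])] = order
    return resolved


for key, order in resolve_byte_orders(DEVICES).items():
    print(key, order)
```

Validating the setting at load time means a typo in the configuration fails at startup instead of producing garbage values at runtime.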

Change Detection with Floating-Point Values

One of the most powerful bandwidth optimizations in IIoT edge gateways is change-of-value (COV) detection — only transmitting a value when it actually changes. But floating-point comparison is inherently tricky.

The Naive Approach (Broken)

// DON'T DO THIS
if (new_value != old_value) {
    send(new_value);
}

This fails because:

Sensor noise flips the low-order bits of the value, so nearly every sample compares unequal to the last and gets sent
NaN ≠ NaN by IEEE 754 rules, so you'd send NaN every single cycle
-0.0 == +0.0 by IEEE 754, so you'd miss sign changes that might matter
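The NaN and signed-zero behaviors in this list are easy to confirm. The short Python check below is only a demonstration of the IEEE 754 comparison rules, with `raw_bits` as an illustrative helper name.

```python
import math
import struct


def raw_bits(value: float) -> int:
    """Return the float32 bit pattern of a value as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


nan = float("nan")
print(nan != nan)                   # True: NaN never equals itself
print(-0.0 == 0.0)                  # True: the sign of zero is invisible to ==
print(math.copysign(1.0, -0.0))     # -1.0: the sign bit is still there
print(hex(raw_bits(-0.0)), hex(raw_bits(0.0)))  # 0x80000000 0x0
```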

The Practical Approach

Compare at the raw register level (integer comparison), not the float level. If the uint32 representation of two registers hasn't changed, the float is identical bit-for-bit — no ambiguity:

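// Cast before shifting: a uint16_t promotes to int, and shifting into the sign bit is undefined
uint32_t new_raw = ((uint32_t)word_high << 16) | word_low;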
uint32_t old_raw = stored_raw_value;

if (new_raw != old_raw) {
    // Value actually changed — decode and transmit
    stored_raw_value = new_raw;
    transmit(decode_float(new_raw));
}

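This approach is used in production edge gateways and sidesteps the NaN and signed-zero pitfalls: a NaN whose bit pattern is unchanged is not resent, and a flip between +0.0 and -0.0 is detected. It does not filter sensor noise, because any change in the low-order bits still counts as a change, so noisy analog tags still need a deadband on the decoded value. The integer comparison is also cheap and needs no special-case NaN handling.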
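For gateways written in Python, the same raw-level comparison might look like the sketch below. The class name `RawChangeDetector` and its per-tag dictionary are assumptions for illustration, not a description of any particular product.

```python
class RawChangeDetector:
    """Track the last raw 32-bit pattern per tag and report bit-level changes."""

    def __init__(self) -> None:
        self._last_raw: dict[str, int] = {}

    def changed(self, tag: str, word_high: int, word_low: int) -> bool:
        raw = ((word_high & 0xFFFF) << 16) | (word_low & 0xFFFF)
        if self._last_raw.get(tag) == raw:
            return False          # identical bits, nothing to transmit
        self._last_raw[tag] = raw
        return True


detector = RawChangeDetector()
print(detector.changed("oee", 0x7FC0, 0x0000))  # True: first reading
print(detector.changed("oee", 0x7FC0, 0x0000))  # False: same NaN bits, not resent
print(detector.changed("oee", 0x0000, 0x0000))  # True: now +0.0
print(detector.changed("oee", 0x8000, 0x0000))  # True: +0.0 to -0.0 is detected
```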

Batching and Precision Preservation

When batching multiple tag values for transmission, format choice matters for float precision.

JSON Serialization Pitfalls

JSON doesn't distinguish between integers and floats, and most JSON serializers will round-trip a float through a decimal representation, potentially losing precision:

Original float: 72.5 (exact in IEEE 754: 0x42910000)
JSON: "72.5" → Deserialized: 72.5 ✅

Original float: 72.3 (NOT exact: 0x4290999A)
JSON: "72.30000305175781" → Deserialized: 72.30000305175781 ✅
Or:   "72.3" → Deserialized as a 64-bit double: 72.3, not 72.30000305175781 (different!)
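The round trip can be reproduced with Python's standard `json` and `struct` modules. This is a sketch of the effect rather than of any specific serializer; `to_float32` is an illustrative helper that rounds a Python double to the nearest 32-bit float.

```python
import json
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest float32 value."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


plc_value = to_float32(72.3)          # what the PLC actually stores
full_text = json.dumps(plc_value)     # serializer that emits full precision
short_text = "72.3"                   # serializer that rounds for display

print(full_text)                                        # 72.30000305175781
print(json.loads(full_text) == plc_value)               # True
print(json.loads(short_text) == plc_value)              # False
print(to_float32(json.loads(short_text)) == plc_value)  # True
```

The last line shows one way out when JSON is unavoidable: if the consumer narrows the parsed number back to float32, the original bits are recovered. The mismatch appears when the value is kept and compared as a double.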

For telemetry where exact bit-level reproduction matters (e.g., comparing dashboard values against PLC HMI values), use binary encoding. A well-designed binary telemetry format encodes the tag ID, status, value type, and raw bytes — preserving perfect fidelity with less bandwidth.

A typical binary batch frame looks like:

┌──────────┬────────────┬──────────┬──────────┬────────────────┐
│ Batch    │ Group      │ Device   │ Serial   │ Values         │
│ Header   │ Timestamp  │ Type     │ Number   │ Array          │
│ (1 byte) │ (4 bytes)  │ (2 bytes)│ (4 bytes)│ (variable)     │
└──────────┴────────────┴──────────┴──────────┴────────────────┘

Each value entry:
┌──────────┬────────┬──────────┬──────────┬────────────────┐
│ Tag ID   │ Status │ Count    │ Elem     │ Raw Values     │
│ (2 bytes)│(1 byte)│ (1 byte) │ Size     │ (count × size) │
│          │        │          │ (1 byte) │                │
└──────────┴────────┴──────────┴──────────┴────────────────┘

This format reduces a typical 100-tag batch from ~5 KB (JSON) to ~600 bytes (binary) — an 8× bandwidth reduction with zero precision loss.
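A rough sketch of packing a frame with the field widths shown in the diagrams, using Python's `struct` module. The diagrams do not specify byte order, header contents, or status codes, so little-endian fields, the header value, and status 0 meaning "good" are all assumptions made for this example, as are the function names.

```python
import struct


def pack_value_entry(tag_id: int, status: int, values: list[float]) -> bytes:
    """Tag ID (2) + status (1) + count (1) + element size (1) + raw float32 values."""
    raw_values = b"".join(struct.pack("<f", v) for v in values)
    return struct.pack("<HBBB", tag_id, status, len(values), 4) + raw_values


def pack_batch(header: int, timestamp: int, device_type: int, serial: int,
               entries: list[tuple[int, int, list[float]]]) -> bytes:
    """Batch header (1) + timestamp (4) + device type (2) + serial (4) + entries."""
    frame = struct.pack("<BIHI", header, timestamp, device_type, serial)
    for tag_id, status, values in entries:
        frame += pack_value_entry(tag_id, status, values)
    return frame


frame = pack_batch(0x01, 1772500000, 7, 12345,
                   [(1, 0, [72.5]), (2, 0, [72.3])])
print(len(frame))                             # 29
print(struct.unpack("<f", frame[16:20])[0])   # 72.5
```

The `<` prefix in each format string disables alignment padding, so the packed sizes match the diagram exactly: 11 bytes of batch header plus 9 bytes per single-float entry.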

Edge Gateway Best Practices

Based on years of deploying edge gateways in plastics, metals, and packaging manufacturing, here are the practices that prevent float-related data quality issues:

1. Validate at the Source

Don't wait until data reaches the cloud to check for NaN and infinity. By then, you've wasted bandwidth transmitting garbage and may have corrupted aggregations. Validate immediately after the register read.

2. Separate Value and Status

Every tag read should produce two outputs: the decoded value AND a status code. Status codes distinguish between "value is zero because the sensor reads zero" and "value is zero because the read failed." Most Modbus libraries return error codes — propagate them alongside the values.
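One possible shape for a value-plus-status reading, as a sketch. The names `ReadStatus`, `TagReading`, and `make_reading`, and the particular set of status codes, are hypothetical.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReadStatus(Enum):
    OK = "ok"
    READ_FAILED = "read_failed"
    NAN = "nan"
    INF = "inf"


@dataclass(frozen=True)
class TagReading:
    tag: str
    value: Optional[float]   # None whenever the status is not OK
    status: ReadStatus


def make_reading(tag: str, value: Optional[float], read_ok: bool) -> TagReading:
    """Pair a decoded value with a status so zero and 'no data' stay distinct."""
    if not read_ok or value is None:
        return TagReading(tag, None, ReadStatus.READ_FAILED)
    if math.isnan(value):
        return TagReading(tag, None, ReadStatus.NAN)
    if math.isinf(value):
        return TagReading(tag, None, ReadStatus.INF)
    return TagReading(tag, value, ReadStatus.OK)


print(make_reading("coolant_flow_gpm", 0.0, read_ok=True))
print(make_reading("coolant_flow_gpm", 0.0, read_ok=False))
```

Both calls pass 0.0, but only the first produces a reading a dashboard should plot as zero flow.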

3. Configure Byte Order Per Device

Don't hardcode byte ordering. Every industrial device you connect might have different conventions. Your tag configuration should support per-device or per-tag byte-order specification.

4. Use Binary Encoding for Bandwidth-Constrained Links

If your edge gateway communicates over cellular (4G/5G) or satellite, binary encoding pays for itself immediately. The bandwidth savings compound with polling frequency: at the per-batch sizes quoted earlier (about 5 KB of JSON or 600 bytes of binary per 100 tags), a gateway sending 200 tags every second generates roughly 26 GB per 30-day month in JSON but only about 3 GB in binary.

5. Hourly Full Reads

Even with change-of-value filtering, perform a full read of all tags at least once per hour. This catches situations where a value changed but the change was lost due to a transient error, and ensures your cloud platform always has a recent snapshot of every tag — even slowly-changing ones.
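A sketch of combining change-of-value filtering with a periodic refresh. Here the refresh is tracked per tag rather than as one synchronized full read, which is a simplification; the class name `RefreshingCov` and the injected `now` timestamp (seconds) are assumptions for illustration.

```python
FULL_READ_INTERVAL_S = 3600.0  # one hour, per the guideline above


class RefreshingCov:
    """Send a tag when its raw value changes or when its last send is too old."""

    def __init__(self, interval_s: float = FULL_READ_INTERVAL_S) -> None:
        self._interval_s = interval_s
        self._last: dict[str, tuple[int, float]] = {}  # tag -> (raw, sent_at)

    def should_send(self, tag: str, raw: int, now: float) -> bool:
        previous = self._last.get(tag)
        if (previous is None
                or previous[0] != raw
                or now - previous[1] >= self._interval_s):
            self._last[tag] = (raw, now)
            return True
        return False


cov = RefreshingCov()
print(cov.should_send("mold_temp", 0x42910000, now=0.0))     # True: first reading
print(cov.should_send("mold_temp", 0x42910000, now=10.0))    # False: unchanged
print(cov.should_send("mold_temp", 0x42910000, now=3600.0))  # True: hourly refresh
```

Passing the timestamp in, instead of calling the clock inside the method, keeps the logic easy to test.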

How machineCDN Handles Float Data

machineCDN's edge infrastructure handles these float challenges at the protocol driver level. The platform supports automatic byte-order detection during device onboarding, validates every register read against configurable engineering ranges, and uses binary telemetry encoding to minimize bandwidth while preserving perfect float fidelity.

For plants running mixed-vendor equipment — which is nearly every plant — machineCDN normalizes all float data into a consistent format before it reaches your dashboards, ensuring that a temperature from a Modbus chiller and a temperature from an EtherNet/IP blender are directly comparable.

Key Takeaways

IEEE 754 special values (NaN, infinity, denormals) appear regularly in PLC data — don't assume every register read produces a valid number
Byte ordering varies by vendor, not by protocol — always verify against a known value
Compare at the raw register level for change detection — never use float equality
Binary encoding preserves precision and saves 8× bandwidth over JSON for telemetry
Validate at the edge, not in the cloud — garbage data should never leave the factory

Getting floating-point handling right at the edge gateway is one of those unglamorous engineering fundamentals that separates reliable IIoT platforms from brittle ones. Your trending charts, alarm logic, and analytics all depend on it.

Want to see how machineCDN handles multi-protocol float data normalization in production? Request a demo to explore the platform with real factory data.
