Binary Telemetry Encoding for IIoT: Why JSON Is Killing Your Bandwidth [2026]

If you're sending PLC tag values as JSON from edge gateways to the cloud, you're wasting 80–90% of your bandwidth. On a cellular-connected factory floor with dozens of machines, that's the difference between a $50/month data plan and a $500/month one — and the difference between sub-second telemetry and multi-second lag.

This guide breaks down binary telemetry encoding: how to pack industrial data efficiently at the edge, preserve type fidelity across the wire, and design batch grouping strategies that survive unreliable networks.

The JSON Problem in Industrial Telemetry

Consider a simple temperature reading from a Modbus-connected chiller:

{
"timestamp": 1709251200,
"device_type": 1018,
"serial": "0x00300039",
"tags": [
{"id": 1, "name": "Tank Temperature", "value": 42.5, "status": "ok"},
{"id": 2, "name": "CQT 1 Approach Temp", "value": 18.3, "status": "ok"},
{"id": 3, "name": "CQT 1 Chill In Temp", "value": 22.1, "status": "ok"}
]
}

That's ~320 bytes for three sensor readings. The actual data, three 16-bit register values (for example, temperatures stored as scaled integers), is 6 bytes. You're paying for 314 bytes of structural overhead on every single transmission.

On a typical plastics manufacturing floor running 12 machines with 30–40 tags each at 60-second intervals, that's roughly 14 MB/day in pure JSON overhead. Over cellular, that adds up fast.

The Real Cost Breakdown

| Component | JSON Size | Binary Size | Overhead |
|---|---|---|---|
| Field names ("timestamp", "device_type", etc.) | ~120 bytes | 0 bytes | |
| Brackets, quotes, colons | ~80 bytes | 0 bytes | |
| Numeric values as ASCII strings | ~40 bytes | 14 bytes | 2.8x |
| Status strings | ~30 bytes | 3 bytes | 10x |
| Total for 3 tags | ~320 bytes | ~25 bytes | 12.8x |

Binary encoding eliminates all structural overhead. Field names, quotes, brackets, and colons vanish. What remains is pure data with minimal framing.
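As a rough check of that ratio, the sketch below serializes the sample reading both ways. The compact layout used here (timestamp, device type, and serial once, then tag ID, status, and an int16 value scaled by 10 per tag) is an assumption made for illustration; it is not the frame format defined in the next section.

```python
import json
import struct

reading = {
    "timestamp": 1709251200,
    "device_type": 1018,
    "serial": "0x00300039",
    "tags": [
        {"id": 1, "name": "Tank Temperature", "value": 42.5, "status": "ok"},
        {"id": 2, "name": "CQT 1 Approach Temp", "value": 18.3, "status": "ok"},
        {"id": 3, "name": "CQT 1 Chill In Temp", "value": 22.1, "status": "ok"},
    ],
}

json_bytes = json.dumps(reading).encode("utf-8")

# Hypothetical compact layout: shared header once, then 5 bytes per tag.
binary = struct.pack(">IHI", reading["timestamp"], reading["device_type"],
                     int(reading["serial"], 16))
for tag in reading["tags"]:
    status = 0 if tag["status"] == "ok" else 1
    # Values are scaled by 10 so one decimal place fits in an int16 (assumption).
    binary += struct.pack(">HBh", tag["id"], status, round(tag["value"] * 10))

print(f"JSON: {len(json_bytes)} bytes, binary: {len(binary)} bytes")
```

The exact JSON size depends on whitespace and key naming, but the binary side stays fixed at 10 bytes of header plus 5 bytes per tag under these assumptions.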

Designing a Binary Wire Format

A good binary telemetry format for IIoT needs four properties:

  1. Self-describing enough to decode without external schema
  2. Compact enough to fit cellular/LoRa constraints
  3. Groupable so multiple readings share context (timestamps, device IDs)
  4. Error-tolerant so partial corruption doesn't destroy entire payloads

The Anatomy of a Binary Telemetry Frame

Here's a proven structure used in production edge systems:

┌─────────────────────────────────────────────────┐
│ Command Byte (1 byte)                           │
│ Number of Groups (4 bytes, uint32)              │
├─────────────────────────────────────────────────┤
│ ┌─── Group 1 ─────────────────────────────────┐ │
│ │ Timestamp (4 bytes, uint32)                 │ │
│ │ Device Type (2 bytes, uint16)               │ │
│ │ Production Year (1 byte)                    │ │
│ │ Production Month (1 byte)                   │ │
│ │ Production Number (2 bytes, uint16)         │ │
│ │ Number of Values (4 bytes, uint32)          │ │
│ │ ┌─── Value 1 ─────────────────────────────┐ │ │
│ │ │ Tag ID (2 bytes, uint16)                │ │ │
│ │ │ Status (1 byte)                         │ │ │
│ │ │ Array Size (S) (1 byte)                 │ │ │
│ │ │ Element Size (N) (1 byte)               │ │ │
│ │ │ Data (S×N bytes)                        │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ │ ┌─── Value 2 ─────────────────────────────┐ │ │
│ │ │ ...                                     │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────┘ │
│ ┌─── Group 2 ─────────────────────────────────┐ │
│ │ ...                                         │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘

Why Groups Matter

The key insight is temporal grouping. Multiple tag values read at the same moment share a single timestamp and device identifier. Instead of repeating context with every value, you amortize it across an entire group.

In a typical chiller with 65+ temperature and pressure tags all polled in the same cycle, this saves ~10 bytes per tag — over 650 bytes per polling cycle.

Status Bytes Save Bandwidth Too

Notice the status byte (0x00 = OK, anything else = error code). When a tag read fails, you transmit just the 3-byte header (ID + status) instead of the full value. Failed reads are common on noisy factory floors — serial bus contention, PLC CPU overload, intermittent wiring. A binary format handles this gracefully without the awkward "value": null, "error": "timeout" constructs of JSON.
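The frame layout, temporal grouping, and short failed-read records described above can be sketched with Python's `struct` module. The function names and the sample device values are illustrative assumptions, and the `0xF7` command byte is borrowed from the worked example later in this guide.

```python
import struct

CMD_TAG_VALUES = 0xF7  # command byte, taken from the worked example later in this guide

def encode_value(tag_id, status, elements=(), fmt="h"):
    """One value record. `fmt` is a struct type code such as "h" (int16) or "f" (float32)."""
    head = struct.pack(">HB", tag_id, status)
    if status != 0:
        return head  # failed read: tag ID + status only (3 bytes)
    body = b"".join(struct.pack(">" + fmt, e) for e in elements)
    return head + struct.pack(">BB", len(elements), struct.calcsize(">" + fmt)) + body

def encode_group(timestamp, device_type, year, month, number, values):
    """Shared context is written once, followed by every value read in that cycle."""
    header = struct.pack(">IHBBHI", timestamp, device_type, year, month, number, len(values))
    return header + b"".join(values)

def encode_frame(groups):
    return struct.pack(">BI", CMD_TAG_VALUES, len(groups)) + b"".join(groups)

# 65 int16 tags polled in one cycle: one shared group vs. one group per tag.
values = [encode_value(tag_id, 0, [tag_id]) for tag_id in range(1, 66)]
grouped = encode_frame([encode_group(1709251200, 1018, 0, 0, 12345, values)])
ungrouped = encode_frame([encode_group(1709251200, 1018, 0, 0, 12345, [v]) for v in values])
print(len(grouped), len(ungrouped))
```

In this sketch every additional group repeats a 14-byte header (10 bytes of timestamp and device context plus the 4-byte value count), which is where the grouping savings come from.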

Data Type Packing: Getting the Bytes Right

Industrial data types map directly to fixed-width binary representations. Here's the canonical mapping used in most edge telemetry systems:

Type Encoding Table

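| PLC Type | Wire Size | Byte Order | Example Value | Packed Bytes |
|---|---|---|---|---|
| BOOL | 1 byte | N/A | true | 01 |
| INT8 | 1 byte | N/A | 55 | 37 |
| UINT8 | 1 byte | N/A | 157 | 9D |
| INT16 | 2 bytes | Big-endian | -55 | FF C9 |
| UINT16 | 2 bytes | Big-endian | 32768 | 80 00 |
| INT32 | 4 bytes | Big-endian | -55 | FF FF FF C9 |
| UINT32 | 4 bytes | Big-endian | 4294967281 | FF FF FF F1 |
| FLOAT32 | 4 bytes | IEEE 754 BE | 1.55 | 3F C6 66 66 |
| FLOAT32 | 4 bytes | IEEE 754 BE | -1.55 | BF C6 66 66 |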
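The rows of this table can be reproduced with Python's standard `struct` module, where the `>` prefix selects big-endian packing without padding. The mapping from PLC type names to `struct` format codes below is a sketch for illustration, not part of any PLC vendor's API.

```python
import struct

# (PLC type, struct format, example value); ">" means big-endian, no padding
EXAMPLES = [
    ("BOOL",    ">?", True),
    ("INT8",    ">b", 55),
    ("UINT8",   ">B", 157),
    ("INT16",   ">h", -55),
    ("UINT16",  ">H", 32768),
    ("INT32",   ">i", -55),
    ("UINT32",  ">I", 4294967281),
    ("FLOAT32", ">f", 1.55),
    ("FLOAT32", ">f", -1.55),
]

def pack_hex(fmt, value):
    """Pack one value and return its bytes as spaced uppercase hex."""
    return struct.pack(fmt, value).hex(" ").upper()

for name, fmt, value in EXAMPLES:
    print(f"{name:8} {value!r:>12}  {pack_hex(fmt, value)}")
```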

Byte Ordering: The Silent Data Corruptor

Byte ordering mismatches are the #1 cause of garbage data in IIoT deployments. Here's why:

  • Modbus registers are big-endian (MSB first) by specification
  • EtherNet/IP (CIP) uses little-endian for REAL/DINT values
  • Most ARM-based edge gateways are little-endian natively
  • x86 cloud servers are little-endian

When you read a float from a pair of Modbus holding registers, the bytes arrive big-endian. If your edge gateway is little-endian and you memcpy those bytes into a float without swapping, you get nonsense. A temperature of 42.5°C (bytes 42 2A 00 00) becomes roughly 1.5e-41.

Best practice: Standardize on big-endian (network byte order) for the wire format, regardless of the source protocol. Convert at the edge before packing.
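A short sketch makes the failure mode concrete: the same four bytes are decoded once with the correct byte order and once with the wrong one. The `registers_to_float` helper is hypothetical, and it assumes the high word arrives first; some devices swap the two registers, so check the device manual.

```python
import struct

wire = struct.pack(">f", 42.5)          # bytes as they arrive on the wire, MSB first
right = struct.unpack(">f", wire)[0]    # decoded as big-endian
wrong = struct.unpack("<f", wire)[0]    # misread as little-endian: a tiny denormal

def registers_to_float(high_word, low_word):
    """Combine two 16-bit Modbus registers into a float32.

    Assumes the high word comes first; word order varies between devices.
    """
    return struct.unpack(">f", struct.pack(">HH", high_word, low_word))[0]

print(wire.hex(" "), right, wrong)
print(registers_to_float(0x422A, 0x0000))
```

Using explicit `>` or `<` format prefixes everywhere, rather than native order, keeps the conversion independent of the gateway's CPU.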

Array Values: When One Tag Returns Many

Some PLC tags return arrays — a recipe with 8 ingredient percentages, or a batch of 4 temperature zones. The binary format handles this with the array_size and element_size fields:

Tag ID:       0x0023 (Recipe Values)
Status:       0x00 (OK)
Array Size:   8
Element Size: 4 (float32)
Data:         32 bytes (8 × 4-byte floats)

The decoder knows to read exactly S × N bytes of payload. No delimiters, no parsing, no ambiguity.
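A decoder for one value record might look like the sketch below. Element size alone does not distinguish an int32 from a float32, so this version returns raw element bytes and assumes the caller knows each tag's type from its configuration. The recipe percentages are made-up sample values.

```python
import struct

def decode_value(buf, offset=0):
    """Decode one value record starting at `offset`.

    Returns (tag_id, status, raw_elements, next_offset).
    """
    tag_id, status = struct.unpack_from(">HB", buf, offset)
    offset += 3
    if status != 0:
        return tag_id, status, [], offset  # failed read carries no data
    count, size = struct.unpack_from(">BB", buf, offset)
    offset += 2
    elements = [buf[offset + i * size: offset + (i + 1) * size] for i in range(count)]
    return tag_id, status, elements, offset + count * size

# Recipe tag 0x0023 with eight float32 percentages (sample values are made up).
recipe = [12.5, 12.5, 25.0, 10.0, 10.0, 10.0, 15.0, 5.0]
record = struct.pack(">HBBB", 0x0023, 0, len(recipe), 4) + struct.pack(">8f", *recipe)

tag_id, status, raw, end = decode_value(record)
values = [struct.unpack(">f", r)[0] for r in raw]
print(hex(tag_id), status, values, end)
```

Because `decode_value` returns the next offset, a caller can walk a whole group by feeding that offset back in until the declared number of values has been read.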

Batch Collection Strategies

Raw tag values don't ship individually — they accumulate in batches. Two finalization triggers work together:

Size-Based Finalization

Set a maximum batch buffer size (e.g., 4,000 bytes). When accumulated data approaches this limit, seal the batch and queue it for transmission. This prevents any single batch from exceeding the MQTT payload limit or the available RAM on constrained devices.

Time-Based Finalization

Set a maximum collection window (e.g., 60 seconds). Even if the buffer isn't full, finalize and send. This guarantees a maximum data latency — critical for alarm values where you can't wait minutes for the buffer to fill.

The Hybrid Approach

In practice, both triggers run simultaneously:

while collecting:
    read next tag value

    if group timestamp changed:
        close current group
        start new group

    add value to current group

    if batch_size >= MAX_SIZE:
        finalize and transmit

    if elapsed_time >= MAX_TIMEOUT:
        finalize and transmit

A well-tuned system uses a 4 KB buffer with a 60-second timeout. During normal operation, the timeout triggers first (30–40 tags generate ~800 bytes per cycle). During high-frequency events (1-second alarm monitoring), the size limit triggers first, preventing memory exhaustion.
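The two triggers can be sketched as a small accumulator class. `Batcher` and its method names are hypothetical, group handling is left out for brevity, and the clock is injectable so the time trigger can be exercised without waiting. This version seals the batch just before a record would push it past the limit, which is one way to read "approaches this limit".

```python
import time

class Batcher:
    """Accumulates encoded records and seals the batch on size or age (hypothetical helper)."""

    def __init__(self, max_size=4000, max_age=60.0, clock=time.monotonic):
        self.max_size = max_size
        self.max_age = max_age
        self.clock = clock
        self.buffer = bytearray()
        self.started = None
        self.sealed = []  # finalized batches waiting for transmission

    def add(self, record: bytes):
        # Size trigger: seal first if this record would exceed the limit.
        if self.buffer and len(self.buffer) + len(record) > self.max_size:
            self.finalize()
        if not self.buffer:
            self.started = self.clock()
        self.buffer += record
        self.poll()

    def poll(self):
        """Time trigger: call periodically so it fires even when no data arrives."""
        if self.buffer and self.clock() - self.started >= self.max_age:
            self.finalize()

    def finalize(self):
        if self.buffer:
            self.sealed.append(bytes(self.buffer))
            self.buffer = bytearray()
            self.started = None

# Usage with a fake clock so the example runs instantly.
now = [0.0]
batcher = Batcher(max_size=20, max_age=60.0, clock=lambda: now[0])
batcher.add(b"x" * 15)
batcher.add(b"y" * 10)   # would exceed 20 bytes, so the first batch is sealed
now[0] = 61.0
batcher.poll()           # age limit reached, so the second batch is sealed
print([len(b) for b in batcher.sealed])
```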

Change-Based Filtering: Don't Send What Hasn't Changed

Here's where binary encoding gets truly efficient. Many industrial values are stable for long periods — a setpoint stays at 450°F for an entire 8-hour shift, a pump status remains "running" for days.

Compare-and-Send Logic

For each tag, maintain a last_values cache. On each read cycle:

  1. Read the current value from the PLC
  2. Compare with the cached value
  3. Only add to the batch if the value changed (or if this is the first read)

Tags that change frequently (temperatures, pressures) still transmit every cycle. Tags that rarely change (setpoints, status flags) might transmit once per hour. The bandwidth savings compound dramatically:

| Scenario | Without Filtering | With Filtering | Savings |
|---|---|---|---|
| 40 tags, 60 s interval, 8 hr shift | 19,200 transmissions | ~4,800 transmissions | 75% |
| 65 tags, 60 s interval, 24 hr day | 93,600 transmissions | ~15,000 transmissions | 84% |
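The compare-and-send steps above reduce to a dictionary lookup per tag. The class and method names in this sketch are illustrative; the only name taken from the text is the `last_values` cache.

```python
class ChangeFilter:
    """Per-tag cache that suppresses unchanged values."""

    def __init__(self):
        self.last_values = {}  # tag_id -> last value that was sent

    def should_send(self, tag_id, value):
        """True on the first read of a tag or when its value differs from the cache."""
        if tag_id in self.last_values and self.last_values[tag_id] == value:
            return False
        self.last_values[tag_id] = value
        return True

# A setpoint that holds at 450 for five cycles, then changes once.
flt = ChangeFilter()
readings = [450, 450, 450, 450, 450, 455]
sent = [v for v in readings if flt.should_send(tag_id=12, value=v)]
print(sent)
```

Exact equality works for integers and booleans. For noisy analog values a deadband threshold is a common refinement, though it is outside what this guide describes.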

Critical Tags: Skip the Compare

Some tags must transmit every cycle regardless — delivery temperatures during active process control, flow values during filling operations. Mark these as compare: false in your tag configuration. They always transmit, ensuring the cloud has a continuous time series for trending and anomaly detection.

Event-Driven Tags: Bypass Batching Entirely

Pump status changes, heater activation, vent opening — these are operational events that demand immediate delivery. Configure these tags with two flags:

  • compare: true (only transmit on change)
  • do_not_batch: true (send immediately, bypass the batch buffer)

The result: a pump going from OFF to ON at 14:32:07.342 leaves the gateway as soon as the change is detected, not up to 60 seconds later in the next batch. For alarm correlation, this timing precision is essential.
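One way to combine the `compare` and `do_not_batch` flags is a single routing function called for every fresh reading. The `TagConfig` dataclass, the `route` function, and its return strings are assumptions made for this sketch; only the two flag names come from the text.

```python
from dataclasses import dataclass

@dataclass
class TagConfig:
    tag_id: int
    compare: bool = True         # False: transmit every cycle regardless of change
    do_not_batch: bool = False   # True: send immediately, bypassing the batch

_MISSING = object()  # marks a tag that has never been read

def route(tag, value, last_values, batch, send_now):
    """Apply the compare and do_not_batch flags to one fresh reading."""
    previous = last_values.get(tag.tag_id, _MISSING)
    last_values[tag.tag_id] = value
    if tag.compare and previous is not _MISSING and previous == value:
        return "dropped"            # unchanged, nothing to transmit
    if tag.do_not_batch:
        send_now(tag.tag_id, value)
        return "sent"               # event-driven: delivered right away
    batch.append((tag.tag_id, value))
    return "batched"

pump = TagConfig(tag_id=40, compare=True, do_not_batch=True)
delivery_temp = TagConfig(tag_id=1, compare=False)

last_values, batch, sent = {}, [], []
send_now = lambda tag_id, value: sent.append((tag_id, value))

results = [
    route(pump, 0, last_values, batch, send_now),
    route(pump, 0, last_values, batch, send_now),
    route(pump, 1, last_values, batch, send_now),
    route(delivery_temp, 42, last_values, batch, send_now),
    route(delivery_temp, 42, last_values, batch, send_now),
]
print(results)
```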

Calculated Tags: Edge-Side Bit Extraction

Many PLCs pack multiple boolean flags into a single 16-bit or 32-bit register. A status word might encode:

  • Bit 0: Pump running
  • Bit 1: Heater active
  • Bit 2: Low flow alarm
  • Bit 3: High temp alarm
  • Bits 4–7: Operating mode (0–15)

Rather than transmitting the raw 16-bit value and parsing it in the cloud, efficient edge systems extract individual bits at the edge and transmit them as separate boolean tags.

How It Works

Define a "calculated tag" that references a source tag with a bit mask and shift count:

Source tag: Status Word (uint16, ID: 100)
Calculated tag: Pump Running (bool, ID: 200)
    → shift: 0, mask: 0x01
Calculated tag: Heater Active (bool, ID: 201)
    → shift: 1, mask: 0x01
Calculated tag: Low Flow Alarm (bool, ID: 202)
    → shift: 2, mask: 0x01

The edge device reads the status word once, then derives multiple boolean tags without additional PLC reads. Each calculated tag inherits the source tag's polling interval and compare behavior — but the comparison happens on the derived boolean value, not the raw word.

This is especially valuable for Modbus devices where each register read consumes bus time. Reading one register and extracting 8 booleans takes one bus transaction instead of eight, so it can be up to 8x faster than reading 8 coils individually.
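The shift-and-mask definition translates almost directly into code. In this sketch the tag IDs 203 and 204 are assumed, since the text only assigns IDs 200 to 202, and the multi-bit operating mode uses the same mechanism with a wider mask.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CalculatedTag:
    tag_id: int
    name: str
    shift: int
    mask: int

    def extract(self, word: int) -> int:
        """Shift the source word right, then mask off the bits of interest."""
        return (word >> self.shift) & self.mask

STATUS_WORD_TAGS = [
    CalculatedTag(200, "Pump Running", 0, 0x01),
    CalculatedTag(201, "Heater Active", 1, 0x01),
    CalculatedTag(202, "Low Flow Alarm", 2, 0x01),
    CalculatedTag(203, "High Temp Alarm", 3, 0x01),  # ID 203 is assumed
    CalculatedTag(204, "Operating Mode", 4, 0x0F),   # ID 204 is assumed
]

def derive(word):
    """Derive every calculated tag from a single read of the status word."""
    return {tag.name: tag.extract(word) for tag in STATUS_WORD_TAGS}

# 0x0035 = 0011 0101: pump on, low-flow alarm on, operating mode 3
print(derive(0x0035))
```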

Practical Binary Encoding Example

Let's walk through encoding a real-world data frame. Suppose an HE Central Chiller (device type 1018) reports three values at timestamp 1609459200:

Tag 1: Tank Temperature = 42 (int16)
Tag 2: CQT 1 Approach Temp = 18 (int16)
Tag 40: Pump Status changed from 0→1 (bool, event-driven)

The batch frame:

F7                          Command: tag values
00 00 00 01                 1 group

5F EE 66 00                 Timestamp: 1609459200
03 FA                       Device type: 1018 (0x03FA)
00                          Production year
00                          Production month
30 39                       Production number: 12345
00 00 00 02                 2 values (pump status sent separately)

00 01                       Tag ID: 1 (Tank Temperature)
00                          Status: OK
01                          Array size: 1
02                          Element size: 2 (int16)
00 2A                       Value: 42

00 02                       Tag ID: 2 (CQT 1 Approach Temp)
00                          Status: OK
01                          Array size: 1
02                          Element size: 2 (int16)
00 12                       Value: 18

Total: 33 bytes (a 5-byte frame header, a 14-byte group header, and two 7-byte value records). The equivalent JSON would be ~250 bytes.

The pump status change (Tag 40) is sent immediately as its own micro-frame, bypassing the batch buffer entirely:

F7 00 00 00 01 5F EE 66 00 03 FA 00 00 30 39 00 00 00 01
00 28 00 01 01 01

25 bytes for an immediate pump status event. In JSON: ~120 bytes.
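Hand-written byte listings are easy to get wrong, so it can help to build the same frames programmatically from the values stated in the text (device type 1018, timestamp 1609459200, production number 12345). This sketch uses plain `struct.pack` calls and prints each frame as hex with its length so the listings above can be checked.

```python
import struct

# Batch frame: command byte + group count, then one group with two int16 values.
frame = struct.pack(">BI", 0xF7, 1)
frame += struct.pack(">IHBBHI", 1609459200, 1018, 0, 0, 12345, 2)
for tag_id, value in [(1, 42), (2, 18)]:
    # Tag ID, status OK, array size 1, element size 2, then the int16 value.
    frame += struct.pack(">HBBBh", tag_id, 0, 1, 2, value)

# Event micro-frame: same headers, one 1-byte bool value for tag 40.
event = struct.pack(">BI", 0xF7, 1)
event += struct.pack(">IHBBHI", 1609459200, 1018, 0, 0, 12345, 1)
event += struct.pack(">HBBB?", 40, 0, 1, 1, True)

for name, data in [("batch", frame), ("event", event)]:
    print(name, len(data), "bytes:", data.hex(" ").upper())
```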

Serial Number Encoding: Packing Identity Into 4 Bytes

Device identity in IIoT systems needs to be compact but unique. A common scheme packs the manufacturing identity into exactly 4 bytes:

Byte 0:    Year  (0x28 = 2010, 0x29 = 2011, 0x2A = 2012, ...)
Byte 1:    Month (0x00 = Jan, 0x01 = Feb, ..., 0x0B = Dec)
Bytes 2-3: Unit  (sequential manufacturing number, uint16)

Example: 0x28000050 = January 2010, Unit #80

This encoding supports 65,536 units per month — more than enough for any manufacturing line — while fitting in a single 32-bit field. No GUIDs, no strings, no parsing.
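Packing and unpacking this identity is a single `struct` call each way. The `YEAR_BASE` of 1970 is inferred from the mapping 0x28 = 2010 and is an assumption rather than a documented constant; the function names are likewise illustrative.

```python
import struct

YEAR_BASE = 1970  # inferred from 0x28 -> 2010; an assumption, not a documented constant

def pack_serial(year, month, unit):
    """Pack year, month (1-12), and unit number into 4 big-endian bytes."""
    return struct.pack(">BBH", year - YEAR_BASE, month - 1, unit)

def unpack_serial(raw):
    """Inverse of pack_serial: returns (year, month, unit)."""
    year_byte, month_byte, unit = struct.unpack(">BBH", raw)
    return YEAR_BASE + year_byte, month_byte + 1, unit

serial = pack_serial(2010, 1, 80)
print(serial.hex().upper(), unpack_serial(serial))
```

With a single year byte and this base, the scheme covers 256 years, so range checks on `year` and `month` would be worth adding in production code.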

When to Use Binary vs. JSON

Binary encoding isn't always the right choice. Here's the decision matrix:

| Factor | Use Binary | Use JSON |
|---|---|---|
| Network | Cellular, LoRa, satellite | Ethernet, WiFi |
| Gateway RAM | < 64 MB | > 256 MB |
| Tag count per device | > 20 | < 10 |
| Polling interval | ≤ 60 seconds | > 5 minutes |
| Cloud ingestion | Custom parser available | Generic REST API |
| Protocol | MQTT (binary payload) | HTTP REST |

For most production IIoT deployments — especially those using cellular gateways on the factory floor — binary encoding is the clear winner.

How machineCDN Handles Binary Telemetry

machineCDN's edge architecture uses binary encoding by default for all PLC telemetry. Tag values from Modbus RTU, Modbus TCP, and EtherNet/IP sources are packed into compact binary frames with temporal grouping, change-based filtering, and event-driven bypass for critical tags.

The platform handles type-aware encoding (bool through float32), automatic batch finalization (size + time triggers), and calculated tag extraction — all running on resource-constrained edge gateways with as little as 32 MB of RAM.

For teams building their own edge telemetry pipeline, the principles in this guide apply regardless of platform. But if you want this solved out of the box, machineCDN's edge agent handles the encoding, buffering, and delivery so your team can focus on what the data means rather than how it moves.

Key Takeaways

  1. JSON overhead is 10–13x for typical industrial telemetry — switch to binary encoding for cellular/constrained networks
  2. Temporal grouping amortizes device context across multiple tag values in the same read cycle
  3. Change-based filtering eliminates 75–85% of redundant transmissions for stable process values
  4. Event-driven tags bypass batching for immediate delivery of critical status changes
  5. Calculated tags extract multiple booleans from status words at the edge, reducing PLC bus traffic
  6. Standardize on big-endian wire format to avoid byte-ordering nightmares across protocols
  7. Dual finalization triggers (size + time) balance latency against bandwidth efficiency

The factory floor generates data at the edge. How you encode and move that data determines whether your IIoT deployment scales to 10 machines or 10,000.