Binary Telemetry Encoding for IIoT: Why JSON Is Killing Your Bandwidth [2026]

March 1, 2026 · 11 min read

If you're sending PLC tag values as JSON from edge gateways to the cloud, you're wasting 80–90% of your bandwidth. On a cellular-connected factory floor with dozens of machines, that's the difference between a $50/month data plan and a $500/month one — and the difference between sub-second telemetry and multi-second lag.

This guide breaks down binary telemetry encoding: how to pack industrial data efficiently at the edge, preserve type fidelity across the wire, and design batch grouping strategies that survive unreliable networks.

The JSON Problem in Industrial Telemetry

Consider a simple temperature reading from a Modbus-connected chiller:

{
  "timestamp": 1709251200,
  "device_type": 1018,
  "serial": "0x00300039",
  "tags": [
    {"id": 1, "name": "Tank Temperature", "value": 42.5, "status": "ok"},
    {"id": 2, "name": "CQT 1 Approach Temp", "value": 18.3, "status": "ok"},
    {"id": 3, "name": "CQT 1 Chill In Temp", "value": 22.1, "status": "ok"}
  ]
}

That's ~320 bytes for three sensor readings. The actual data is 6 bytes: three 16-bit integers, once each reading is scaled (for example, 42.5 stored as 425 in tenths of a degree). You're paying for 314 bytes of structural overhead on every single transmission.

On a typical plastics manufacturing floor running 12 machines with 30–40 tags each at 60-second intervals, that's roughly 14 MB/day in pure JSON overhead. Over cellular, that adds up fast.

The Real Cost Breakdown

Component	JSON Size	Binary Size	Overhead
Field names ("timestamp", "device_type", etc.)	~120 bytes	0 bytes	∞
Brackets, quotes, colons	~80 bytes	0 bytes	∞
Numeric values as ASCII strings	~40 bytes	14 bytes	2.8x
Status strings	~30 bytes	3 bytes	10x
Total for 3 tags	~320 bytes	~25 bytes	12.8x

Binary encoding eliminates all structural overhead. Field names, quotes, brackets, and colons vanish. What remains is pure data with minimal framing.

Designing a Binary Wire Format

A good binary telemetry format for IIoT needs four properties:

Self-describing enough to decode without external schema
Compact enough to fit cellular/LoRa constraints
Groupable so multiple readings share context (timestamps, device IDs)
Error-tolerant so partial corruption doesn't destroy entire payloads

The Anatomy of a Binary Telemetry Frame

Here's a proven structure used in production edge systems:

┌─────────────────────────────────────────────────┐
│ Command Byte          (1 byte)                  │
│ Number of Groups      (4 bytes, uint32)         │
├─────────────────────────────────────────────────┤
│ ┌─── Group 1 ───────────────────────────────┐   │
│ │ Timestamp           (4 bytes, uint32)     │   │
│ │ Device Type         (2 bytes, uint16)     │   │
│ │ Production Year     (1 byte)              │   │
│ │ Production Month    (1 byte)              │   │
│ │ Production Number   (2 bytes, uint16)     │   │
│ │ Number of Values    (4 bytes, uint32)     │   │
│ │ ┌─── Value 1 ────────────────────────┐    │   │
│ │ │ Tag ID            (2 bytes, uint16)│    │   │
│ │ │ Status            (1 byte)         │    │   │
│ │ │ Array Size (S)    (1 byte)         │    │   │
│ │ │ Element Size (N)  (1 byte)         │    │   │
│ │ │ Data              (S×N bytes)      │    │   │
│ │ └────────────────────────────────────┘    │   │
│ │ ┌─── Value 2 ────────────────────────┐    │   │
│ │ │ ...                                │    │   │
│ │ └────────────────────────────────────┘    │   │
│ └───────────────────────────────────────────┘   │
│ ┌─── Group 2 ───────────────────────────────┐   │
│ │ ...                                       │   │
│ └───────────────────────────────────────────┘   │
└─────────────────────────────────────────────────┘
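The layout above can be sketched with Python's `struct` module. This is a minimal illustration, not a reference implementation: the names `Value`, `Group`, `encode_value`, `encode_group`, and `encode_frame` are hypothetical, the command byte `0xF7` is borrowed from the worked example later in this guide, and all multi-byte fields are assumed to be big-endian.

```python
import struct
from dataclasses import dataclass, field

CMD_TAG_VALUES = 0xF7  # command byte taken from the article's worked example


@dataclass
class Value:
    tag_id: int
    status: int = 0x00      # 0x00 = OK, anything else = error code
    element_size: int = 0   # N: bytes per element
    data: bytes = b""       # S x N bytes, already packed big-endian


@dataclass
class Group:
    timestamp: int
    device_type: int
    prod_year: int
    prod_month: int
    prod_number: int
    values: list = field(default_factory=list)


def encode_value(v: Value) -> bytes:
    if v.status != 0x00:
        # Failed read: only tag ID + status go on the wire (3 bytes).
        return struct.pack(">HB", v.tag_id, v.status)
    if v.element_size <= 0 or len(v.data) % v.element_size:
        raise ValueError("data length must be a multiple of element_size")
    array_size = len(v.data) // v.element_size
    header = struct.pack(">HBBB", v.tag_id, v.status, array_size, v.element_size)
    return header + v.data


def encode_group(g: Group) -> bytes:
    # timestamp(4) device_type(2) year(1) month(1) number(2) value_count(4)
    header = struct.pack(">IHBBHI", g.timestamp, g.device_type, g.prod_year,
                         g.prod_month, g.prod_number, len(g.values))
    return header + b"".join(encode_value(v) for v in g.values)


def encode_frame(groups: list) -> bytes:
    # command(1) group_count(4), followed by the encoded groups
    header = struct.pack(">BI", CMD_TAG_VALUES, len(groups))
    return header + b"".join(encode_group(g) for g in groups)
```

With this layout a frame costs 5 bytes of frame header, 14 bytes per group header, and 5 + S×N bytes per successfully read value.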

Why Groups Matter

The key insight is temporal grouping. Multiple tag values read at the same moment share a single timestamp and device identifier. Instead of repeating context with every value, you amortize it across an entire group.

In a typical chiller with 65+ temperature and pressure tags all polled in the same cycle, this saves ~10 bytes per tag — over 650 bytes per polling cycle.

Status Bytes Save Bandwidth Too

Notice the status byte (0x00 = OK, anything else = error code). When a tag read fails, you transmit just the 3-byte header (ID + status) instead of the full value. Failed reads are common on noisy factory floors — serial bus contention, PLC CPU overload, intermittent wiring. A binary format handles this gracefully without the awkward "value": null, "error": "timeout" constructs of JSON.

Data Type Packing: Getting the Bytes Right

Industrial data types map directly to fixed-width binary representations. Here's the canonical mapping used in most edge telemetry systems:

Type Encoding Table

PLC Type	Wire Size	Byte Order	Example Value	Packed Bytes
BOOL	1 byte	N/A	true	`01`
INT8	1 byte	N/A	55	`37`
UINT8	1 byte	N/A	157	`9D`
INT16	2 bytes	Big-endian	-55	`FF C9`
UINT16	2 bytes	Big-endian	32768	`80 00`
INT32	4 bytes	Big-endian	-55	`FF FF FF C9`
UINT32	4 bytes	Big-endian	4294967281	`FF FF FF F1`
FLOAT32	4 bytes	IEEE 754 BE	1.55	`3F C6 66 66`
FLOAT32	4 bytes	IEEE 754 BE	-1.55	`BF C6 66 66`
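The packed bytes in the table can be reproduced with Python's standard `struct` module. The sketch below is only a quick check of the table; the helper name `pack_hex` and the `EXAMPLES` list are illustrative, and `>` selects big-endian packing with no padding.

```python
import struct

# (PLC type, struct format, example value) copied from the table above
EXAMPLES = [
    ("BOOL", ">?", True),
    ("INT8", ">b", 55),
    ("UINT8", ">B", 157),
    ("INT16", ">h", -55),
    ("UINT16", ">H", 32768),
    ("INT32", ">i", -55),
    ("UINT32", ">I", 4294967281),
    ("FLOAT32", ">f", 1.55),
    ("FLOAT32", ">f", -1.55),
]


def pack_hex(fmt: str, value) -> str:
    """Pack one value and render it as space-separated uppercase hex."""
    return struct.pack(fmt, value).hex(" ").upper()


for name, fmt, value in EXAMPLES:
    print(f"{name:8} {value!r:>12} -> {pack_hex(fmt, value)}")
```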

Byte Ordering: The Silent Data Corruptor

Byte ordering mismatches are the #1 cause of garbage data in IIoT deployments. Here's why:

Modbus registers are big-endian (MSB first) by specification
EtherNet/IP (CIP) uses little-endian for REAL/DINT values
Most ARM-based edge gateways are little-endian natively
x86 cloud servers are little-endian

When you read a float from a Modbus holding register, the bytes arrive big-endian. If your edge gateway is little-endian and you memcpy those bytes into a float without swapping, you get nonsense: 42.5°C (bytes 42 2A 00 00) is read back as roughly 1.5e-41, a denormal that is indistinguishable from zero.

Best practice: Standardize on big-endian (network byte order) for the wire format, regardless of the source protocol. Convert at the edge before packing.
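A small sketch of the mismatch and of the edge-side conversion, again using `struct`. The function names are hypothetical, and `modbus_registers_to_float` assumes the device sends the high word first; word order varies between devices, so treat that as an assumption to verify against the device manual.

```python
import struct


def modbus_registers_to_float(hi_word: int, lo_word: int) -> float:
    """Combine two 16-bit registers (assumed high word first) into a float32."""
    raw = struct.pack(">HH", hi_word, lo_word)
    return struct.unpack(">f", raw)[0]


def cip_real_to_wire(le_bytes: bytes) -> bytes:
    """Re-pack a little-endian CIP REAL as big-endian for the wire format."""
    (value,) = struct.unpack("<f", le_bytes)
    return struct.pack(">f", value)


raw = struct.pack(">f", 42.5)         # bytes 42 2A 00 00, as Modbus delivers them
wrong = struct.unpack("<f", raw)[0]   # misread as little-endian: a tiny denormal
right = struct.unpack(">f", raw)[0]   # decoded with the correct byte order

print(raw.hex(" "), wrong, right)
```

Decoding with an explicit byte order (`>` or `<`) rather than relying on the host's native order keeps the same code correct on both ARM gateways and x86 servers.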

Array Values: When One Tag Returns Many

Some PLC tags return arrays — a recipe with 8 ingredient percentages, or a batch of 4 temperature zones. The binary format handles this with the array_size and element_size fields:

Tag ID:       0x0023 (Recipe Values)
Status:       0x00 (OK)
Array Size:   8
Element Size: 4 (float32)
Data:         32 bytes (8 × 4-byte floats)

The decoder knows to read exactly S × N bytes of payload. No delimiters, no parsing, no ambiguity.
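A decoder for the value records might look like the sketch below. It is an illustration under assumptions: `decode_values` is a hypothetical name, fields are big-endian, and a non-zero status is taken to mean that only the 3-byte ID + status header was transmitted, as described in the status byte section.

```python
import struct


def decode_values(buf: bytes, count: int, offset: int = 0):
    """Decode `count` value records starting at `offset`.

    Returns (records, next_offset). Each record is a tuple
    (tag_id, status, element_size, [raw element bytes, ...]).
    """
    records = []
    for _ in range(count):
        tag_id, status = struct.unpack_from(">HB", buf, offset)
        offset += 3
        if status != 0x00:
            # Failed read: no array size, element size, or data follows.
            records.append((tag_id, status, 0, []))
            continue
        array_size, element_size = struct.unpack_from(">BB", buf, offset)
        offset += 2
        if element_size == 0:
            raise ValueError("element size must be non-zero for an OK value")
        end = offset + array_size * element_size   # exactly S x N bytes
        if end > len(buf):
            raise ValueError("truncated value record")
        data = buf[offset:end]
        elements = [data[i:i + element_size]
                    for i in range(0, len(data), element_size)]
        records.append((tag_id, status, element_size, elements))
        offset = end
    return records, offset
```

The elements are returned as raw bytes because the record carries only a size, not a type; the consumer still needs the tag configuration to know whether 4 bytes mean float32 or int32.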

Batch Collection Strategies

Raw tag values don't ship individually — they accumulate in batches. Two finalization triggers work together:

Size-Based Finalization

Set a maximum batch buffer size (e.g., 4,000 bytes). When accumulated data approaches this limit, seal the batch and queue it for transmission. This prevents any single batch from exceeding the MQTT payload limit or the available RAM on constrained devices.

Time-Based Finalization

Set a maximum collection window (e.g., 60 seconds). Even if the buffer isn't full, finalize and send. This guarantees a maximum data latency — critical for alarm values where you can't wait minutes for the buffer to fill.

The Hybrid Approach

In practice, both triggers run simultaneously:

while collecting:
    read next tag value

    if value timestamp != current group timestamp:
        close current group
        start new group with the new timestamp

    add value to current group

    if batch_size >= MAX_SIZE or elapsed_time >= MAX_TIMEOUT:
        finalize batch and queue it for transmission
        start new batch

A well-tuned system uses a 4 KB buffer with a 60-second timeout. During normal operation, the timeout triggers first (30–40 tags generate ~800 bytes per cycle). During high-frequency events (1-second alarm monitoring), the size limit triggers first, preventing memory exhaustion.
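The two triggers can be sketched as a small accumulator. This is one possible shape, not the implementation of any particular gateway: the class name `Batcher`, its method names, and the injectable `clock` parameter (useful for testing) are assumptions, while the 4,000-byte and 60-second defaults come from the text above.

```python
import time


class Batcher:
    """Collects encoded groups and seals a batch on size or age, whichever comes first."""

    def __init__(self, max_size=4000, max_age_s=60.0, clock=time.monotonic):
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock
        self._groups = []
        self._size = 0
        self._started = None

    def add(self, encoded_group: bytes) -> list:
        """Add one encoded group; return any batches sealed as a result."""
        sealed = []
        # Size trigger: seal first if this group would exceed the limit.
        if self._groups and self._size + len(encoded_group) > self.max_size:
            sealed.append(self._seal())
        if self._started is None:
            self._started = self.clock()
        self._groups.append(encoded_group)
        self._size += len(encoded_group)
        sealed.extend(self.poll())
        return sealed

    def poll(self) -> list:
        """Time trigger: call periodically, even when no new data arrives."""
        if self._groups and self.clock() - self._started >= self.max_age_s:
            return [self._seal()]
        return []

    def _seal(self) -> list:
        batch = self._groups
        self._groups, self._size, self._started = [], 0, None
        return batch
```

Each sealed batch is a list of encoded groups; the caller would prepend the frame header (command byte and group count) before publishing.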

Change-Based Filtering: Don't Send What Hasn't Changed

Here's where binary encoding gets truly efficient. Many industrial values are stable for long periods — a setpoint stays at 450°F for an entire 8-hour shift, a pump status remains "running" for days.

Compare-and-Send Logic

For each tag, maintain a last_values cache. On each read cycle:

Read the current value from the PLC
Compare with the cached value
Only add to the batch if the value changed (or if this is the first read)

Tags that change frequently (temperatures, pressures) still transmit every cycle. Tags that rarely change (setpoints, status flags) might transmit once per hour. The bandwidth savings compound dramatically:

Scenario	Without Filtering	With Filtering	Savings
40 tags, 60s interval, 8hr shift	19,200 transmissions	~4,800 transmissions	75%
65 tags, 60s interval, 24hr shift	93,600 transmissions	~15,000 transmissions	84%
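The compare-and-send steps reduce to a per-tag cache lookup. A minimal sketch, assuming a hypothetical `ChangeFilter` class and exact equality as the change test (a real deployment might prefer a deadband for noisy analog values):

```python
class ChangeFilter:
    """Remembers the last value per tag and reports whether a new read should be sent."""

    def __init__(self):
        self._last_values = {}

    def should_send(self, tag_id: int, value) -> bool:
        # The first read of a tag always transmits; later reads only on change.
        if tag_id in self._last_values and self._last_values[tag_id] == value:
            return False
        self._last_values[tag_id] = value
        return True


# A setpoint that stays at 450 for a whole shift is sent once, not 480 times.
f = ChangeFilter()
sent = sum(f.should_send(7, 450) for _ in range(480))
print(sent)
```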

Critical Tags: Skip the Compare

Some tags must transmit every cycle regardless — delivery temperatures during active process control, flow values during filling operations. Mark these as compare: false in your tag configuration. They always transmit, ensuring the cloud has a continuous time series for trending and anomaly detection.

Event-Driven Tags: Bypass Batching Entirely

Pump status changes, heater activation, vent opening — these are operational events that demand immediate delivery. Configure these tags with two flags:

compare: true (only transmit on change)
do_not_batch: true (send immediately, bypass the batch buffer)

The result: a pump going from OFF to ON at 14:32:07.342 is published immediately, so it reaches the cloud after network latency alone rather than up to 60 seconds later in the next batch. For alarm correlation, this timing precision is essential.
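How the `compare` and `do_not_batch` flags interact can be sketched as a single routing function. The `TagConfig` dataclass, the `route` function, and the `send_now` callback are hypothetical names used for illustration; only the two flag names come from the text.

```python
from dataclasses import dataclass

_MISSING = object()  # sentinel so a first read never equals the cached value


@dataclass(frozen=True)
class TagConfig:
    tag_id: int
    compare: bool = True         # only transmit on change
    do_not_batch: bool = False   # send immediately, bypass the batch buffer


def route(cfg: TagConfig, value, last_values: dict, batch: list, send_now) -> str:
    """Decide what happens to one freshly read value."""
    if cfg.compare and last_values.get(cfg.tag_id, _MISSING) == value:
        return "skipped"
    last_values[cfg.tag_id] = value
    if cfg.do_not_batch:
        send_now(cfg.tag_id, value)   # e.g. publish a one-value micro-frame
        return "immediate"
    batch.append((cfg.tag_id, value))
    return "batched"
```

For example, a pump status tag configured as `TagConfig(40, compare=True, do_not_batch=True)` is published only when it flips, while a critical temperature configured as `TagConfig(1, compare=False)` lands in every batch.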

Calculated Tags: Edge-Side Bit Extraction

Many PLCs pack multiple boolean flags into a single 16-bit or 32-bit register. A status word might encode:

Bit 0: Pump running
Bit 1: Heater active
Bit 2: Low flow alarm
Bit 3: High temp alarm
Bits 4–7: Operating mode (0–15)

Rather than transmitting the raw 16-bit value and parsing it in the cloud, efficient edge systems extract individual bits at the edge and transmit them as separate boolean tags.

How It Works

Define a "calculated tag" that references a source tag with a bit mask and shift count:

Source tag: Status Word (uint16, ID: 100)
Calculated tag: Pump Running (bool, ID: 200)
  → shift: 0, mask: 0x01
Calculated tag: Heater Active (bool, ID: 201)
  → shift: 1, mask: 0x01
Calculated tag: Low Flow Alarm (bool, ID: 202)
  → shift: 2, mask: 0x01

The edge device reads the status word once, then derives multiple boolean tags without additional PLC reads. Each calculated tag inherits the source tag's polling interval and compare behavior — but the comparison happens on the derived boolean value, not the raw word.

This is especially valuable for Modbus devices where each register read consumes bus time. Reading one register and extracting 8 booleans takes a single bus transaction, versus eight when each coil is read individually.
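The shift-and-mask derivation is a one-liner per tag. In this sketch the `CalculatedTag` dataclass and `derive_all` helper are hypothetical, tag IDs 200 to 202 follow the example above, and ID 203 for the operating mode is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalculatedTag:
    tag_id: int
    source_id: int
    shift: int
    mask: int

    def derive(self, source_value: int) -> int:
        # Shift the field down to bit 0, then mask off everything above it.
        return (source_value >> self.shift) & self.mask


CALCULATED = [
    CalculatedTag(200, 100, shift=0, mask=0x01),  # Pump Running
    CalculatedTag(201, 100, shift=1, mask=0x01),  # Heater Active
    CalculatedTag(202, 100, shift=2, mask=0x01),  # Low Flow Alarm
    CalculatedTag(203, 100, shift=4, mask=0x0F),  # Operating mode (hypothetical ID)
]


def derive_all(source_id: int, source_value: int) -> dict:
    """Derive every calculated tag that references `source_id` from one PLC read."""
    return {c.tag_id: c.derive(source_value)
            for c in CALCULATED if c.source_id == source_id}


# 0x55 = 0101 0101: pump on, heater off, low-flow alarm on, mode 5
print(derive_all(100, 0x55))
```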

Practical Binary Encoding Example

Let's walk through encoding a real-world data frame. Suppose an HE Central Chiller (device type 1018) reports three values at timestamp 1609459200:

Tag 1: Tank Temperature = 42 (int16)
Tag 2: CQT 1 Approach Temp = 18 (int16)
Tag 40: Pump Status changed from 0→1 (bool, event-driven)

The batch frame:

F7                          Command: tag values
00 00 00 01                 1 group

5F EE 66 00                 Timestamp: 1609459200
03 FA                       Device type: 1018 (0x03FA)
00                          Production year
00                          Production month
30 39                       Production number: 12345
00 00 00 02                 2 values (pump status sent separately)

00 01                       Tag ID: 1 (Tank Temperature)
00                          Status: OK
01                          Array size: 1
02                          Element size: 2 (int16)
00 2A                       Value: 42

00 02                       Tag ID: 2 (CQT 1 Approach Temp)
00                          Status: OK
01                          Array size: 1
02                          Element size: 2 (int16)
00 12                       Value: 18

Total: 33 bytes. The equivalent JSON would be ~250 bytes.

The pump status change (Tag 40) is sent immediately as its own micro-frame, bypassing the batch buffer entirely:

F7 00 00 00 01 5F EE 66 00 03 FA 00 00 30 39 00 00 00 01
00 28 00 01 01 01

25 bytes for an immediate pump status event. In JSON: ~120 bytes.

Serial Number Encoding: Packing Identity Into 4 Bytes

Device identity in IIoT systems needs to be compact but unique. A common scheme packs the manufacturing identity into exactly 4 bytes:

Byte 0: Year    (0x28 = 2010, 0x29 = 2011, 0x2A = 2012, ...)
Byte 1: Month   (0x00 = Jan, 0x01 = Feb, ..., 0x0B = Dec)
Bytes 2-3: Unit (sequential manufacturing number, uint16)

Example: 0x2A000050 = January 2012, Unit #80

This encoding supports 65,536 units per month — more than enough for any manufacturing line — while fitting in a single 32-bit field. No GUIDs, no strings, no parsing.
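Packing and unpacking this identity is a single `struct` call each way. A sketch under assumptions: the function names are hypothetical, and the year base of 1970 is inferred from the mapping above (0x28 is decimal 40, and 1970 + 40 = 2010).

```python
import struct

YEAR_BASE = 1970  # inferred: 0x28 (40) maps to 2010


def pack_serial(year: int, month: int, unit: int) -> bytes:
    """Pack (year, month 1-12, unit 0-65535) into the 4-byte serial."""
    if not YEAR_BASE <= year <= YEAR_BASE + 255:
        raise ValueError("year out of range for a single byte")
    if not 1 <= month <= 12:
        raise ValueError("month must be 1-12")
    if not 0 <= unit <= 0xFFFF:
        raise ValueError("unit must fit in uint16")
    # Byte 0: year offset, byte 1: zero-based month, bytes 2-3: unit (big-endian)
    return struct.pack(">BBH", year - YEAR_BASE, month - 1, unit)


def unpack_serial(raw: bytes) -> tuple:
    """Inverse of pack_serial: returns (year, month 1-12, unit)."""
    year_off, month0, unit = struct.unpack(">BBH", raw)
    return YEAR_BASE + year_off, month0 + 1, unit


print(pack_serial(2012, 1, 80).hex(" "))
```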

When to Use Binary vs. JSON

Binary encoding isn't always the right choice. Here's the decision matrix:

Factor	Use Binary	Use JSON
Network	Cellular, LoRa, satellite	Ethernet, WiFi
Gateway RAM	< 64 MB	> 256 MB
Tag count per device	> 20	< 10
Polling interval	≤ 60 seconds	> 5 minutes
Cloud ingestion	Custom parser available	Generic REST API
Protocol	MQTT (binary payload)	HTTP REST

For most production IIoT deployments — especially those using cellular gateways on the factory floor — binary encoding is the clear winner.

How machineCDN Handles Binary Telemetry

machineCDN's edge architecture uses binary encoding by default for all PLC telemetry. Tag values from Modbus RTU, Modbus TCP, and EtherNet/IP sources are packed into compact binary frames with temporal grouping, change-based filtering, and event-driven bypass for critical tags.

The platform handles type-aware encoding (bool through float32), automatic batch finalization (size + time triggers), and calculated tag extraction — all running on resource-constrained edge gateways with as little as 32 MB of RAM.

For teams building their own edge telemetry pipeline, the principles in this guide apply regardless of platform. But if you want this solved out of the box, machineCDN's edge agent handles the encoding, buffering, and delivery so your team can focus on what the data means rather than how it moves.

Key Takeaways

JSON overhead is 10–13x for typical industrial telemetry — switch to binary encoding for cellular/constrained networks
Temporal grouping amortizes device context across multiple tag values in the same read cycle
Change-based filtering eliminates 75–85% of redundant transmissions for stable process values
Event-driven tags bypass batching for immediate delivery of critical status changes
Calculated tags extract multiple booleans from status words at the edge, reducing PLC bus traffic
Standardize on big-endian wire format to avoid byte-ordering nightmares across protocols
Dual finalization triggers (size + time) balance latency against bandwidth efficiency

The factory floor generates data at the edge. How you encode and move that data determines whether your IIoT deployment scales to 10 machines or 10,000.
