Static vs Dynamic Languages, JSON vs Protobuf, Schema, and Type Erasure

1. Overview
2. Static vs Dynamic Languages (Deep Runtime Model)
3. Dynamic Typing and Type Erasure
- 3.1 What is Type Erasure?
- 3.2 Dynamic Languages Internally Use Type Erasure
4. Reflection
5. Schema-less vs Schema-driven Data
- 5.1 Schema-less (JSON)
- 5.2 Schema-driven (Protobuf)
6. JSON vs Protobuf Through Runtime Models
- 6.1 JSON Parsing
- 6.2 Protobuf Parsing
7. Why Python Still Generates Protobuf Code
8. Relationship Between Concepts
9. Unified Mental Model
10. Real-World Engineering Implications
11. Key Insight (Final Takeaway)

1. Overview

Modern software discussions often mix several related but distinct concepts:

Static vs dynamic languages — where type information lives (compile time vs runtime)
Static vs dynamic typing — when type checks happen
Schema-driven vs schema-less data — whether structure is predefined
Reflection and runtime type information — inspecting structure at runtime
Type erasure — hiding multiple types behind one runtime representation
JSON vs Protobuf — serialization formats with different design philosophies

Dynamic typing vs dynamic language: the two terms describe different aspects of the same languages.

Dynamic typing is a type-system property: type checks (or type resolution) happen at runtime. Variables can refer to values of different types over time, and type errors appear only when the code runs.

Dynamic language is an implementation/runtime property: types exist as runtime data. Values carry type information in memory, and the runtime uses it for dispatch and interpretation.

In practice, “dynamically typed language” and “dynamic language” refer to the same set of languages (e.g. Python, JavaScript, Ruby). The two properties usually go together, so it is reasonable to call a dynamically typed language a dynamic language; the phrases simply emphasize the type system and the runtime model respectively.

2. Static vs Dynamic Languages (Deep Runtime Model)

2.1 Static Language (e.g., C++)

Key idea: Types primarily exist at compile time.

Example:

int x = 10;

Compile-time:

Compiler knows type size and layout.
Generates machine code specialized for int.

Runtime:

Raw memory bytes only. The CPU does not know this is an int.
Type meaning exists in the compiled instructions, not in runtime objects.

Properties:

Memory layout fixed
Operations specialized during compilation
Minimal runtime metadata
High performance
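
The point that runtime memory holds only raw bytes can be illustrated from Python with the standard struct module. This is a minimal sketch, assuming a 32-bit little-endian integer layout; the format strings stand in for the compiled instructions that decide how the bytes are interpreted.

```python
import struct

# The four bytes that a 32-bit little-endian int with value 10 occupies.
raw = struct.pack("<i", 10)

# The bytes themselves carry no type. The format string plays the role of
# the compiled instructions: it alone decides how the bytes are read.
as_int = struct.unpack("<i", raw)[0]    # read the bytes as an int
as_float = struct.unpack("<f", raw)[0]  # read the same bytes as a float

print(raw.hex())  # 0a000000
print(as_int)     # 10
print(as_float)   # a tiny non-zero float, not 10.0
```

The same four bytes yield different values depending only on the code that reads them, which is the situation a compiled C++ program is in at runtime.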

2.2 Dynamic Language (e.g., Python)

Key idea: Types exist as runtime objects.

Example:

x = 10

Runtime object (conceptually):

PyObject {
    type_pointer -> PyLong_Type
    value = 10
}

The runtime stores:

The value
Type information
Behavior metadata

Operations are resolved dynamically: inspect type → dispatch correct operation.
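
The "inspect type, then dispatch" step can be made visible from Python itself. The sketch below is a simplification: the helper name add is hypothetical, and the real interpreter protocol also tries reflected methods such as __radd__.

```python
x = 10
first_type = type(x).__name__   # "int"

x = "ten"                       # the same name can later refer to a str
second_type = type(x).__name__  # "str"


def add(a, b):
    # Roughly what happens for `a + b`: look up the operation on the
    # runtime type of the left operand, then call it.
    return type(a).__add__(a, b)


result_int = add(2, 3)          # dispatches to int.__add__
result_str = add("ab", "cd")    # dispatches to str.__add__
result_list = add([1], [2])     # dispatches to list.__add__
```

One function body serves three unrelated types because the type pointer stored with each value selects the operation at call time.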

2.3 Core Difference

Static language	Dynamic language
Types guide code generation	Types are runtime data

3. Dynamic Typing and Type Erasure

3.1 What is Type Erasure?

Type erasure means: multiple types are hidden behind a single runtime representation.

Examples:

Python objects
std::any, std::variant (C++)
nlohmann::json

Example in C++:

nlohmann::json value;

Internally: an enum type tag plus a union for storage. This simulates dynamic typing inside static C++.

3.2 Dynamic Languages Internally Use Type Erasure

Even Python:

Stores objects behind generic pointers
Uses runtime type tags
Performs dynamic dispatch

So: dynamic typing = type erasure + runtime metadata.
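
The "type tag plus storage" idea can be sketched in a few lines. The names Tag, Value, make_value, and describe below are hypothetical and only model the technique; they are not the layout used by nlohmann::json or CPython.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class Tag(Enum):
    NULL = auto()
    BOOL = auto()
    NUMBER = auto()
    STRING = auto()


@dataclass
class Value:
    tag: Tag      # runtime metadata: which type is stored
    payload: Any  # stands in for the union that holds the actual data


def make_value(obj) -> Value:
    # bool is checked before numbers because bool is a subclass of int.
    if obj is None:
        return Value(Tag.NULL, None)
    if isinstance(obj, bool):
        return Value(Tag.BOOL, obj)
    if isinstance(obj, (int, float)):
        return Value(Tag.NUMBER, obj)
    if isinstance(obj, str):
        return Value(Tag.STRING, obj)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def describe(value: Value) -> str:
    # Dispatch on the tag, not on a statically known type.
    if value.tag is Tag.NULL:
        return "null"
    if value.tag is Tag.BOOL:
        return "true" if value.payload else "false"
    if value.tag is Tag.NUMBER:
        return str(value.payload)
    return '"' + value.payload + '"'
```

Every caller sees a single type, Value, while the tag records which concrete type is inside; that combination is the erased representation plus runtime metadata described above.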

4. Reflection

Reflection = ability to inspect structure at runtime.

Dynamic languages: native reflection; types already exist as runtime objects.

type(x)
dir(obj)

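Static languages such as C++: limited built-in reflection. RTTI exposes little more than type identity (typeid, dynamic_cast), so richer structural metadata has to be supplied by libraries or generated code (e.g. Protobuf descriptors, serialization libraries).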
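
A short sketch of native reflection in Python follows. The Person class here is a hypothetical example, not something defined earlier in this document.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


p = Person("Alice", 30)

type_name = type(p).__name__   # the type is itself a runtime object
fields = dict(vars(p))         # snapshot of the instance attributes
has_email = hasattr(p, "email")

# Fields can be read and written by name, with the name chosen at runtime.
field_to_update = "age"
setattr(p, field_to_update, getattr(p, field_to_update) + 1)
```

No metadata had to be declared separately: the attribute names and the type are already present as runtime data.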

5. Schema-less vs Schema-driven Data

5.1 Schema-less (JSON)

JSON does not require a predefined structure.

{"name": "Alice", "age": 30}

The parser builds generic structures (dict, list, numbers). Structure is determined at runtime.
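
As a minimal illustration with the standard json module, the parser needs no knowledge of the document's shape; the input string below is an assumed example that extends the one above.

```python
import json

text = '{"name": "Alice", "age": 30, "tags": ["a", "b"], "extra": null}'
doc = json.loads(text)

# The parser chose generic container and scalar types while reading the input.
kinds = {key: type(value).__name__ for key, value in doc.items()}
```

Here kinds maps each key to the Python type the parser picked for its value (str, int, list, NoneType), none of which was declared in advance.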

5.2 Schema-driven (Protobuf)

Protobuf requires a .proto schema.

message Person {
  string name = 1;
}

Benefits: fixed field IDs, known types, efficient binary encoding.
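
To make "fixed field IDs" and "efficient binary encoding" concrete, the sketch below hand-encodes the name field of Person following the Protobuf wire format (a key built from the field number and wire type, then a length, then the bytes). The helper names are hypothetical, and real code would use the Protobuf library rather than this.

```python
import json


def encode_varint(n: int) -> bytes:
    # Base-128 varint: 7 payload bits per byte, high bit set on all but the last.
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    key = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(key) + encode_varint(len(data)) + data


person_bytes = encode_string_field(1, "Alice")              # b"\x0a\x05Alice"
json_bytes = json.dumps({"name": "Alice"}).encode("utf-8")  # b'{"name": "Alice"}'
```

The field name never appears on the wire; only the field number from the schema does, which is why both sides must agree on the .proto file.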

6. JSON vs Protobuf Through Runtime Models

6.1 JSON Parsing

The advantage of a dynamically typed language is clearest here. To handle JSON in C++ you either (1) write static code for a specific JSON shape (fixed schema, no flexibility), or (2) use a library that implements a minimal dynamic type system, e.g. nlohmann::json, so that one type can represent any JSON structure. Only (2) is a dynamic-style abstraction on top of static C++.

In Python, the runtime already has dynamic types and dict/list literals, so parsing and operating on arbitrary JSON is native. One parser and one code path handle all possible JSON without an extra “variant container” or codegen: the language itself is the dynamic system.

Under the hood, both rely on type erasure. The Python interpreter is itself implemented in a static language (e.g. C for CPython), and it uses the same idea as libraries like nlohmann::json do in C++: a generic object representation with a type tag. So at the implementation level, both are type-erased value types over a static foundation; the difference is that in Python that machinery is built into the language and runtime, whereas in C++ you opt in via a library.
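
The "one code path for any JSON" claim can be sketched as a single recursive function. The name leaf_paths and both sample documents are hypothetical.

```python
import json


def leaf_paths(node, prefix=""):
    # Walk any parsed JSON value and yield (path, leaf value) pairs.
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from leaf_paths(child, f"{prefix}[{index}]")
    else:
        yield prefix, node


doc_a = json.loads('{"name": "Alice", "age": 30}')
doc_b = json.loads('{"users": [{"name": "Bob"}], "ok": true}')

paths_a = list(leaf_paths(doc_a))
paths_b = list(leaf_paths(doc_b))
```

The two documents share no structure, yet the same function handles both without a per-shape type or any generated code.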

6.2 Protobuf Parsing

Protobuf assumes schema-first design. Two approaches:

Static (generated code): .proto → code generator → language-specific class with compiled-in metadata. Same idea in Python and C++. Benefits: faster access, type safety, IDE support.
Dynamic (reflection-based): Schema descriptors are loaded at runtime; messages are created and manipulated via reflection APIs. In Python this is straightforward because the runtime is already dynamic. In C++ it is possible via DescriptorPool and DynamicMessageFactory, but more involved.
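
The reflection-based approach can be sketched without the Protobuf library by letting a runtime table drive decoding. Everything here is an assumption for illustration: PERSON_DESCRIPTOR is a plain dict standing in for a real descriptor, the age field (number 2) is not in the schema shown earlier, and only two wire types are handled.

```python
def read_varint(buf: bytes, pos: int):
    # Decode a base-128 varint starting at pos; return (value, next position).
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Hypothetical runtime "descriptor": field number -> (field name, kind).
PERSON_DESCRIPTOR = {1: ("name", "string"), 2: ("age", "varint")}


def decode(buf: bytes, descriptor: dict) -> dict:
    message = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in descriptor:
            name, kind = descriptor[field_number]
            message[name] = value.decode("utf-8") if kind == "string" else value
        # Fields missing from the descriptor are parsed and then skipped.
    return message


wire = b"\x0a\x05Alice\x10\x1e"
person = decode(wire, PERSON_DESCRIPTOR)
```

The decoder contains no knowledge of Person; swapping in a different descriptor table makes the same function decode a different message type, which is the essence of the dynamic path.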

Memory representation. The language divide matters here. In Python, both paths produce the same kind of runtime object: message objects accessed through ordinary attribute lookup and driven by descriptors. From the program's point of view the representation is the language's dynamic object model either way (depending on the Protobuf backend, the field storage itself may live in native code). In C++, the two paths differ: static codegen yields a fixed, struct-like layout (known offsets, one type per message), while DynamicMessageFactory uses a separate, flexible representation (e.g. internal structures keyed by field) that can represent any message type. So in C++, static vs dynamic Protobuf implies genuinely different in-memory layouts; in Python it does not.

For schema-driven data, language matters little. To operate on Protobuf data (read/write fields, pass to functions), you either write code against a known structure — which means codegen in both Python and C++ — or you use reflection. Both languages support both options. So for Protobuf specifically, there is no fundamental advantage to using Python over C++; the same choice (codegen vs reflection) applies in either language. The real benefit of a dynamic language shows up with schema-less data (e.g. JSON, dicts), where one parser and one code path can handle arbitrary structure without codegen.

7. Why Python Still Generates Protobuf Code

For Protobuf, then, Python and C++ are in the same situation. Even though Python is dynamic, code generation still provides:

Faster startup
Precompiled descriptors
Better tooling
Static field access
Version control stability

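Important: Generated Python protobuf code is still dynamic internally. The generated module embeds descriptor metadata, and the message classes are constructed from that metadata when the module is imported.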
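
The idea of a class that is "generated" yet built from metadata at runtime can be sketched with type(). This is not how the Protobuf library is implemented; PERSON_FIELDS and build_message_class are hypothetical names used only to show the pattern.

```python
# Hypothetical descriptor, standing in for the metadata a generated module embeds.
PERSON_FIELDS = {"name": str, "age": int}


def build_message_class(class_name, fields):
    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(fields)
        if unknown:
            raise AttributeError(f"{class_name} has no fields {sorted(unknown)}")
        for field, field_type in fields.items():
            # Unset fields fall back to the type's zero value ("" or 0).
            setattr(self, field, kwargs.get(field, field_type()))

    def __setattr__(self, field, value):
        # Every write is validated against the descriptor at runtime.
        if field not in fields:
            raise AttributeError(f"{class_name} has no field {field!r}")
        if not isinstance(value, fields[field]):
            raise TypeError(f"{field} expects {fields[field].__name__}")
        object.__setattr__(self, field, value)

    namespace = {"__init__": __init__, "__setattr__": __setattr__, "FIELDS": fields}
    return type(class_name, (), namespace)


Person = build_message_class("Person", PERSON_FIELDS)
p = Person(name="Alice")
```

Code using Person reads like code against a static class (p.name, p.age), while the class itself was assembled from a data table at runtime.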

8. Relationship Between Concepts

Concept	Role
Static vs dynamic language	Where type information lives: compile time vs runtime.
Type erasure	Implementation technique for dynamic behavior; dynamic languages use it internally; static languages emulate it via libraries.
Schema	Defines data structure independent of language. JSON: schema optional. Protobuf: schema required.
Reflection	Mechanism to access type metadata at runtime; easier when types are runtime objects.

9. Unified Mental Model

Dynamic language runtime:

Generic object system
- Runtime type metadata
- Dynamic dispatch

Static language runtime:

Compiled machine instructions
- Minimal runtime type info

Libraries like Protobuf or JSON frameworks add dynamic capabilities to static languages by building their own metadata systems.

10. Real-World Engineering Implications

Static-style systems (e.g. real-time software, high-performance pipelines, autonomous driving perception):

Predictable memory layout
Deterministic execution

Dynamic-style systems (e.g. tooling, scripting, simulation, data exploration): even in Python, generated Protobuf code is often preferred when the schema is fixed (see §7). The advantages of dynamic languages, and of schema-less or ad-hoc data, show up when:

Schema is unknown or evolving — exploratory data, ad-hoc APIs, configs, logs: use JSON/dicts without codegen; one parser handles any structure.
Fast iteration matters — scripting, REPL, notebooks, plugins: edit and run without recompile; no build step.
Many heterogeneous shapes — tooling that consumes multiple formats (JSON, YAML, partial or optional Protobuf): a single code path can handle varying structure instead of one generated type per schema.
Performance is secondary — simulation, glue code, data exploration: developer speed and flexibility outweigh the cost of runtime type resolution and less predictable layout.

11. Key Insight (Final Takeaway)

Dynamic vs static is not about the ability to load code at runtime.

Instead:

Dynamic languages treat types as runtime data.
Static languages treat types as compile-time information that shapes the generated instructions.

Everything else — JSON parsing, Protobuf generation, reflection complexity — flows from this difference.