Static vs Dynamic Languages, JSON vs Protobuf, Schema, and Type Erasure
- 1. Overview
- 2. Static vs Dynamic Languages (Deep Runtime Model)
- 3. Dynamic Typing and Type Erasure
- 4. Reflection
- 5. Schema-less vs Schema-driven Data
- 6. JSON vs Protobuf Through Runtime Models
- 7. Why Python Still Generates Protobuf Code
- 8. Relationship Between Concepts
- 9. Unified Mental Model
- 10. Real-World Engineering Implications
- 11. Key Insight (Final Takeaway)
1. Overview
Modern software discussions often mix several related but distinct concepts:
- Static vs dynamic languages — where type information lives (compile time vs runtime)
- Static vs dynamic typing — when type checks happen
- Schema-driven vs schema-less data — whether structure is predefined
- Reflection and runtime type information — inspecting structure at runtime
- Type erasure — hiding multiple types behind one runtime representation
- JSON vs Protobuf — serialization formats with different design philosophies
Dynamic typing vs dynamic language: Dynamic typing is a type-system property: when type checks (or type resolution) happen — at runtime. Variables can refer to values of different types over time; type errors appear when the code runs. Dynamic language is an implementation/runtime property: types exist as runtime data. Values carry type information in memory; the runtime uses it for dispatch and interpretation. In practice, “dynamic typing language” and “dynamic language” refer to the same set of languages (e.g. Python, JavaScript, Ruby). It is correct to say that a dynamically typed language is a dynamic language — they usually go together; the two phrases simply emphasize different aspects (type system vs runtime model).
2. Static vs Dynamic Languages (Deep Runtime Model)
2.1 Static Language (e.g., C++)
Key idea: Types primarily exist at compile time.
Example:
int x = 10;
Compile-time:
- Compiler knows type size and layout.
- Generates machine code specialized for
int.
Runtime:
- Raw memory bytes only. The CPU does not know this is an
int. - Type meaning exists in the compiled instructions, not in runtime objects.
Properties:
- Memory layout fixed
- Operations specialized during compilation
- Minimal runtime metadata
- High performance
2.2 Dynamic Language (e.g., Python)
Key idea: Types exist as runtime objects.
Example:
x = 10
Runtime object (conceptually):
PyObject {
type_pointer -> PyInt_Type
value = 10
}
The runtime stores:
- The value
- Type information
- Behavior metadata
Operations are resolved dynamically: inspect type → dispatch correct operation.
2.3 Core Difference
| Static language | Dynamic language |
|---|---|
| Types guide code generation | Types are runtime data |
3. Dynamic Typing and Type Erasure
3.1 What is Type Erasure?
References (type erasure series):
Type erasure means: multiple types are hidden behind a single runtime representation.
Examples:
- Python objects
std::any,std::variant(C++)nlohmann::json
Example in C++:
nlohmann::json value;
Internally: an enum type tag plus a union for storage. This simulates dynamic typing inside static C++.
3.2 Dynamic Languages Internally Use Type Erasure
Even Python:
- Stores objects behind generic pointers
- Uses runtime type tags
- Performs dynamic dispatch
So: dynamic typing = type erasure + runtime metadata.
4. Reflection
Reflection = ability to inspect structure at runtime.
Dynamic languages: native reflection; types already exist as runtime objects.
type(x)
dir(obj)
Static languages: limited reflection; metadata must be stored manually (e.g. RTTI in C++, Protobuf descriptors, serialization libraries).
5. Schema-less vs Schema-driven Data
5.1 Schema-less (JSON)
JSON does not require a predefined structure.
{"name": "Alice", "age": 30}
The parser builds generic structures (dict, list, numbers). Structure is determined at runtime.
5.2 Schema-driven (Protobuf)
Protobuf requires a .proto schema.
message Person {
string name = 1;
}
Benefits: fixed field IDs, known types, efficient binary encoding.
6. JSON vs Protobuf Through Runtime Models
6.1 JSON Parsing
The advantage of a dynamic typing language is clearest here. To handle JSON in C++ you either: (1) write static code for a specific JSON shape (fixed schema, no flexibility), or (2) use a library that implements a minimal dynamic typing system — e.g. nlohmann::json — so one type can represent any JSON structure; only (2) is a dynamic-style abstraction on top of static C++. In Python, the runtime already has dynamic types and dict/list literals; parsing and operating on arbitrary JSON is native. One parser and one code path handle all possible JSON without an extra “variant container” or codegen — the language itself is the dynamic system. Under the hood, both rely on type erasure: the Python interpreter is itself implemented in a static language (e.g. C), and it uses the same idea — a generic object representation with a type tag — as libraries like nlohmann::json do in C++. So at the implementation level, both are type-erased value types over a static foundation; the difference is that in Python that machinery is built into the language and runtime, whereas in C++ you opt in via a library.
6.2 Protobuf Parsing
Protobuf assumes schema-first design. Two approaches:
- Static (generated code):
.proto→ code generator → language-specific class with compiled-in metadata. Same idea in Python and C++. Benefits: faster access, type safety, IDE support. - Dynamic (reflection-based): Schema descriptors are loaded at runtime; messages are created and manipulated via reflection APIs. In Python this is straightforward because the runtime is already dynamic. In C++ it is possible via
DescriptorPoolandDynamicMessageFactory, but more involved.
Memory representation. The language divide matters here. In Python, both paths produce the same kind of runtime object — ordinary Python objects with attributes — so the underlying representation is the language’s dynamic type system either way. In C++, the two paths differ: static codegen yields a fixed, struct-like layout (known offsets, one type per message), while DynamicMessageFactory uses a separate, flexible representation (e.g. internal structures keyed by field) that can represent any message type. So in C++, static vs dynamic Protobuf implies genuinely different in-memory layouts; in Python it does not.
For schema-driven data, language matters little. To operate on Protobuf data (read/write fields, pass to functions), you either write code against a known structure — which means codegen in both Python and C++ — or you use reflection. Both languages support both options. So for Protobuf specifically, there is no fundamental advantage to using Python over C++; the same choice (codegen vs reflection) applies in either language. The real benefit of a dynamic language shows up with schema-less data (e.g. JSON, dicts), where one parser and one code path can handle arbitrary structure without codegen.
7. Why Python Still Generates Protobuf Code
For Protobuf, then, Python and C++ are in the same situation. Even though Python is dynamic, code generation still provides:
- Faster startup
- Precompiled descriptors
- Better tooling
- Static field access
- Version control stability
Important: Generated Python protobuf is still dynamic internally — it embeds descriptor metadata.
8. Relationship Between Concepts
| Concept | Role |
|---|---|
| Static vs dynamic language | Where type information lives: compile time vs runtime. |
| Type erasure | Implementation technique for dynamic behavior; dynamic languages use it internally; static languages emulate it via libraries. |
| Schema | Defines data structure independent of language. JSON: schema optional. Protobuf: schema required. |
| Reflection | Mechanism to access type metadata at runtime; easier when types are runtime objects. |
9. Unified Mental Model
Dynamic language runtime:
- Generic object system
-
- Runtime type metadata
-
- Dynamic dispatch
Static language runtime:
- Compiled machine instructions
-
- Minimal runtime type info
Libraries like Protobuf or JSON frameworks add dynamic capabilities to static languages by building their own metadata systems.
10. Real-World Engineering Implications
Static-style systems (e.g. real-time software, high-performance pipelines, autonomous driving perception):
- Predictable memory layout
- Deterministic execution
Dynamic-style systems (e.g. tooling, scripting, simulation, data exploration): even in Python, generated Protobuf code is often preferred when the schema is fixed (see §7). The advantages of dynamic languages and schema-less or ad-hoc data show up when:
- Schema is unknown or evolving — exploratory data, ad-hoc APIs, configs, logs: use JSON/dicts without codegen; one parser handles any structure.
- Fast iteration matters — scripting, REPL, notebooks, plugins: edit and run without recompile; no build step.
- Many heterogeneous shapes — tooling that consumes multiple formats (JSON, YAML, partial or optional Protobuf): a single code path can handle varying structure instead of one generated type per schema.
- Performance is secondary — simulation, glue code, data exploration: developer speed and flexibility outweigh the cost of runtime type resolution and less predictable layout.
11. Key Insight (Final Takeaway)
Dynamic vs static is not about the ability to load code at runtime.
Instead:
- Dynamic languages treat types as runtime data.
- Static languages treat types as compile-time instructions.
Everything else — JSON parsing, Protobuf generation, reflection complexity — flows from this difference.