chore: Automate data model generation from upcoming CycloneDX 2.0 modularized specification


### Description  

Currently, the data models in this library are largely written and maintained manually. While this approach has worked so far, it is time-consuming and requires significant effort for both implementation and review. This effort could be better invested in feature development and bug fixing.

With the upcoming CycloneDX 2.0 specification, a **modularized** and machine-readable format will be introduced. This presents an opportunity to rethink how data models are created and maintained in this project.

Reference (work in progress): 
- PR <https://github.com/CycloneDX/cyclonedx-python-lib/issues>
- modularized schema <https://github.com/CycloneDX/specification/tree/2.0-dev/schema/2.0/model>

---

### Problem  

- Data models are mostly handwritten  
- High maintenance overhead  
- Repetitive work for contributors  
- Slows down development velocity due to review effort  

---

### Proposal  

Leverage the machine-readable specification planned for CycloneDX 2.0 to introduce **static code generation** for data models.

This would involve:  

- Parsing the official CycloneDX specification (once available in machine-readable form)  
- Generating Python data models automatically  
- Integrating generation into the build or release process  
- Minimizing manual intervention for future spec updates  

There have already been proof-of-concept implementations demonstrating that automated generation of data models from the specification is feasible. These approaches should be revisited, consolidated, and applied as part of this effort.


Pipeline:

```text
CycloneDX JSON Schema
        ↓
Preprocessing (if needed)
        ↓
Code Generation (datamodel-code-generator)
        ↓
Post-processing (formatting, adjustments)
        ↓
Generated Python Models
```
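To make the "Code Generation" step of the pipeline concrete, here is a toy sketch of static code generation: it turns a small JSON Schema object into Python dataclass source text. It is deliberately not datamodel-code-generator and does not reflect that tool's API; the schema fragment, `py_type`, and `generate_dataclass` are hypothetical and far simpler than the real CycloneDX 2.0 schema would require (no `$ref`, enums, or nested objects).

```python
from typing import Any

# Hypothetical, heavily simplified schema fragment (NOT the real CycloneDX schema).
TOY_SCHEMA: dict[str, Any] = {
    "title": "Component",
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "version": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}

_JSON_TO_PY = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def py_type(prop: dict[str, Any]) -> str:
    """Map a (very small) subset of JSON Schema types to Python type names."""
    if prop.get("type") == "array":
        return f"list[{py_type(prop.get('items', {}))}]"
    return _JSON_TO_PY.get(prop.get("type"), "object")


def generate_dataclass(schema: dict[str, Any]) -> str:
    """Render dataclass source code for a flat JSON Schema object."""
    required = set(schema.get("required", []))
    props = schema.get("properties", {})
    # Required fields go first: fields without defaults must precede defaulted ones.
    ordered = sorted(props.items(), key=lambda item: item[0] not in required)
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "",
        "@dataclass",
        f"class {schema['title']}:",
    ]
    for name, prop in ordered:
        if name in required:
            lines.append(f"    {name}: {py_type(prop)}")
        else:
            lines.append(f"    {name}: Optional[{py_type(prop)}] = None")
    return "\n".join(lines) + "\n"


source = generate_dataclass(TOY_SCHEMA)

# In a real pipeline the source would be written to a file and committed;
# exec() is used here only to show that the generated text is valid Python.
namespace: dict[str, Any] = {"__name__": "generated_models"}
exec(source, namespace)
Component = namespace["Component"]
component = Component(name="example-lib", version="1.0.0")
```

A production setup would delegate this step to an established generator and keep only the preprocessing and post-processing glue in this repository.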

---

### Possible Tools / Libraries  

The following tools could be evaluated as part of this effort:

- datamodel-code-generator — MIT  
  https://pypi.org/project/datamodel-code-generator/  

- pydantic — MIT  
  https://pypi.org/project/pydantic/  

- dataclasses-json — MIT  
  https://pypi.org/project/dataclasses-json/  

- dacite — MIT  
  https://pypi.org/project/dacite/  

- marshmallow — MIT  
  https://pypi.org/project/marshmallow/  

- marshmallow-jsonschema — MIT  
  https://pypi.org/project/marshmallow-jsonschema/  

- jsonschema (validation, not models) — MIT  
  https://pypi.org/project/jsonschema/  

- quicktype — Apache 2.0  
  https://pypi.org/project/quicktype/  

- genson (schema generator, reverse direction) — MIT  
  https://pypi.org/project/genson/  

---

### Community Input  

Community discussions have already suggested evaluating tools such as:

- datamodel-code-generator  
- json-schema-to-pydantic (https://pypi.org/project/json-schema-to-pydantic/)  
- jambo (https://pypi.org/project/jambo/)  

In addition, for de/serialization:

- cattrs (https://pypi.org/project/cattrs/)
  - example: https://github.com/CycloneDX/cyclonedx-python-lib/pull/934

These should be considered as primary candidates during evaluation.

See discussions:
- https://cyclonedx.slack.com/archives/CVA0QJEVA/p1769793218405249

---

### Expected Benefits  

- Significant reduction in maintenance effort  
- Improved consistency across models  
- Faster adoption of new specification versions  
- More time available for feature development and bug fixing  

---

### Considerations / Open Questions  

- What format will the machine-readable spec be published in (e.g., JSON Schema, OpenAPI, etc.)?  
  - Decision: JSON Schema.
- Should generated code be committed or generated at build time?  
  - Decision: generated before build time and committed to the repository.
- How to handle custom logic or extensions on top of generated models?  
- Backward compatibility with CycloneDX 1.x  
  - Easy path: make a breaking change in the library and support only 2.0 from then on.
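For the open question about custom logic on top of generated models, one possible pattern is to keep generated classes untouched and put handwritten behaviour in subclasses that live in a separate module, so regeneration never overwrites it. This is only a sketch of that idea; `GeneratedComponent`, `Component`, and `display_name` are hypothetical names, and the project may well choose a different approach (mixins, composition, or generator templates).

```python
from dataclasses import dataclass
from typing import Optional


# --- Stand-in for a generated module (hypothetical; would be overwritten on regeneration) ---
@dataclass
class GeneratedComponent:
    name: str
    version: Optional[str] = None


# --- Handwritten module (never touched by the generator) ---
@dataclass
class Component(GeneratedComponent):
    """Adds convenience behaviour without editing generated code."""

    def display_name(self) -> str:
        # Append the version only when one is set.
        return f"{self.name}@{self.version}" if self.version else self.name
```

For example, `Component("example-lib", "1.0.0").display_name()` yields `"example-lib@1.0.0"`, while the fields and their defaults continue to come from the generated base class.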

---

### Additional Context  

This proposal aligns with the direction of CycloneDX 2.0, which aims to make the specification more modular and tooling-friendly. Taking advantage of this early could significantly improve long-term maintainability of this library.

---

**Note:** This issue is intended as a meta-ticket to collect related subtasks and track overall implementation efforts.
