The Origins of the Device Modeling Language (DML)

Jakob_Engblom

As mentioned in a previous blog post about the open-sourcing of the Device Modeling Language (DML), DML 1.0 was released as part of version 3.0 of the Simics® simulator in 2005. However, that was the end of a process that began as early as 2001. Digging back in the archives, it turns out that the original plan for what would later become DML is still there. It is quite instructive to look at the initial plan and compare it to what was eventually shipped.

The Original Concept: DevGen

The original concept was called Device Generator or “DevGen”, analogous to the SimGen system used to generate instruction set simulators for the Simics simulator. It was becoming clear that device models involved some recurring chores that could be handled more efficiently.

For example, checkpointing of device state was almost always broken since modelers had to code the attributes by hand for each and every register. Each attribute requires quite a few lines of boring boilerplate code to declare it and provide get and set functions. This kind of code can obviously be generated and should be generated (see below for a detailed example).

The original plan, as saved in the source code archives, looked like this (the contents of the actual text document):

Generate code for register access:
----------------------------------
M - Mandatory
O - Optional

* A register needs at least one fields.

* For a register bank specify:
+ [M] Name
+ [M] Overlapping access to register ok?

* For a register specify:
+ [M] Name
+ [M] Address (offset)
+ [M] Size
+ [O] Partial access ok?
+ [O] Don't generate access code
+ [O] Configuration attribute?
+ [O] Debug - category (new), level, string, log always/changes

* For a register field specify: (simplify if only field in register?)
+ [M] Name
+ [M] Position in register
+ [M] RO/RW/WO/W1C/CoR/Trig/...
+ [M] Reset value (hard, soft)
+ [O] Configuration attribute support?
+ [O] Debug - category (new), level, string, log always/changes
+ [O] Way to specify conditional RO
+ [O] Field that must be written with specified value (0 or 1 typically).
+ [O] Signed

* How to support split fields? (including fields with parts in different regs).

* Support special endian handling?

* Some way to specify actions that are triggered by a register access.

Generated code include:
* read/write access code for registers
* checks to verify that accesses are ok (warn if not)
* debug_log() calls
* configuration attribute get/set functions
* Register reset function.

More generated code:
--------------------

* Specify "interfaces" to implement, and get empty functions.

* init_local() with interface, class, and attribute registrations.

* Generated code should allow C inheritance (used by pci devices).
Perhaps also support it? Cleaner when used with Mathilda.

* Get/set macros for registers and register fields.

Other features:
---------------

* Delayed assignments. Example: "irq_line = 1 after 10ms;"
Implement by hiding event queue code.

* "Script thread" equivalent. I.e. wait for some event to occur
in the middle of sequential code. Could be difficult to
implement properly (must work with checkpointing, etc).

* Should devgen know about Simics PCI system?

* Support to make state-machine writing simpler.
+ Verify illegal state changes.

The plan is also available on the open DML wiki.

 

From Generator to Language  

The idea was to build a code generator. It would read a specification file and generate C code for use in a model. A specification file format is not necessarily the same as a language, and the plan never says “domain-specific language” or “language” at all. The value is obvious in that a lot of boring boilerplate code could be generated, and some smarts could be applied to generating register decoders. In the end, the modeler would add manually written C code to the generated code to complete the model.

However… while working on the plan, it turned out that a language was hiding in the problem statement. According to those present at the time, the following line in the specification set the process off in the “language” direction:

* Some way to specify actions that are triggered by a register access.

Designing a solution to the problem of specifying actions led to the definition of a language. Once that happened, the nature of the design space changed completely.

In a code generator, adding new modeling capabilities typically requires adding additional keywords or point features in the frontend. The code generator needs to be changed every time. In contrast, in a language, features can often be implemented and added as libraries expressed within the language. Such libraries are much easier to evolve than the language definition and compiler. Some features will require language changes, but such changes become much less prevalent.

For example, a code generator as originally envisioned would have a list of standard register behaviors like read-only, read-write, write-only, and write-1-clears. Each such behavior would naively be represented as a case in the code generator source code. From the specification above:

+ [M] RO/RW/WO/W1C/CoR/Trig/...

In the device modeling language that resulted, register behaviors are provided as libraries using the DML template mechanism. This makes the behaviors easy to change and extend. It also lets users extend the functionality themselves by defining their own templates, often by extending existing templates. With a code generator, such additions would instead require forking or patching the code generator – which takes time and also creates a maintenance nightmare.
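To make the contrast concrete, here is a minimal sketch in plain C (not DML, and not Simics code) of what a few of the listed access types mean for a write. All names (`write_behavior_t`, `write_rw`, `write_ro`, `write_w1c`, `reg_write`) are hypothetical. Each behavior is an ordinary function rather than a case inside a generator, which loosely mirrors the library idea: adding a behavior means writing one more function, not changing the tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A write behavior maps (current value, written value) to the new value. */
typedef uint32_t (*write_behavior_t)(uint32_t current, uint32_t written);

/* RW: the written value replaces the current value. */
static uint32_t write_rw(uint32_t current, uint32_t written)
{
    (void)current;
    return written;
}

/* RO: writes are ignored and the current value is kept. */
static uint32_t write_ro(uint32_t current, uint32_t written)
{
    (void)written;
    return current;
}

/* W1C: every bit written as 1 is cleared; bits written as 0 are kept. */
static uint32_t write_w1c(uint32_t current, uint32_t written)
{
    return current & ~written;
}

/* Apply a behavior to a register; the register code does not care which. */
static void reg_write(uint32_t *reg, write_behavior_t behavior,
                      uint32_t written)
{
    assert(reg != NULL && behavior != NULL);
    *reg = behavior(*reg, written);
}
```

A user-defined behavior, such as a write that only accepts even values, would be one more function with the same signature passed to `reg_write`.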

It is my belief that, beyond some level of capability, a custom language generally results in a smaller and simpler solution than a supposedly simple code generator tool.

How does current DML compare to the original specification? Most of the requirements above ended up being implemented in the device modeling language that emerged. It is a bit interesting to look through and consider how things went.

 

Banks in DML

For register banks, the plan had two bullet points:

* For a register bank specify:
+ [M] Name
+ [M] Overlapping access to register ok?

Both these points and a lot more were implemented in DML banks.  For example, in addition to just specifying the bank name and overlap, the DML standard library allows a user to specify a description string (for documentation and help), the default register size for the bank, and the endianness and bit-numbering order used. These are all expressed as DML parameters, leaving the implementation to the DML standard library.

An example of a bank header with some param declarations is shown below. In most cases, default values from standard templates or values set at the device level can be used:

bank ctrl "control registers" {
    param documentation = "Control register bank";
    param register_size = 8;
    param overlapping = false;       
    param byte_order = "little-endian";
...

In addition, the standard bank template provides the ability to override all memory accesses. These override mechanisms mean that DML can be used to implement cases that do not fit the mainstream “set of registers” model, such as forwarding all incoming memory accesses to another bank or device.

Registers in DML

The DevGen plans called for registers and fields to have the following properties and behaviors:

* For a register specify:
+ [M] Name
+ [M] Address (offset)
+ [M] Size
+ [O] Partial access ok?
+ [O] Don't generate access code
+ [O] Configuration attribute?
+ [O] Debug - category (new), level, string, log always/changes

* For a register field specify: (simplify if only field in register?)
+ [M] Name
+ [M] Position in register
+ [M] RO/RW/WO/W1C/CoR/Trig/...
+ [M] Reset value (hard, soft)
+ [O] Configuration attribute support?
+ [O] Debug - category (new), level, string, log always/changes
+ [O] Way to specify conditional RO
+ [O] Field that must be written with specified value (0 or 1 typically).
+ [O] Signed


* Some way to specify actions that are triggered by a register access.

Generated code include:
* read/write access code for registers
* checks to verify that accesses are ok (warn if not)
* debug_log() calls
* configuration attribute get/set functions
* Register reset function.

* Get/set macros for registers and register fields.

DML register and field objects implement almost all the above features. The features are implemented in the DML standard library using a mix of default behaviors controlled by param values and overridable predefined methods. Attributes are automatically generated for registers to provide checkpointing and inspection (covering “get/set macros”). The standard register access code supports logging of device accesses, covering the debug aspects.

One aspect that did not end up being implemented was having values other than unsigned integers in registers and fields. Instead, the unsigned integer representation is seen as providing a set of bits that the code in the device can interpret as arbitrary types using casts and arithmetic as needed.

The requirement of “don’t generate access code” is an interesting one – something has to happen on a register access. Presumably it meant that the code generator would leave it entirely to the programmer to handle. In DML you can choose to handle a register access in any way you want, including capturing accesses at the bank level. Another example of providing non-default behavior is to declare a register as “pseudo”, which exempts it from checkpointing.

Cases like “conditional read-only” can be handled in the code implementing accesses for a register. This is very easy to write in code, but quite hard to express as a general language construct or even in the standard library.
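As a rough illustration of why this is easy in code, here is a plain C sketch (again not DML) of a register that becomes read-only while a lock bit is set. The device layout, the `CTRL_LOCK` bit, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTRL_LOCK 0x1u /* hypothetical lock bit in the ctrl register */

typedef struct {
    uint32_t ctrl;   /* bit 0: LOCK */
    uint32_t config; /* writable only while LOCK is clear */
} device_t;

/* The condition is ordinary code, so it can depend on any device state. */
static bool config_is_read_only(const device_t *dev)
{
    return (dev->ctrl & CTRL_LOCK) != 0;
}

/* Returns true if the write took effect, false if it was dropped. */
static bool write_config(device_t *dev, uint32_t value)
{
    assert(dev != NULL);
    if (config_is_read_only(dev))
        return false;
    dev->config = value;
    return true;
}
```

The condition here is a single bit, but it could just as well involve several registers or a mode variable, which is what makes a general declarative construct hard to design.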

Reset was envisioned to encompass “hard” and “soft” reset, and that is still how it is handled in the DML standard library. When more detailed reset handling is needed, modeling teams can simply add their own templates on top of the standard templates (and use their custom template when declaring registers). 

 

More Features in DML

Most of the other features listed have indeed been implemented.

* Specify "interfaces" to implement, and get empty functions.

This turned into the DML port and implements mechanism, as well as the connect objects used to call into ports. Very handy.

 * init_local() with interface, class, and attribute registrations.

The entire initialization and registration system of Simics is hidden in DML, and the programmer should never have to see it. If an attribute is declared, the code to register it is automatically added to the device initialization code (see example below).

* Generated code should allow C inheritance (used by pci devices).
  Perhaps also support it? Cleaner when used with Mathilda.

* Should devgen know about Simics PCI system?

The change from DevGen to DML had a significant impact on these two points. In DML, the Simics device data structure is generated and there is no need to extend it from C code (which would presumably be the use-case for “C inheritance”). When it comes to PCI (and later PCIe), the standard Simics simulator modeling libraries for PCI and PCIe are written in DML (clearly demonstrating the power of DML to do more than simply replace a code generator). “Mathilda” appears to be the code name for a new scripting language that never made it into the product.

* Delayed assignments. Example: "irq_line = 1 after 10ms;"
  Implement by hiding event queue code.

Delayed assignment turned into general-purpose event callbacks. Current DML after statements specify a certain delay in seconds or cycles, and then a method to call alongside any arguments to the method. For example:

    after delay s: compute_operation_complete(d);

Behind the scenes, this does indeed use the Simics simulator event system. The code generated automatically supports the checkpointing of outstanding events (see the Design Automation Conference 2022 presentation on checkpointing for more information).
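To give a feel for what “hiding event queue code” involves, here is a toy event queue in plain C. It is only a sketch under simplifying assumptions: the fixed-size array, `post_after`, `run_until`, and `set_irq_line` are hypothetical names, and the real Simics event system (including its checkpointing support) is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 8

typedef void (*callback_t)(int arg);

typedef struct {
    double     when; /* absolute time in seconds */
    callback_t cb;
    int        arg;
} event_t;

typedef struct {
    double  now;
    size_t  count;
    event_t events[MAX_EVENTS];
} event_queue_t;

/* Post cb(arg) to run `delay` seconds after the current time. */
static void post_after(event_queue_t *q, double delay, callback_t cb, int arg)
{
    assert(q->count < MAX_EVENTS && delay >= 0.0 && cb != NULL);
    q->events[q->count++] = (event_t){ q->now + delay, cb, arg };
}

/* Advance time to `until`, firing all due events in time order. */
static void run_until(event_queue_t *q, double until)
{
    assert(until >= q->now);
    for (;;) {
        size_t best = q->count; /* index of earliest due event, if any */
        for (size_t i = 0; i < q->count; i++) {
            if (q->events[i].when <= until &&
                (best == q->count ||
                 q->events[i].when < q->events[best].when))
                best = i;
        }
        if (best == q->count)
            break;
        event_t ev = q->events[best];
        q->events[best] = q->events[--q->count]; /* remove from queue */
        q->now = ev.when;
        ev.cb(ev.arg);
    }
    q->now = until;
}

/* The DevGen example "irq_line = 1 after 10ms" expressed as a callback. */
static int irq_line = 0;
static void set_irq_line(int level) { irq_line = level; }
```

With this in place, the delayed assignment from the plan becomes `post_after(&q, 0.010, set_irq_line, 1);`, and the assignment happens once `run_until` has advanced time past the 10 ms mark.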

 

Examples of Generating Simics Simulator Framework Code

An important initial impetus for DevGen was to get rid of the need to write tons of boring boilerplate code when creating a Simics device model. In early marketing for DML, we used this as part of the value proposition (“code 75% smaller than manual C code”). It is worth digging in a bit more on this aspect of code generation.

Take attributes as an example. The C code needed to declare an attribute in the Simics simulator API is not hard to write, but it screams out for automatic generation. Look at the source code for the sample-device-c module found in the Simics simulator base package (part of the public release). First, the code needs to declare storage for the attribute value in the simulation object structure:

typedef struct {
    /* Simics configuration object */
    conf_object_t obj;

    /* Value for example attribute */
    uint64_t value;
} sample_device_t;

Second, it has to declare the get and set functions that convert between the internal storage and the external view:

static set_error_t
set_value_attribute(conf_object_t *obj, attr_value_t *val)
{
    sample_device_t *sample = (sample_device_t *)obj;
    sample->value = SIM_attr_integer(*val);
    return Sim_Set_Ok;
}

static attr_value_t
get_value_attribute(conf_object_t *obj)
{
    sample_device_t *sample = (sample_device_t *)obj;
    return SIM_make_attr_uint64(sample->value);
}

Third, the attribute must be registered with the simulator framework in the init_local function called when the module is loaded. The registration provides the name, help text, and user-facing type of the attribute, and binds the get and set functions to the attribute:

void
init_local(void)
{
    // Register the class, we do not show the contents of the "funcs"
    // data structure here for brevity
    conf_class_t *class = SIM_create_class("sample-device-c", &funcs);
    //...
    SIM_register_attribute(class, "value",
                           get_value_attribute, set_value_attribute,
                           Sim_Attr_Optional, "i",
                           "The <i>value</i> field.");
    //...
}
It is pretty obvious that this can be easily generated from a higher-level description of the attribute. DML provides this automation for explicitly declared attributes, as well as the attributes that are automatically created for all device registers and saved variables.
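As a rough illustration of generating such code from a higher-level description, the plain C sketch below uses an X-macro list as the “description” and lets the preprocessor produce the accessors and a name-to-function table. It deliberately does not use the Simics API: `attribute_t`, `find_attribute`, and the second attribute `counter` are hypothetical stand-ins, and DML does this work in its compiler rather than in the C preprocessor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in device structure; "counter" is a hypothetical second attribute. */
typedef struct {
    uint64_t value;
    uint64_t counter;
} sample_device_t;

/* The higher-level description: one entry per attribute. */
#define ATTRIBUTES(X) X(value) X(counter)

/* Generate a getter and a setter for each listed attribute. */
#define DEFINE_ACCESSORS(attr) \
    static uint64_t get_##attr(const sample_device_t *dev) \
    { return dev->attr; } \
    static void set_##attr(sample_device_t *dev, uint64_t v) \
    { dev->attr = v; }
ATTRIBUTES(DEFINE_ACCESSORS)

/* Generate the table that binds attribute names to the accessors. */
typedef struct {
    const char *name;
    uint64_t (*get)(const sample_device_t *dev);
    void (*set)(sample_device_t *dev, uint64_t v);
} attribute_t;

#define TABLE_ENTRY(attr) { #attr, get_##attr, set_##attr },
static const attribute_t attributes[] = { ATTRIBUTES(TABLE_ENTRY) };

/* Look up an attribute by name; returns NULL if it does not exist. */
static const attribute_t *find_attribute(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof attributes / sizeof attributes[0]; i++) {
        if (strcmp(attributes[i].name, name) == 0)
            return &attributes[i];
    }
    return NULL;
}
```

Adding an attribute is then a one-word change to `ATTRIBUTES`, which is the property the hand-written version lacks: there, the storage, the two functions, and the registration call all have to be kept in sync manually.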

 

Ideas Not Implemented

A couple of things from the DevGen plan are still on the to-do list.  

* "Script thread" equivalent. I.e. wait for some event to occur
in the middle of sequential code. Could be difficult to
implement properly (must work with checkpointing, etc).

It is indeed hard to implement threads or coroutines properly with checkpointing in mind. Instead, DML code is explicitly event-driven, which provides simpler and cleaner state handling. It should be noted that the Simics simulator itself did add script-level threads (“script branches”) later on. Scripts represent a different design space, where checkpointing is not needed and coding in terms of callbacks is much more cumbersome.

* Support to make state-machine writing simpler.
+ Verify illegal state changes.

Dedicated state-machine support has not been implemented in DML just yet. State machines are common in DML models, but they are implemented using the language as it exists. Adding dedicated language constructs requires careful design and balancing – they would have to cover all existing use cases and make the code significantly simpler and cleaner to be worth the effort.

 

Split Fields in DML

Another idea that ended up not being implemented is “split fields”. It is an interesting concept that deserves a bit of a deep dive. The plan puts it like this:

* How to support split fields? (including fields with parts in different regs).

It refers to hardware designs where what could be considered as a “single value” is physically split across multiple registers or possibly multiple disjoint fields in a single register. While it might sound odd to a software person, this case had been encountered several times already in 2001. The problem was common enough to warrant consideration.

One example noticed at the time was the classic VGA (Video Graphics Array) graphics adapter. The hardware design harks back to the 1980s and has several cases where a single value is built up from multiple registers (in part due to registers being quite small, and in part due to successive extensions to the functionality over generations of hardware). For illustration, here is a concrete example:

Image from the Intel® OpenSource HD Graphics Programmer’s Reference Manual (PRM) Volume 3 Part 1: Display Registers – VGA Registers (SandyBridge)

 

That is, when the hardware constructs an address to access, four bits from register GR10 and seven bits from GR11 are combined to form the complete “page select” value. The idea for DevGen seems to have been to provide an explicit way to define the “page select” value as a combination of bits from multiple registers. That sounds complicated to build into a code generator, while the case is not really all that hard to handle in a programming language. In DML, the value is instead computed when it is needed by reading both registers.

The code looks something like this:

bank b {
    // Example of a split field from VGA
    register gr10 size 1 @ 0x10 "address mapping" {
        field page_select_extension @ [7:4];
        field io_map_enable @ [3];
        field paging_target @ [2:1];
        field page_mapping @ [0];
    }
    register gr11 size 1 @ 0x11 "page selector" {
        field page_select @ [6:0];
    }

    // Register to demonstrate concatenation
    register vga_address_dummy size 4 @ 0x14 is (read_only, read, get)
        "dummy register to demonstrate field value concatenation" {
        // No value of its own – exclude from checkpointing with "pseudo"
        param configuration = "pseudo";
        method get() -> (uint64) {
            return (gr10.page_select_extension.val << 24) +
                   (gr11.page_select.val << 17);
        }
    }
}

This example also shows how to exclude a register from checkpointing (using param configuration = "pseudo") and how to use multiple templates on the same register to produce a combined behavior with “is (read_only, read, get)”.
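For readers more at home in C, the same computation on raw register bytes can be sketched as follows. The helper names `bits` and `vga_page_address` are hypothetical; the bit positions and shift amounts are taken directly from the DML code above.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [msb:lsb] from an 8-bit register value. */
static uint32_t bits(uint8_t reg, unsigned msb, unsigned lsb)
{
    assert(msb >= lsb && msb < 8);
    return ((uint32_t)reg >> lsb) & ((1u << (msb - lsb + 1)) - 1u);
}

/* Combine GR10[7:4] and GR11[6:0] with the same shifts as the DML example. */
static uint32_t vga_page_address(uint8_t gr10, uint8_t gr11)
{
    uint32_t page_select_extension = bits(gr10, 7, 4);
    uint32_t page_select = bits(gr11, 6, 0);
    return (page_select_extension << 24) + (page_select << 17);
}
```

For example, `vga_page_address(0xF0, 0x7F)` evaluates to `0x0FFE0000`: the four extension bits land in bits 27:24 and the seven page select bits in bits 23:17.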

 

Repetitions Missing?

There is one notable aspect missing from the DevGen plan that was added in DML – repeating patterns. It is very common that register banks contain multiple copies of the same set of registers, such as the set of control registers for each channel in a device with multiple channels. Another example is an array of registers that work like read-write memory. Expressing such recurrences and arrays by “unrolling” and declaring each register individually would result in bloated DML code and huge generated C code.

Instead, DML uses array-style constructions to declare repeated registers and sets of registers. For example, the DML code below creates a set of six registers repeated ten times. By using a template, the internal layout of the register group is separated from the assignment of locations in the register bank:

template foo_grouping {
    param base_offset default 0;
    register a @ base_offset "a register called a";
    register b @ base_offset + 4 "a register called b";
    register c [j<4] @ base_offset + 8 + j*4 "a small array of registers";
}

bank regs {
    param register_size = 4;
    // Array of the group
    group foo [i<10] is foo_grouping {
        param base_offset = 0x1000 + i*24;
    }
}

Such declarations are more convenient to write and make it easier for the DML compiler to generate good and compact code. When machine-readable specifications are used to generate the register layouts, it is important to use a source that contains the pattern and not just an unrolled representation.

The code above results in a set of registers that look like this:

# Interactive inspection from Simics simulator command-line
simics> print-device-regs t.dev.bank.regs pattern = "foo*"
Offset  Name         Size  Value
---------------------------------
0x1000  foo[0].a        4  0x0000
0x1004  foo[0].b        4  0x0000
0x1008  foo[0].c[0]     4  0x0000
0x100c  foo[0].c[1]     4  0x0000
0x1010  foo[0].c[2]     4  0x0000
0x1014  foo[0].c[3]     4  0x0000
0x1018  foo[1].a        4  0x0000
...
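The offsets in the listing follow directly from the two offset expressions in the DML code. A small C sketch of the same arithmetic is shown below; the function and constant names are hypothetical, while the numbers (base 0x1000, a stride of 24 bytes per group, 4-byte registers) come from the DML example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    GROUP_COUNT  = 10, /* group foo [i<10] */
    GROUP_STRIDE = 24, /* six 4-byte registers per group */
    C_COUNT      = 4,  /* register c [j<4] */
    REG_SIZE     = 4   /* param register_size = 4 */
};

/* base_offset = 0x1000 + i*24 */
static uint32_t foo_base(unsigned i)
{
    assert(i < GROUP_COUNT);
    return 0x1000u + i * GROUP_STRIDE;
}

/* register a @ base_offset */
static uint32_t foo_a(unsigned i) { return foo_base(i); }

/* register b @ base_offset + 4 */
static uint32_t foo_b(unsigned i) { return foo_base(i) + 4u; }

/* register c [j<4] @ base_offset + 8 + j*4 */
static uint32_t foo_c(unsigned i, unsigned j)
{
    assert(j < C_COUNT);
    return foo_base(i) + 8u + j * REG_SIZE;
}
```

Here `foo_c(0, 3)` gives 0x1014 and `foo_a(1)` gives 0x1018, matching the listing, and the last register in the bank, `foo_c(9, 3)`, ends up at 0x10ec.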

Final Thoughts

Looking back at the DevGen plan and comparing it to the DML we have today shows that the basics of device modeling have proven remarkably stable over time. What has changed is the sheer size of devices and the sophistication of the machinery needed to code them.

The original DevGen plan reflected a time when a complex device might have a few dozen registers and was mostly written by hand. Today, devices often have thousands of registers expressed in generated code and using layers of custom templates. DML has evolved to handle this while keeping simple things simple. If we did not have DML, we would have had to invent it.

About the Author
Jakob Engblom works at Intel in Stockholm, Sweden, as director of simulation technology ecosystem. He has been working with virtual platforms and full-system simulation in support of software and system acceleration, development, debug, and test since 2002. At Intel, he helps users and customers succeed with the Intel Simics Simulator and related simulation technologies - with everything from packaged training to hands-on coding and solution architecture. He participates in the ecosystem for virtual platforms including in Accellera workgroups and the DVCon Europe conference.