Schema for XML device files
Hi, in an effort to make xpcc build-system independent, I had a look at the platform generation and therefore the XML device files. To understand how they are set up, I wrote a schema against which all of them can be validated: https://github.com/dergraaf/modm-platform/blob/master/tools/device/device/re... The schema could be refined a bit more (not only string types for attributes), but it should be a good starting point. Is there any documentation available on what the different attributes mean? Especially for the attributes marked as optional, the use differs quite a lot between the different architectures. Cheers, Fabian
Hi,
in an effort to make xpcc build-system independent, I had a look at the platform generation and therefore the XML device files. To understand how they are set up, I wrote a schema against which all of them can be validated:
https://github.com/dergraaf/modm-platform/blob/master/tools/device/device/re...
The schema could be refined a bit more (not only string types for attributes), but it should be a good starting point.
Awesome!
Is there any documentation available on what the different attributes mean? Especially for the attributes marked as optional, the use differs quite a lot between the different architectures.
Do you mean the `type` vs. `pin-id` difference between AVR and STM32? The only documentation we have on the device files is this here: https://github.com/salkinium/xpcc/blob/develop/src/xpcc/architecture/platfor... (linkerscript element is deprecated now) Cheers, Niklas PS: HexadezimalType with a z?
Hi Niklas,
Is there any documentation available on what the different attributes mean? Especially for the attributes marked as optional, the use differs quite a lot between the different architectures.
Do you mean the `type` vs. `pin-id` difference between AVR and STM32?
E.g. "pcint", "extint", "func", "bits" and "define" in the <gpio> tag.
The only documentation we have on the device files is this here: https://github.com/salkinium/xpcc/blob/develop/src/xpcc/architecture/platfor... (linkerscript element is deprecated now)
Great, that helps a lot! Cheers, Fabian
PS: HexadezimalType with a z?
Um, you saw nothing... *pushes* ...what are you talking about?
Hi,
Is there any documentation available on what the different attributes mean? Especially for the attributes marked as optional, the use differs quite a lot between the different architectures.
Do you mean the `type` vs. `pin-id` difference between AVR and STM32?
E.g. "pcint", "extint", "func", "bits" and "define" in the <gpio> tag.
The driver specific data does not really follow a common schema, since the data is architecture and sometimes even series specific (like GPIO data on STM32F1). There are some lax agreements on naming things, like the `port` and `id` (which should actually be `pin`?), and the `af` element, but until now there wasn't a way to formally specify and enforce that. So I think your schema work is very important and attacks a root problem of the device files. Could we have a base schema and then "specialize" that for the architectures? An alternative may be to place a schema with a driver and load that dynamically. But I really dislike that, because there is information that crosses driver boundaries, like the DMA interconnect data. There are also drivers that need to know about other driver data (like the GPIO needs to know how many ADCs there are [1]) which means in future I'd like to pass the entire device file to all drivers, instead of just their driver subtree. Niklas PS: Please feel free to ignore the LPC device files (`func`, `bits`). They don't have any future, since we've found no data source to generate them from. [1]: https://github.com/roboterclubaachen/xpcc/blob/develop/src/xpcc/architecture...
Hi,

After the verification of the device files, next step: enumerate all the devices!

Where should the naming schema of the devices be stored? At the moment there are two places:

- platform_tools.py
  Defines how the device name is mapped to the name of the XML device file. This is used to find the matching XML file.
- device_identifier.py
  Separates the device name into platform, family etc.

Using the name of the device file makes the definitions inside the XML file redundant. I'm not sure whether those are actually used anywhere.

I would like to make the platform generator independent of the actual device being built, or at least reduce the number of places that need to be edited to support a new device/family.

One possibility would be to store the naming schema inside the device file. That would require parsing all available device files to find the matching one. Then again, it would allow removing the big chunk of case statements in device_identifier.py, and supporting a new device would only require adding a new device file without touching the Python files.

Something like (for stm32):

    <device-name>${platform}f${name}${pin_id}${size_id}</device-name>

or (at90)

    <device-name>${family}${type}${name}</device-name>

or (atmega)

    <device-name>at${family}${name}${type}</device-name>

Interestingly, name and type swap between at90 and atmega.

What do you think?

Btw. according to the device files xpcc supports 573 different targets at the moment :)
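The `${...}` placeholders in the proposed `<device-name>` patterns happen to match the syntax of Python's `string.Template`, so expanding a pattern into device names could look roughly like the sketch below. The function name `expand_device_names` and the attribute values are made up for illustration and are not taken from xpcc.

```python
from itertools import product
from string import Template


def expand_device_names(pattern, attributes):
    """Expand a naming pattern into every device name it describes.

    `attributes` maps each placeholder name to the list of values it may take.
    """
    keys = sorted(attributes)
    names = []
    for combination in product(*(attributes[key] for key in keys)):
        values = dict(zip(keys, combination))
        names.append(Template(pattern).substitute(values))
    return names


# Hypothetical attribute values in the style of an STM32 device file.
stm32_names = expand_device_names(
    "${platform}f${name}${pin_id}${size_id}",
    {"platform": ["stm32"], "name": ["405", "407"],
     "pin_id": ["v", "z"], "size_id": ["g"]},
)

# The atmega pattern puts `name` before `type`, unlike the at90 pattern.
atmega_names = expand_device_names(
    "at${family}${name}${type}",
    {"family": ["mega"], "name": ["328"], "type": ["", "p"]},
)
```

Because the pattern lives next to the data, supporting a new family in this sketch only means supplying a new pattern string and no new Python code.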
Could we have a base schema and then "specialize" that for the architectures?
That is possible, but a bit difficult. Inheritance is supported on tag level, but not on schema level. Because of that, most of the schema would have to be rewritten. Cheers, Fabian PS: (...)/platform/devices/avr/at90646_647_1286_1287-usb.xml looks broken with flash, ram and eeprom being defined multiple times with different values without any selector.
Hi,
Using the name of the device file makes the definitions inside the XML file redundant. I'm actually not sure whether those are actually used somewhere.
Yes. It can be removed. There was a half-hearted attempt to "verify" the device data, but it's rather pointless. https://github.com/roboterclubaachen/xpcc/blob/7b32f74cc43c7bfaf815287902ef2...
One possibility would be to store the naming schema inside the device file. […] What do you think?
Not too fond of using XML beyond the device files.

Counter offer: Let's put a `platform.py` with a bunch of (stateless) callbacks invoked by the library builder in the folder with the device files. It would contain:

1. a function that splits the device identifier string into a target dictionary and returns that. The library builder iterates over all `platform.py` in the folders and executes that first callback. All further hooks are then used from that file.
2. a function mapping the target dictionary onto a device file. The device file naming scheme is then local to only the folder and can therefore transparently hide differences in the target dictionary.
3. a function adding platform dependent Jinja2 filters.

I believe that these three functions will allow the build system to be completely ignorant of the content of the target dictionary. Only the driver templates will have to understand what the target dictionary looks like.

Regarding `driver.xml`: It's functions disguised as data. That's horrible. The library builder has the right idea. Let's use a `driver.py` that's called with a copy of the entire (!) substitution dictionary and it can modify it as it wishes. It can filter it, add and remove data, or create an entirely new substitution dictionary. Adding the templates is then "just" another action that depends on the substitution dictionary. Also we would provide functions that add parameters (using the mechanism in lbuild).

The reason why I want to pass the entire device file data into the driver is that restricting cross-peripheral data like DMA to only inside the driver nodes leads to data duplication one way or the other. Let's keep it generic.
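A rough sketch of what the three `platform.py` callbacks could look like for an STM32 folder follows. Everything in it is an assumption made for illustration: the function names, the simplified regular expression (real identifiers have more variants), the `names-pins-sizes.xml` file naming scheme and the `hex32` filter.

```python
import re

# Deliberately simplified; it only covers identifiers like "stm32f407vg".
_STM32 = re.compile(
    r"^(?P<platform>stm32)f(?P<name>(?P<digit>\d)\d{2})"
    r"(?P<pin_id>[a-z])(?P<size_id>[0-9a-z])$"
)


def split_identifier(identifier):
    """Callback 1: return a target dictionary, or None if this is not our platform."""
    match = _STM32.match(identifier.lower())
    if match is None:
        return None
    return {
        "platform": match.group("platform"),
        "family": "f" + match.group("digit"),
        "name": match.group("name"),
        "pin_id": match.group("pin_id"),
        "size_id": match.group("size_id"),
    }


def device_file_for(target, available_files):
    """Callback 2: map the target dictionary onto a device file name.

    Assumes a hypothetical scheme such as "stm32f405_407-v_z-g.xml".
    """
    prefix = target["platform"] + "f"
    for file_name in available_files:
        parts = file_name[:-len(".xml")].split("-")
        if len(parts) != 3 or not parts[0].startswith(prefix):
            continue
        names, pins, sizes = parts
        if (target["name"] in names[len(prefix):].split("_")
                and target["pin_id"] in pins.split("_")
                and target["size_id"] in sizes.split("_")):
            return file_name
    return None


def jinja2_filters():
    """Callback 3: platform dependent filters, keyed by the name used in templates."""
    return {"hex32": lambda value: "0x{:08x}".format(value)}


target = split_identifier("stm32f407vg")
device_file = device_file_for(target, ["atmega328p.xml", "stm32f405_407-v_z-g.xml"])
```

With callbacks shaped like this, the library builder would only ever pass the target dictionary around without inspecting its keys.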
Btw. according to the device files xpcc supports 573 different targets at the moment :)
Not all combinations are valid though, some simply don't exist. The ORing is a one way mapping.
Could we have a base schema and then "specialize" that for the architectures?
That is possible, but a bit difficult. Inheritance is supported on tag level, but not on schema level. Because of that, most of the schema would have to be rewritten.
Hm, ok. Would that mean that each platform has its own schema (in the device file folder?), with duplicate common code? If we generated these platform schemas from a common schema into which the platform-specific schema is inserted, that would force the schemas not to diverge. This generation step is not part of lbuild (doesn't even have to use Jinja2), and the schemas are checked into the repo.
PS: (...)/platform/devices/avr/at90646_647_1286_1287-usb.xml looks broken with flash, ram and eeprom being defined multiple times with different values without any selector.
That looks like it would have been caught with a schema ;-P Cheers, Niklas
Hi,
Counter offer: Let's put a `platform.py` with a bunch of (stateless) callbacks invoked by the library builder in the folder with the device files.
Dang it, I just wanted to extract the platform generation mechanism, not write a completely new one :-)
It would contain:
1. a function that splits the device identifier string into a target dictionary and returns that. The library builder iterates over all `platform.py` in the folders and executes that first callback. All further hooks are then used from that file.
Having the naming as part of the XML file would make it useful for other purposes as well. It would then capture all the core characteristics of the device in one data format.
2. a function mapping the target dictionary onto a device file. The device file naming scheme is then local to only the folder and can therefore transparently hide differences in the target dictionary.
3. a function adding platform dependent Jinja2 filters.
What would be an example of these Jinja2 filters? Where are those stored at the moment?
I believe that these three functions will allow the build system to be completely ignorant of the content of the target dictionary. Only the driver templates will have to understand what the target dictionary looks like.
Ok.
Regarding `driver.xml`: It's functions disguised as data. That's horrible. The library builder has the right idea.
Let's use a `driver.py` that's called with a copy of the entire (!) substitution dictionary and it can modify it as it wishes. It can filter it, add and remove data, or create an entirely new substitution dictionary.
As long as the edited dictionary is specific to the driver, that's easy. If another driver should get access to the edits, then it gets much more complicated.
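The driver-local case could be sketched as follows: each driver hook receives a private deep copy of the whole substitution dictionary, so its edits never leak to other drivers. The names `run_driver` and `gpio_driver` and the dictionary layout are assumptions for illustration, not the xpcc or lbuild API.

```python
import copy


def run_driver(driver_callback, substitutions):
    """Hand the driver a private deep copy so that its edits stay local to it."""
    return driver_callback(copy.deepcopy(substitutions))


def gpio_driver(substitutions):
    """Hypothetical `driver.py` hook for GPIO: filter, add and remove data freely."""
    # Cross-driver information: GPIO wants to know how many ADCs exist.
    substitutions["adc_count"] = len(substitutions["drivers"].get("adc", []))
    # Drop everything this driver's templates do not need.
    substitutions["drivers"] = {"gpio": substitutions["drivers"]["gpio"]}
    return substitutions


device = {
    "target": "stm32f407vg",
    "drivers": {
        "gpio": [{"port": "a", "id": "0"}],
        "adc": [{"instance": "1"}, {"instance": "2"}, {"instance": "3"}],
    },
}
gpio_view = run_driver(gpio_driver, device)
```

Since `device` is left unchanged, the order in which drivers run does not matter in this sketch. Letting one driver see another driver's edits would break exactly that property.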
Btw. according to the device files xpcc supports 573 different targets at the moment :)
Not all combinations are valid though, some simply don't exist. The ORing is a one way mapping.
Damn. I still think having a two way mapping is very helpful. E.g. it would allow to check that the generated code compiles for __all__ devices. Is there any way to check which devices are non-existent? I guess it is mostly a problem of the STM32 families? For the AVRs the number of devices per device file is much lower.
Could we have a base schema and then "specialize" that for the architectures?
That is possible, but a bit difficult. Inheritance is supported on tag level, but not on schema level. Because of that, most of the schema would have to be rewritten.
Hm, ok. Would that mean that each platform has its own schema (in the device file folder?), with duplicate common code?
You can have common type definitions in a base schema, but the specific schema needs to define the tag structure up to the point where it diverges. Therefore it is easier to put the common part deeper in the tag tree than the other way around.
PS: (...)/platform/devices/avr/at90646_647_1286_1287-usb.xml looks broken with flash, ram and eeprom being defined multiple times with different values without any selector.
That looks like it would have been caught with a schema ;-P
Not sure about that. You would have to check that a tag together with the values of all its attributes is unique. For required attributes that is possible, but I'm not sure about optional attributes. Cheers, Fabian
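Outside of the schema, the uniqueness check described above (a tag together with the values of all its attributes must be unique among its siblings) is straightforward to script. The following is a minimal sketch using only the standard library; the sample XML and its `device-name` attribute are invented and only mimic the structure of a device file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_children(xml_text):
    """Return (tag, attributes) keys that occur more than once under one parent."""
    root = ET.fromstring(xml_text)
    duplicates = []
    for parent in root.iter():
        counts = Counter(
            (child.tag, tuple(sorted(child.attrib.items()))) for child in parent
        )
        duplicates.extend(key for key, count in counts.items() if count > 1)
    return duplicates


# Invented sample: <flash> appears twice without any selector attribute,
# while the two <ram> entries are told apart by their attribute value.
SAMPLE = """
<device>
  <flash>65536</flash>
  <flash>131072</flash>
  <ram device-name="646">4096</ram>
  <ram device-name="1286">8192</ram>
</device>
"""
duplicates = find_duplicate_children(SAMPLE)
```

Optional attributes need no special handling here, because a missing attribute simply does not appear in the comparison key.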
Hi,
Dang it, I just wanted to extract the platform generation mechanism, not write a completely new one :-)
I'd prefer to keep you motivated, so feel free to ignore my ramblings and take the path of least resistance. Getting something working is worth much more to me than adding all the features.
Having the naming as part of the XML file would make it useful for other purposes as well. It would then capture all the core characteristics of the device in one data format.
Hm, well… this proposed format is a printing format and cannot be used to split the string.
<device-name>${platform}f${name}${pin_id}${size_id}</device-name>
It would require a regular expression, and even that is not entirely simple: https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_files/de... Even if you have all the regexes in place, you'd probably spend more time writing the matching engine that opens every device file and tries with every regex. Probably just easier and less complicated to let the user write their own callback. And then keep in mind that lbuild doesn't actually need to know about the target description.
What would be an example of these Jinja2 filters? Where are those stored at the moment?
https://github.com/roboterclubaachen/xpcc/blob/3c7cd31e5aad66e47d073dc445551... Note that the four points regarding the build systems are all resolved with `platform.py`: https://github.com/roboterclubaachen/xpcc/blob/develop/PORTING.md#making-the...
As long as the edited dictionary is specific to the driver, that's easy. If another driver should get access to the edits, then it gets much more complicated.
Very interesting point. The drivers are supposed to be stateless, so sharing data (which implies an execution order) is a no-go. You could however share the filter/edit code that generates the edited dictionary, by placing the file with the filter in one driver folder, and importing it in another. That's not the most pretty, but that would work.
Damn. I still think having a two way mapping is very helpful. E.g. it would allow to check that the generated code compiles for __all__ devices. Is there any way to check which devices are non-existent? I guess it is mostly a problem of the STM32 families? For the AVRs the number of devices per device file is much lower.
A bijective mapping is way too difficult and would generate enormous clutter in the device file. Let's just provide a list of all target strings that gets checked by `platform.py`. It can also be used to provide a helpful message if a target is not supported.

Here is the list of all AVR targets (we only support those programmable by avrdude). https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_file_gen...

All the latest STM32 F0/1/3/4/7 targets: https://gist.github.com/salkinium/35bb921fc935cfad81c15e4fcde4beab

Note that even though we have device file data for all these devices, it does not mean that xpcc supports them! Just that these are valid device identifiers. Figuring out what the HAL supports is much more tricky.

In total we have device file data for 100 AVRs and 568 STM32. Adding the remainder of the STM32 devices (like F2 and all of L*) is not all that difficult. In total there are 615 STM32F + 349 STM32L = 964 STM32 devices that we could provide data for.

Cheers,
Niklas
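Checking an identifier against a plain list of target strings, including the helpful message for unsupported targets, could be as small as the sketch below. The function name `check_target` and the four sample targets are placeholders; the real list would come from the files linked above.

```python
import difflib


def check_target(identifier, known_targets):
    """Return None if the target is known, otherwise a message with suggestions."""
    if identifier in known_targets:
        return None
    message = "Unknown target '{}'.".format(identifier)
    suggestions = difflib.get_close_matches(identifier, known_targets, n=3)
    if suggestions:
        message += " Did you mean: {}?".format(", ".join(suggestions))
    return message


# Placeholder list standing in for the full list of target strings.
TARGETS = ["atmega328p", "stm32f405og", "stm32f407vg", "stm32f407zg"]

ok = check_target("stm32f407vg", TARGETS)
hint = check_target("stm32f407vx", TARGETS)
```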
Hi Niklas,
It would require a regular expression, and even that is not entirely simple: https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_files/de...
Even if you have all the regexes in place, you'd probably spend more time writing the matching engine that opens every device file and tries with every regex. Probably just easier and less complicated to let the user write their own callback.
That's why I want to go the other way around by enumerating all devices and then comparing. That is much easier and does not require complex regex. But it requires a way to generate all valid device names.
<device-name>${platform}f${name}${pin_id}${size_id}</device-name> Hm, well… this proposed format is a printing format and cannot be used to split the string.
Actually I need to split that string into its parts to do something like

    devices = list(itertools.product((platform,), ("f",), device_name, pin_id, size_id))

device_name, pin_id, size_id etc. are lists filled from the device file. With that, finding the correct device file becomes:

    for file in glob.glob("devices/**/*.xml"):
        devices = parser.get_devices(parser.parse(file))
        if device in devices:
            print("found device file")

With that method I've already reduced the ~70 lines of selections and regexes to six if statements [1].
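A self-contained variant of the same idea, for readers who want to try it without the modm-platform parser, might look like this. It assumes the `<device>` element carries `platform`, `name`, `pin_id` and `size_id` attributes whose alternatives are separated by `|`; that layout and the sample file contents are assumptions, and in-memory strings stand in for the files found via `glob`.

```python
import itertools
import xml.etree.ElementTree as ET


def devices_in(xml_text):
    """Enumerate every device name that one device file describes."""
    root = ET.fromstring(xml_text)
    device = root if root.tag == "device" else root.find("device")
    platform, name, pin_id, size_id = (
        device.get(key).split("|")
        for key in ("platform", "name", "pin_id", "size_id")
    )
    # Join each combination so it can be compared against a device string.
    return ["".join(parts)
            for parts in itertools.product(platform, ("f",), name, pin_id, size_id)]


def find_device_file(identifier, device_files):
    """`device_files` maps a file name to its XML text."""
    for file_name, xml_text in device_files.items():
        if identifier in devices_in(xml_text):
            return file_name
    return None


# Invented device file contents.
FILES = {
    "stm32f405_407.xml":
        '<rca><device platform="stm32" name="405|407" pin_id="v|z" size_id="e|g"/></rca>',
    "stm32f103.xml":
        '<rca><device platform="stm32" name="103" pin_id="c|r" size_id="8|b"/></rca>',
}
found = find_device_file("stm32f407vg", FILES)
```

Note that the enumeration yields every combination of the listed values, so it can contain names for which no physical device exists.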
https://github.com/roboterclubaachen/xpcc/blob/3c7cd31e5aad66e47d073dc445551...
Wouldn't it be easier to make all the information from the device file available in the template? Then you don't need specific tests but can write something like:

    %% if device.core == "cortex-m3" or device.core == "cortex-m4":

instead of:

    %% if target is cortex_m3 or target is cortex_m4

The current form is not bad, but the less stuff you have to define in addition to the device files the better.
Note that the four points regarding the build systems are all resolved with `platform.py`: https://github.com/roboterclubaachen/xpcc/blob/develop/PORTING.md#making-the...
Btw. the jinja2 tests are not called "filters" but "tests" [1] :) [1] http://jinja.pocoo.org/docs/dev/api/#writing-tests
Damn. I still think having a two way mapping is very helpful. E.g. it would allow to check that the generated code compiles for __all__ devices. Is there any way to check which devices are non-existent? I guess it is mostly a problem of the STM32 families? For the AVRs the number of devices per device file is much lower.
A bijective mapping is way too difficult and would generate enormous clutter in the device file. Let's just provide a list of all target strings that gets checked by `platform.py`. It can also be used to provide a helpful message if a target is not supported.
Here is the list of all AVR targets (we only support those programmable by avrdude). https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_file_gen...
Shouldn't that list be something which is extracted from the device files?
All the latest STM32 F0/1/3/4/7 targets: https://gist.github.com/salkinium/35bb921fc935cfad81c15e4fcde4beab
But not all of those have matching device files. E.g. for the stm32f051 or the stm32f412 there are none. Is that a list of all STM32 microcontrollers or from where did you extract that list?
Note that even though we have device file data for all these devices, it does not mean that xpcc supports them! Just that these are valid device identifiers. Figuring out what the HAL supports is much more tricky.
But that is exactly what I'm trying to do later on. At least to check for which targets the generated code can be compiled. Checking if it's working on the actual microcontroller is an entirely different thing, but compiling is a start. And for that you need a list of valid device names.

Cheers,
Fabian

[1] https://github.com/dergraaf/modm-platform/blob/master/tools/device/device/pa...
Hi,
That's why I want to go the other way around by enumerating all devices and then comparing. That is much easier and does not require complex regex. But it requires a way to generate all valid device names.
Ha! That went over my head the first time. That's a very elegant solution, and I so totally approve of this method!
With that method I've already reduced the ~70 lines of selections and regexes to six if statements [1].
Ok, so that's why you wanted to reverse engineer the devices from the `<device ...>` tag. Makes more sense now.

Here is the redundancy that happens if I add the device string for the F4x5/4x7 device file: https://gist.github.com/salkinium/bc6a21578d0bd93f6f0d03962c9cb54f#file-stm3...

The less bloaty alternative is just to add all target strings as a bunch of elements without filters.

    <identifier>stm32f405oe</identifier>
    <identifier>stm32f405og</identifier>
    ...
    <identifier>stm32f417zg</identifier>

I don't think this is a good idea. I feel that is still distracting for the reader, but I am open to being convinced otherwise.

Here is my suggestion how this can still work:

1. We add the list of device strings from earlier into something like `platform/targets.txt`, and lbuild just does a search in that file.
2. If it finds an exact match, it knows that these are the device files to use.
3. It builds the list of target strings from the format and searches those until a match is found.

Would that be an acceptable compromise?
Wouldn't it be easier to make all the information from the device file available in the template? Then you don't need specific tests but can write something like: %% if device.core == "cortex-m3" or device.core == "cortex-m4":
instead of: %% if target is cortex_m3 or target is cortex_m4
The current form is not bad but the less stuff you have to define in addition to the device files the better.
Ah. Yes and no. Yes, we don't need specific tests for that, but we _do_ need some abstraction.

There are APIs that are grouped around two IP implementations that share enough that they don't warrant a separate driver. For example, the STM32 clock driver is split in two main sections, the F0/F1/F3 and the F2/F4/F7 section. The issue we encountered when porting to the F7 and L4 is that suddenly you will write _a lot_ of duplicate code.

    %% if target is stm32f2 or target is stm32f4

became longer and longer:

    %% if target is stm32f2 or target is stm32f4 or target is stm32f7 or target is stm32l4

The smart way of doing this is to check for features, not targets. That's why I want to be able to add Jinja2 tests to `driver.py`, since that centralizes the actual test in one location and makes it much easier to port to new devices like the L4.

Note that with `driver.xml` it is possible to define your own elements using our filter syntax, but it fails for complex queries.
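A feature test of this kind could be a plain predicate on the target dictionary that `driver.py` hands to the template engine. In the sketch below the test name `f2_style_clock`, the dictionary keys and the `driver_tests` hook are all invented; the family set is simply copied from the growing conditional above.

```python
# Families taken from the long conditional quoted above.
_F2_STYLE_CLOCK_FAMILIES = {"f2", "f4", "f7", "l4"}


def has_f2_style_clock(target):
    """Feature test: does this target use the F2/F4/F7-style clock section?"""
    return (target.get("platform") == "stm32"
            and target.get("family") in _F2_STYLE_CLOCK_FAMILIES)


def driver_tests():
    """Hypothetical `driver.py` hook returning test name -> predicate.

    With Jinja2, the builder could register these via
    `environment.tests.update(driver_tests())`, after which a template can
    write `%% if target is f2_style_clock`.
    """
    return {"f2_style_clock": has_f2_style_clock}


l4 = {"platform": "stm32", "family": "l4", "name": "476"}
f1 = {"platform": "stm32", "family": "f1", "name": "103"}
```

Porting to a further family would then mean editing the one set in `driver.py` and leaving every template conditional untouched.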
Here is the list of all AVR targets (we only support those programmable by avrdude). https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_file_gen...
Shouldn't that list be something which is extracted from the device files?
This data is used by the DFG as additional input data and worked into the device files. Ugly? You've only seen the half of it, bwahaha. Particularly for the AVRs I did not find enough information in the raw data. This list of GPIO AFs was extracted manually by looking at all datasheets: https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_file_gen... Oh, and then there is this for STM32: https://github.com/roboterclubaachen/xpcc/blob/develop/tools/device_file_gen... In the AVR device files the `mcu` is passed along to avrdude, so that `scons program` just works. Arguably this information is specific to the HAL implementation, given that you don't have to use avrdude, but it serves its purpose for us.
All the latest STM32 F0/1/3/4/7 targets: https://gist.github.com/salkinium/35bb921fc935cfad81c15e4fcde4beab
But not all of those have matching device files. E.g. for the stm32f051 or the stm32f412 there are none.
Not all device files are committed into xpcc, because the HAL does not support them all. We wanted to add a device file when we actually have a device to port, like you did with the F373 and then had to modify the HAL (quite extensively actually). Note that I am not very consistent here, for AVR I added all the device files, even for ATxmega which isn't supported in xpcc at all anymore! Here is a .zip with all those device files. AS IS! They may not work! https://dl.dropboxusercontent.com/u/44769046/stm32.zip
Is that a list of all STM32 microcontrollers or from where did you extract that list?
No, this is a list of all devices that the DFG can compile a device file for using raw data from about a month ago. It excludes the F2 and all L series, because I don't have any device for it. (There is an experimental port to STM32L4, but not in shape to be upstreamed).
Note that even though we have device file data for all these devices, it does not mean that xpcc supports them! Just that these are valid device identifiers. Figuring out what the HAL supports is much more tricky.
But that is exactly what I'm trying to do later on. At least to check for which targets the generated code can be compiled. Checking if it's working on the actual microcontroller is an entirely different thing, but compiling is a start. And for that you need a list of valid device names.
Makes more sense now with the reverse lookup mechanism.

I think we are talking implicitly of two target lists:

1. the targets that we have a device file for. This can be a subset of all existing targets, especially when the vendor adds new targets and we haven't updated yet.
2. the targets that the HAL supports. This is most definitely a (small) subset.

I think it's important to decouple them. The targets that are really supported by the xpcc HAL are only a handful, because those are the ones we have tested. So there may be a list of targets that the HAL should pass to lbuild to check against, before continuing on to finding the device file.

Cheers,
Niklas
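Keeping the two lists decoupled could be sketched as two separate checks, the HAL list first and the device file lookup second. The function `resolve_target`, the choice of exception types and the sample data are illustrative assumptions and are not lbuild behaviour.

```python
def resolve_target(identifier, hal_supported, device_file_targets):
    """Check the HAL's own list first, then look up the device file.

    `hal_supported` is the small set of targets the HAL has been tested on;
    `device_file_targets` maps every target with device file data to its file.
    """
    if identifier not in hal_supported:
        raise ValueError("HAL does not support '{}'".format(identifier))
    if identifier not in device_file_targets:
        raise LookupError("no device file data for '{}'".format(identifier))
    return device_file_targets[identifier]


# Invented sample data: device file data exists for more targets than the HAL supports.
DEVICE_FILE_TARGETS = {
    "stm32f407vg": "stm32f405_407.xml",
    "stm32f405og": "stm32f405_407.xml",
    "stm32f103cb": "stm32f103.xml",
}
HAL_SUPPORTED = {"stm32f407vg", "stm32f103cb"}

resolved = resolve_target("stm32f407vg", HAL_SUPPORTED, DEVICE_FILE_TARGETS)
```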
participants (2)
- Fabian Greif
- Niklas Hauser