@DrSimonClark some cyclers record time internally as an integer number of milliseconds; Newares also do this. My opinion is that seconds feel more natural, and I don’t think the precision loss from storing as a float matters.
It would be nice if there were a way to separate the label and a pint-compatible unit in the column names. For example, if units were put in square brackets or separated with a different character like a double underscore, then tools that read bdf could handle any units with pint. I don’t know if that makes the ontology part impossible.
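To make the idea concrete, here is a minimal sketch of how a reader could split a column name if a reserved separator existed. The function name `split_column`, the bracket form, and the double-underscore form are all assumptions for illustration; none of this is part of bdf as it stands.

```python
import re
from typing import Optional, Tuple

# Hypothetical convention 1: "label [unit]"
BRACKET = re.compile(r"^(?P<label>.+?)\s*\[(?P<unit>[^\]]+)\]$")


def split_column(name: str) -> Tuple[str, Optional[str]]:
    """Split a column name into (label, unit) using a reserved separator.

    Returns (name, None) when no separator is found.
    """
    match = BRACKET.match(name)
    if match:
        return match.group("label"), match.group("unit")
    # Hypothetical convention 2: "label__unit", split on the last "__"
    if "__" in name:
        label, unit = name.rsplit("__", 1)
        return label, unit
    return name, None


bracket_result = split_column("charging_capacity [mA*h]")
underscore_result = split_column("charging_capacity__milliampere_hour")
no_separator_result = split_column("test_time_second")
```

With either convention the unit string comes out unambiguously and could then be handed to a unit library such as pint, without the reader needing a list of known units.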
Some cyclers store time as a millisecond integer to avoid accumulating errors from floating-point arithmetic. I think this matters more during the measurement itself, when errors can build up over a long time. But for serializing to a global interoperable format, it’s best to stick to community-standard units, and the precision lost in converting integer milliseconds to decimal seconds should be acceptable.
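A quick way to check this is a round trip from integer milliseconds to float seconds and back. The sketch below assumes 64-bit floats (Python's `float`) and a test of up to one year; the helper names are made up for illustration, and `math.ulp` needs Python 3.9 or later.

```python
import math


def ms_to_seconds(ms: int) -> float:
    """Convert an integer millisecond count to float seconds."""
    return ms / 1000.0


def seconds_to_ms(seconds: float) -> int:
    """Convert float seconds back to the nearest integer millisecond."""
    return round(seconds * 1000.0)


one_year_ms = 365 * 24 * 3600 * 1000
samples = [0, 1, 999, 1001, 123_456_789, one_year_ms - 1, one_year_ms]

# A one-off conversion: does each millisecond value survive the round trip?
round_trips = [seconds_to_ms(ms_to_seconds(ms)) == ms for ms in samples]

# Spacing between adjacent float64 values near one year expressed in seconds
resolution_at_one_year = math.ulp(ms_to_seconds(one_year_ms))

# Repeatedly adding a float time step, as a cycler might do while measuring.
# The running total may drift slightly away from the exact value of 1000 s.
accumulated = 0.0
for _ in range(1_000_000):
    accumulated += 0.001
drift = abs(accumulated - 1000.0)
```

The one-off conversion is limited only by the float64 spacing at that magnitude, which is far below a millisecond. Any drift comes from the repeated additions, which is a concern for the instrument rather than for the file format.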
Handling of units is also an important consideration. That’s why we’ve built in lots of support to parse and convert units. These will be included in a Python package that we’ve been working on. Check out the units notebook here. As you mention, we build on top of pint to easily handle unit conversions in Python. On the simplest level, you can just strip the unit from the preferred label or the machine name and use that in pint directly. We also have units encoded in the application ontology and the table schema for semantic operations. So hopefully this will provide robust redundancy to handle units. What do you think?
Hi Simon, a bdf Python package would be great. The validator would be particularly useful for other devs (me) trying to integrate bdf into our pipelines.
I wouldn’t immediately know how to strip units, as there is no reserved character for separating labels and units in names, e.g. charging_capacity_milliampere_hour - where does the label end and unit start? But if a bdf package can do all this for me I am happy.
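To illustrate the ambiguity: without a reserved separator, a reader has to know the unit vocabulary in advance and match the longest known suffix. This is only a sketch; `KNOWN_UNITS` is a made-up list for the example, not the actual bdf vocabulary, and `strip_unit` is a hypothetical helper.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical unit vocabulary; a real reader would need the full bdf list.
KNOWN_UNITS = ("milliampere_hour", "ampere_hour", "second", "volt", "ampere", "hour")


def strip_unit(
    name: str, known_units: Sequence[str] = KNOWN_UNITS
) -> Tuple[str, Optional[str]]:
    """Split a name into (label, unit) by matching the longest known unit suffix."""
    for unit in sorted(known_units, key=len, reverse=True):
        suffix = "_" + unit
        if name.endswith(suffix):
            return name[: -len(suffix)], unit
    return name, None


column = "charging_capacity_milliampere_hour"

# Naive approach: split on the last underscore, which cuts the unit in half.
naive_result = tuple(column.rsplit("_", 1))

# Vocabulary-based approach: longest suffix wins.
vocab_result = strip_unit(column)
```

The naive split returns `hour` as the unit, so the result depends entirely on having the right vocabulary, which is what I would hope the package takes care of.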
pint-pandas may be of interest to you if you haven’t seen it yet.