Slightly odd one - XML schema for tech regs. Thoughts?

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31


Post

Hi guys,

This is a bit of an odd one, but go with it... :)

As part of a project I am managing, I am defining a schema for an XML markup of technical regulations. It's pretty simple, and I think I've got most things covered, but any thoughts and comments appreciated.



<regulations>

  <information>

    <governingBody>Fédération Internationale de l'Automobile</governingBody>

    <governingBodyShort>FIA</governingBodyShort>

    <championship>Formula One World Championship</championship>

    <year>2008</year>

    <date>XX/XX/XXXX</date>

  </information>

  <maintext>

    <article id="">
    
      <subarticle id="">
      
        <paragraph id="">
        
          <diagramReference></diagramReference>
        
        </paragraph>
        
        <list listType="">
        
          <listItem></listItem>
          
          <listItem></listItem>
        
        </list>
      
      </subarticle>
    
    </article>

  </maintext>

</regulations>

<paragraph>, <list> and their associated tags can appear in either the <article> or the <subarticle> tags.

The id="" in <paragraph> is not compulsory (but it is in <article> and <subarticle>). This gives the option of a third "level" of paragraph (a.b.C, for example).

<diagramReference> will provide a hyperlink (when transformed into HTML or PDF) to the associated diagram. I am currently working out a way to fit the diagrams in. Probably a link to an external file - it takes away the point of XML if you put binary data in a CDATA wrap...

Using this, I will be able to output to several formats (HTML, PDF, Office 2007 XML) using the same base file. Also, certain regs can be called by id on a webpage (so only the relevant ones are printed for example).
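As a rough illustration of calling a regulation by id, here is a minimal Python sketch using the standard library's ElementTree. The sample ids, the paragraph text and the helper names find_by_id and plain_text are made up for the example; they are not part of the schema above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the structure above; ids and text are invented.
SAMPLE = """
<regulations>
  <maintext>
    <article id="3">
      <subarticle id="3.1">
        <paragraph>Overall width must not exceed the stated limit.</paragraph>
      </subarticle>
      <subarticle id="3.2">
        <paragraph>Overall height is measured from the reference plane.</paragraph>
      </subarticle>
    </article>
  </maintext>
</regulations>
"""

def find_by_id(root, tag, wanted_id):
    """Return the first <tag> element whose id attribute equals wanted_id, or None."""
    return root.find(f".//{tag}[@id='{wanted_id}']")

def plain_text(element):
    """Collapse all text inside an element into a single line."""
    return " ".join("".join(element.itertext()).split())

root = ET.fromstring(SAMPLE)
sub = find_by_id(root, "subarticle", "3.2")
selected_text = plain_text(sub)
print(selected_text)
```

A web page could run the same lookup and pass only the selected element to whatever produces the HTML or PDF.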

Can anyone see anything I have missed?

Cheers,

Martin

Ciro Pabón
106
Joined: 11 May 2005, 00:31

Post

In the header I miss the <CW date>. This is part of the page footer. In your schema the concept of <page>, <header> or <footer> does not exist, unless I'm missing something.

Where is the <TableOfContents>?

I also miss the <Language id " "> tag. This is not easy: the French and English versions are put side by side, so in effect the whole regulation has two <DocumentColumn>s. How do you handle that in XML?

In the maintext I miss <Table id " ">, like the ones you have in Article 19 of the 2007 version, unless it is a list type, but then you have the problem of <row> and <column>.

You don't want BLOB fields, and I don't know if that is a good or a bad thing. I guess it depends on the number of photos you have in your database! Anyway, <DiagramReferences> are located at the end of the text, not as part of an article.

How do you handle the fact that regulations often include the <OldText> crossed out and the <NewText>, both in a different <TextColor> from the rest of the text? The <TextColor> varies from version to version and from the Regulations to the Appendices (more on appendices follows).

<Appendix>es, which are part of the "main" regulation, are a recursive nesting of the whole thing. They are given as separate <Document>s: Appendix O, one of my favorites, :oops: named "Procedures for recognition of International Circuits", has <Supplement>s at the end. This Appendix O has a couple of <Formulas> and <Notes>. Several appendices have a <Foreword>. They also have a couple of <Hyperlinks>.

What about the <SportingRegulations>? They are part of the regulations, even if they are a different document. They have a slightly different form: all paragraphs have a <ParagraphIndentation> and a consecutive number. Their <Subparagraphs> have a consecutive letter to identify them. I guess it is a little tedious and error-prone to include that as part of the id " " by hand: I wonder if there is a way to create a <NumberedList> with a <NumberedListFormat> in XML, the same way a Word document has this kind of object (that is, "Numbered Lists") different from the "Title" object.

Sporting Regulations have their own <Appendix>es, but unlike those of the Regulations, they are an integral part of the document and are not given as separate documents.

Those Sporting Regulations' Appendices have another object: <Form>. A form is composed of <line>s, with a <LineTitle> and a <LineFillIn> part. A line can have <SubLines>. Check the "Registered Office" part of the form named "THE APPLICANT" in Appendix 2 of the Sporting Regulations. To complicate things, the line named "Address" has sublines with <SubLineTitle>s: "Tel", "Fax" and "e-mail". The "CONSTRUCTOR DETAILS" form is even more complex: it has two <lines> in the same line, like the one that says "License ............ Issued by ..........". I throw in the towel there... ;)

The document named "Circuit Drawing Format", which I don't know what to call because it does not appear to be an Appendix but is referenced in Appendix O (perhaps <AppendixToTheAppendix>?) :lol: , is complex: the tables included in it have rows with a <RowTitle> and include <CellDrawings> in the cells that compose the table. The <row>s in those tables have <subrow>s. The diagrams included at the end have a <DiagramOrientation>: some are vertical and some horizontal. I don't know if you will include this kind of formatting.

Well, this is way too complicated for me... Perhaps you will answer (and I won't blame you :) ): "Ciro, thanks but I said clearly Tech Regulations!". In that case, forget everything except the <OldText>/<NewText>/<TextColor>, <TableOfContent>, <DocumentColumn>, <Language> and <Table>/<Column>/<Row>, I think. You can avoid Appendices and I'll agree...

I'll have some mercy on the few readers left at this point in the post and won't mention (but you must know) that I believe that rulings by the International Motorsport Council (or whatever its name is) are part of the Technical Regulations. You can conclude Max Mosley's successor must be a lawyer: any engineer shudders thinking about that, I guess. :) I bet you are one of the few individuals attempting to understand the structure of the Technical Regulations, so you will have to read them, which probably makes you unique: my condolences. No wonder people break them all the time: nobody reads them in their entirety. ;)

Anyway, thanks, Martin: I had a good time posting this. I've been cursing XML since I started to make a database to be handled by VB and exported to Google Earth's KML format: it has taken me months of tinkering and I'm not done yet... I haven't even finished reading the KML specification: it's longer than War and Peace and it has five or six miserable objects. They seem to have been written by me! The guy who invented XML should be forced to drive a Spyker and have Ralf Schumacher as teammate.
Ciro

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31

Post

Ciro,

Thank you! Much to chew on there.

I was messing around last night and have changed some things.

Incidentally, are the Appendices to the tech regs downloadable? I can't find them anywhere...

I have left <toc> out as I would imagine it to be dynamically generated recursively by whatever parser is reading the XML file - this would negate the need to manually update the TOC when something is changed.
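To show what "dynamically generated recursively" could look like, here is a minimal Python sketch. It assumes each <article> and <subarticle> carries id and title attributes; the sample titles and the build_toc helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: titles and nesting are assumptions for illustration.
SAMPLE = """
<maintext>
  <article id="1" title="Definitions">
    <subarticle id="1" title="Formula One car"/>
    <subarticle id="2" title="Automobile"/>
  </article>
  <article id="2" title="General principles">
    <subarticle id="1" title="Role of the FIA"/>
  </article>
</maintext>
"""

def build_toc(element, prefix=""):
    """Recursively collect 'number title' lines for every article/subarticle."""
    lines = []
    for child in element:
        if child.tag in ("article", "subarticle"):
            number = f"{prefix}{child.get('id')}"
            lines.append(f"{number} {child.get('title', '')}".rstrip())
            # Children are numbered relative to their parent: 1 -> 1.1, 1.2, ...
            lines.extend(build_toc(child, prefix=number + "."))
    return lines

toc = build_toc(ET.fromstring(SAMPLE))
print("\n".join(toc))
```

Because the list is rebuilt from the tree on every run, adding or removing an article never needs a manual TOC edit.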

I suppose the best way to implement formulae would be to simply use MathML tags in a <formula> container... Good point though. I will look more into it.

As far as text color goes, I would not put that as a tag (the point is to separate style from content), although something like <old> could be used to format out of date text however the user sees fit (be it colour, strikethrough etc etc). Equally <new> could be used for a replacement.

It would also be possible to ID the old and the new, so <old id="3"> is the old version of <new id="3">. That logically links the old and the new text.
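A minimal Python sketch of how a parser might pair the two up by id. The sample text, the figures in it and the pair_changes helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the wording and the dimensions are made up.
SAMPLE = """
<article id="3">
  <paragraph>Overall width must not exceed <old id="3">1800mm</old><new id="3">2000mm</new>.</paragraph>
  <paragraph><old id="4">This sentence is deleted.</old></paragraph>
</article>
"""

def pair_changes(root):
    """Map each change id to a dict with its old and new text (None if absent)."""
    changes = {}
    for element in root.iter():
        if element.tag in ("old", "new"):
            entry = changes.setdefault(element.get("id"), {"old": None, "new": None})
            entry[element.tag] = "".join(element.itertext())
    return changes

changes = pair_changes(ET.fromstring(SAMPLE))
for change_id, texts in changes.items():
    print(change_id, texts["old"], "->", texts["new"])
```

One caveat: if a DTD later declares id with the attribute type ID, its values must be unique within the document, so the shared value might need to live in a differently named attribute.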

As to the rest, I will reply more fully, but I need a lot more coffee first :shock:

Cheers!

M
Last edited by TeamFFX on 27 Apr 2007, 10:31, edited 1 time in total.

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31

Post

Ciro Pabón wrote: In your schema the concept of <page>, <header> or <footer> does not exist, unless I'm missing something.
No - page headers and footers are considered "stylistic" rather than structural and should be dynamically generated at print/parse time.

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31

Post

Ciro Pabón wrote:I wonder if there is a way to create a <NumberedList> with a <NumberedListFormat> in XML, the same way a Word document has this kind of object (that is, "Numbered Lists") different from the "Title" object.
I would base the type of list on the listType="" of the <list> tag. There is no point in having separate tags for ordered and unordered lists. The possible values are:

<list listType="a">
or
<list listType="1">
or
<list listType="bullet">

where "a" gives an a) b) c) list, "1" gives a 1, 2, 3 list, and bullet parses to an unordered list (of whatever bullet the stylesheet defines).
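A minimal Python sketch of how a parser could turn listType into markers. The marker and render_list helpers, the punctuation after the numbers and the "*" stand-in for the bullet are assumptions; a real stylesheet would decide those.

```python
import xml.etree.ElementTree as ET
from string import ascii_lowercase

def marker(list_type, index):
    """Return the marker for the item at zero-based index, given a listType value."""
    if list_type == "a":
        return f"{ascii_lowercase[index]})"  # a) b) c) ...; this sketch stops at 26 items
    if list_type == "1":
        return f"{index + 1}."
    return "*"  # "bullet": the stylesheet would pick the real glyph

def render_list(list_element):
    """Render each <listItem> as 'marker text', defaulting to a bullet list."""
    list_type = list_element.get("listType", "bullet")
    items = list_element.findall("listItem")
    return [
        f"{marker(list_type, i)} {(item.text or '').strip()}"
        for i, item in enumerate(items)
    ]

SAMPLE = '<list listType="a"><listItem>first</listItem><listItem>second</listItem></list>'
rendered = render_list(ET.fromstring(SAMPLE))
print("\n".join(rendered))
```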
Ciro Pabón wrote: I'll have some mercy with the few readers left at this moment of the post and won't mention (but you must know) that I believe that rulings by the International Motorsport Council (or whatever his name is) are part of Technical Regulations. You can conclude Max Mosley's succesor must be a lawyer: any engineer shudders thinking about that, I guess. I bet you are one of the few individuals attempting to understand Technical Regulations structure, so you will have to read them, which probably makes you unique: my condolences. No wonder people breaks them all the time: nobody reads them in its entirety.
On a more general note, I wouldn't presume to tell the FIA that things need to be standardised (although they clearly do). This is mainly for my own personal project, but it is interesting to see how the concepts can be applied to the real world...

Ciro - I agree. XML is the spawn of satan. Unfortunately it is now ubiquitous (and I can imagine ECUs from next year being controlled by Microsoft MS-XML files :roll: )

Cheers,

Martin

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31

Post

This thread is now my stream of consciousness... :?

In reality, what needs to happen with the regs is that they need to be defined differently. Instead of:



<article id="1" title="Definitions">

  <subarticle id="1" title="Whatever">

    <subsubarticle id="1">

      <text>Text goes here</text>

    </subsubarticle>

  </subarticle>

</article>

It should be just a load of nested <article> tags (with no id=""), which the parser can automatically number.

However, for that to work, there needs to be a standardised numbering format for paragraphs, which looking at the regs, there is not...

That way, entire articles, or single words, could be wrapped in <old> and <new> tags to differentiate between changes.
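For what it's worth, a minimal Python sketch of the automatic numbering, assuming a plain 1, 1.1, 1.1.1 scheme (which, as noted, the real regs do not follow consistently). The sample content and the number_articles helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: nested <article> tags with no id attributes.
SAMPLE = """
<maintext>
  <article title="Definitions">
    <article title="Whatever">
      <article><text>Text goes here</text></article>
      <article><text>More text</text></article>
    </article>
  </article>
  <article title="General principles"/>
</maintext>
"""

def number_articles(parent, prefix=""):
    """Yield (number, element) for each nested <article>, numbered by position."""
    position = 0
    for child in parent:
        if child.tag != "article":
            continue  # <text>, <old>, <new> etc. do not consume a number
        position += 1
        number = f"{prefix}{position}"
        yield number, child
        yield from number_articles(child, prefix=number + ".")

numbers = [number for number, _ in number_articles(ET.fromstring(SAMPLE))]
print(numbers)
```

Since the numbers come from position alone, inserting an article renumbers everything after it, which may or may not be what a regulator wants for cross-references.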

TeamFFX
TeamFFX
0
Joined: 19 Apr 2007, 17:31

Post

Ok, a real world example. This is articles 3.1 - 3.4 of the 2008 tech regs (28 March 07 update): http://www.designersounds.co.uk/test.xml
[EDIT: Updated to 3.10, with nested lists examples]

This is very simple - no lists or tables as yet, but it does give an idea of the oldText and newText and how it works with whole articles and single words/sentences.

I'm going to try and do the whole of article 3, and then I will define the DTD from that (I know that is in reverse, but it gives me an idea of what I will need to define).

Cheers,

Martin

Ciro Pabón
106
Joined: 11 May 2005, 00:31

Post

Well, thanks for the sample. Interesting. Keep us informed on this thread, please!

I woke up thinking a database should not need formatting, as you state. Anyway, you'll need a mighty parser: are you trying to save a text document? That's what originated Word's DOC format...

You can find the regulations on the FIA site, under FIA Sport/Regulations. From there you have several things to download:

Specific Regulations:
http://www.fia.com/sport/Regulations/f1regs.html

International Sporting Code (I forgot to mention the "General Prescriptions"):
http://www.fia.com/sport/Regulations/sportcoderegs.html

Technical Lists:
http://www.fia.com/sport/Regulations/techlists.html

Medical Regulations:
http://www.fia.com/sport/Regulations/medregs.html

Circuit Regulations:
http://www.fia.com/sport/Regulations/circuitregs.html

FIA Standards:
http://www.fia.com/sport/Regulations/standregs.html

You don't need to download the dozen or so documents for Driver's Equipment, they are included in Technical Lists and FIA Standards.

I guess that if the FIA had written the Ten Commandments, they would be called the One Thousand Three Hundred and Forty Two Commandments With Appendices.

Notwithstanding my comments on XML, I believe the KML format will replace the SHP or DWG formats some day: I've invested a lot of time in that belief. On the other hand, you could say XML is now like dBase was before the invention of ODBC... For those who don't believe in it, I advise trying to use MySQL to send a page. It's the proverbial "fly killer cannon".

Don't forget the <language> thing. It's the Fédération Internationale de l'Automobile. The English version is an afterthought. Thank heaven F1 is not popular in the USA. Did you know NASCAR regulations (and I am not making this up) are secret? They are probably under the command of Home Security Office and Cheney can shoot you if you read them...
Ciro

checkered
0
Joined: 02 Mar 2007, 14:32

Post

By the looks of it KML does seem to have momentum behind it and rightly so. I haven't delved very deep into it, so perhaps you could point us to a few representative (and exciting!) examples of where things are headed. There's also an object based format emerging from the construction/facilities management industry that seems to have potential to eventually replace formats like DWG. It's the "industry foundation classes" (IFC/ifcXML) format, which is pretty far along its development. All in all, the World is-a-changing. Once again. Some regulators and major businesses are already accepting it as a standard. Whether that is for the best with regard to true interoperability remains to be seen.

http://www.iai-international.org/
"In theory there's no difference between theory and practice. In practice, there is." - Yogi Berra