In the header I miss the <CW date>. This is part of the page footer. Your schema has no concept of <page>, <header> or <footer>, unless I'm missing something.
Where is the <TableOfContents>?
I also miss the <Language id " "> tag. This is not easy: the French and English versions are put side by side, so in effect the whole regulation has two <DocumentColumn>s. How do you handle that in XML?
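One possible answer, sketched in Python with the standard library: keep both languages inside the same article and mark each with the standard xml:lang attribute, leaving the side-by-side columns to whatever renders the document. The element names <Article> and <Text>, the article id and the sample strings are made up for illustration and are not taken from the schema under discussion.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the XML specification reserves for the "xml:" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG = f"{{{XML_NS}}}lang"


def make_article(article_id, texts):
    """Build an <Article> holding one <Text> child per language.

    `texts` maps a language code ("fr", "en") to the paragraph text.
    <Article> and <Text> are hypothetical names, not part of any real schema.
    """
    article = ET.Element("Article", {"id": article_id})
    for lang, content in texts.items():
        text = ET.SubElement(article, "Text", {LANG: lang})
        text.text = content
    return article


def text_for(article, lang):
    """Return the text of the <Text> child in the requested language, or None."""
    for text in article.findall("Text"):
        if text.get(LANG) == lang:
            return text.text
    return None


# Placeholder content, not real regulation text.
article = make_article("1.1", {"fr": "Texte en français", "en": "Text in English"})
print(ET.tostring(article, encoding="unicode"))
print(text_for(article, "en"))
```

With this layout the two columns are a presentation choice: a stylesheet can put the "fr" and "en" children side by side, or show only one of them.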
In the main text I miss <Table id " ">, like the ones you have in Article 19, 2007 version, unless it is a list type, but then you have the problem of <row> and <column>.
You don't want BLOB fields and I don't know if that is a good or a bad thing. I guess it depends on the number of photos you have in your database! Anyway, <DiagramReferences> are located at the end of the text, not as part of an article.
How do you handle the fact that regulations often include the <OldText> crossed out next to the <NewText>, both in a different <TextColor> from the rest of the text? The <TextColor> varies from version to version and from Regulation to Appendixes (more on appendixes follows).
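A minimal sketch of one way to treat this, assuming <OldText> and <NewText> are inline elements inside a hypothetical <Paragraph>: the markup only records what was struck and what was added, and the colour is left to the stylesheet, since it changes from version to version anyway. The sample sentence is invented.

```python
import xml.etree.ElementTree as ET


def current_text(paragraph):
    """Return the paragraph as it reads after the change.

    The content of <OldText> is skipped, everything else is kept in order.
    """
    parts = [paragraph.text or ""]
    for child in paragraph:
        if child.tag != "OldText":
            parts.append(child.text or "")
        # The tail is the plain text that follows the child element.
        parts.append(child.tail or "")
    return "".join(parts)


# Invented sample, not a real regulation.
sample = ET.fromstring(
    "<Paragraph>The limit is "
    "<OldText>old value</OldText><NewText>new value</NewText>"
    " for all cars.</Paragraph>"
)
print(current_text(sample))
```

Swapping the tag name in the test from "OldText" to "NewText" would give the text as it read before the change, so both versions live in the same file.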
<Appendix>es, which are part of the "main" regulation, are a recursive nesting of the whole thing. They are given as separate <Document>s: Appendix O, one of my favorites, named "Procedures for recognition of International Circuits", has <Supplement>s at the end. This Appendix O has a couple of <Formulas> and <Notes>. Several appendixes have a <Foreword>. They also have a couple of <Hyperlinks>.
What about the <SportingRegulations>? They are part of the regulations, even if they are a different document. They have a slightly different form: all paragraphs have a <ParagraphIndentation> and a consecutive number. Their <Subparagraphs> are identified by consecutive letters. I guess it is a little tedious and error-prone to write that by hand as part of the id " ": I wonder if there is a way to create a <NumberedList> with a <NumberedListFormat> in XML, the same way a Word document has this kind of object (that is, "Numbered Lists") different from the "Title" object.
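XML itself has no numbered-list object, but the numbers do not need to be stored: they can be derived from the position of each element when the document is processed. A rough sketch, assuming hypothetical <Paragraph> and <Subparagraph> elements and a made-up "1a)" label format:

```python
import xml.etree.ElementTree as ET
from string import ascii_lowercase


def number_paragraphs(root):
    """Return (label, text) pairs computed from document order.

    Paragraphs are labelled 1, 2, 3...; subparagraphs 1a), 1b)...
    zip() stops at "z", so this sketch handles at most 26 subparagraphs.
    """
    labels = []
    for n, para in enumerate(root.findall("Paragraph"), start=1):
        labels.append((str(n), (para.text or "").strip()))
        subs = para.findall("Subparagraph")
        for letter, sub in zip(ascii_lowercase, subs):
            labels.append((f"{n}{letter})", (sub.text or "").strip()))
    return labels


# Invented sample content.
doc = ET.fromstring(
    "<SportingRegulations>"
    "<Paragraph>First"
    "<Subparagraph>first sub</Subparagraph>"
    "<Subparagraph>second sub</Subparagraph>"
    "</Paragraph>"
    "<Paragraph>Second</Paragraph>"
    "</SportingRegulations>"
)
for label, text in number_paragraphs(doc):
    print(label, text)
```

Inserting a paragraph in the middle then renumbers everything after it for free, which is the part that is prone to errors when the number is typed into the id by hand.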
Sporting Regulations have their own <Appendix>es, but unlike those of the Regulations, they are an integral part of them and are not given as separate documents.
Those Sporting Regulations' Appendixes have another object: <Form>. A form is composed of <line>s, with a <LineTitle> and a <LineFillIn> part. A line can have <SubLines>. Check the "Registered Office" part of the form named "THE APPLICANT" in Appendix 2 of the Sporting Regulations. To complicate things, the line named "Address" has sublines with <SubLineTitle>s: "Tel", "Fax" and "e-mail". The "CONSTRUCTOR DETAILS" form is even more complex: it has two <lines> in the same line, like the one that says "License ............ Issued by ..........". I throw in the towel there...
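The form may be less hopeless than it looks if a line is allowed to contain other lines: a line is then either a title with a fill-in part, or a title with nested lines. A sketch under that assumption, using a single hypothetical <Line> element for both lines and sublines (the titles come from the form described above, the element layout is invented):

```python
import xml.etree.ElementTree as ET


def add_line(parent, title, sublines=()):
    """Append a <Line> to `parent`.

    A line with no sublines gets an empty <LineFillIn>;
    a line with sublines holds one nested <Line> per subline instead.
    """
    line = ET.SubElement(parent, "Line")
    ET.SubElement(line, "LineTitle").text = title
    if sublines:
        for sub_title in sublines:
            add_line(line, sub_title)
    else:
        ET.SubElement(line, "LineFillIn")
    return line


form = ET.Element("Form", {"name": "THE APPLICANT"})
address = add_line(form, "Address", ["Tel", "Fax", "e-mail"])

print(ET.tostring(form, encoding="unicode"))
```

The "License ............ Issued by .........." case could be treated the same way, as one line with two sublines, leaving the decision to print them on the same row to the stylesheet.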
The document named "Circuit Drawing Format", which I don't know what to call, because it does not appear to be an Appendix but is referenced in Appendix O (perhaps <AppendixToTheAppendix>?), is complex: the tables included in it have rows with a <RowTitle> and include <CellDrawings> in the cells that compose the table. The <row>s in those tables have <subrow>s. The diagrams included at the end have a <DiagramOrientation>: some are vertical and some horizontal. I don't know if you will include this kind of formatting.
Well, this is way too complicated for me... Perhaps you will answer (and I won't blame you): "Ciro, thanks, but I clearly said Tech Regulations!". In that case, forget everything except the <OldText>/<NewText>/<TextColor>, <TableOfContents>, <DocumentColumn>, <Language> and <Table>/<Column>/<Row>, I think. You can skip the Appendixes and I'll agree...
I'll have some mercy on the few readers left at this point of the post and won't mention (but you must know) that I believe that rulings by the International Motorsport Council (or whatever its name is) are part of the Technical Regulations. You can conclude Max Mosley's successor must be a lawyer: any engineer shudders thinking about that, I guess.
I bet you are one of the few individuals attempting to understand the structure of the Technical Regulations, so you will have to read them, which probably makes you unique: my condolences. No wonder people break them all the time: nobody reads them in their entirety.
Anyway, thanks, Martin: I had a good time posting this. I've been cursing XML since I started to make a database to be handled by VB and exported to Google Earth KML format: it has taken me months of tinkering and I'm not done yet... I haven't even finished reading the KML specification: it's longer than War and Peace and it has five or six miserable objects. It reads as if I had written it! The guy who invented XML should be forced to drive a Spyker and have Ralf Schumacher as teammate.