These .loc files are plain text in an XML format, so I grabbed an open source XML editor:
XML Copy Editor 1.2.1.3
This is available on SourceForge.
I started by extracting the .loc files from Uru Live, as it was easy to go to the location online and see what these files produced.
I began by looking in detail at "BaronCityOfficeEnglish.loc".
This is a nice small area with a range of in-game documentation on one desk.
I grabbed screenshots of all of the pages from Sharper's Journal, the cover of Sharper's Journal, the Wrinkled Note, the map, the lined page with sketches, and the document that looks like something D'ni to do with the Teledahn lock barrier.
The last two are pop-up images which, whilst they are documents, must be associated with something other than the .loc files.
The cover is also an image, but this one is going to be associated with the code attached to your Journal modifier; a reference to it does appear as one of the very first elements in the .loc file.
There may be a way to use this to embed the other two samples into the .loc file, if a way can be found to stop it offering further pages.
The remaining two documents, Sharper's Journal and the Wrinkled Note, are in the .loc file.
The content of both documents is in the one file, so it would seem that each age that contains documentation has one .loc file for all of the documentation contained therein.
The content of Sharper's Journal is actually broken into two distinct sections, and the entries take the form of discrete paragraphs, each opening with a date in the format month.day.year (for example 11.14.97). This date is just plain text and is not some sort of encoded date.
There may be some reason for the text to be broken into two sections, but I believe it is a hangover from the way Uru evolved, with the second section added after the inclusion of "The Path of the Shell". Either way, we can see that the content of a journal can be broken into discrete sections within the .loc file.
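The dated-entry layout described above can be picked apart with a few lines of Python. This is only a sketch: it assumes every entry starts on a new line with a month.day.year date followed by " - ", as in the sample lines, and the function name split_entries and the abridged sample text are my own.

```python
import re

# Assumed entry prefix: month.day.year followed by " - ", e.g. "11.14.97 - ".
ENTRY_START = re.compile(r"^\d{1,2}\.\d{1,2}\.\d{2} - ", re.MULTILINE)


def split_entries(body):
    """Return a list of (date_text, entry_text) pairs from journal body text."""
    matches = list(ENTRY_START.finditer(body))
    entries = []
    for i, match in enumerate(matches):
        # An entry runs until the next date prefix, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        date_text = match.group(0)[:-3]  # drop the trailing " - "
        entries.append((date_text, body[match.end():end].strip()))
    return entries


# Abridged sample text, shortened from the journal lines quoted in this post.
sample = (
    "11.14.97 - Time to start a journal. Officially.\n"
    "11.17.97 - Maybe not. What a joke.\n"
)

for date_text, entry_text in split_entries(sample):
    print(date_text, "|", entry_text)
```

The date is kept as plain text rather than converted to a date object, since the file itself treats it as nothing more than text.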
"BaronCityOfficeEnglish.loc" starts with the following opening code.
<?xml version="1.0" encoding="utf-16"?>
<localizations>
<age name="BaronCityOffice">
<set name="Journals">
<element name="Sharper">
<translation language="English"><cover src="xSharperJournalCover*1#0.hsm"><font size=18 face=Sharper color=982A2A><margin left=62 right=62 top=48>11.14.97 - Looks like they've agreed to let me take "control" of Teledahn. Time to start a journal. Officially.
11.17.97 - Maybe not. Kodama "popped in", going on about his inspections, in his usual arrogant manner. What a joke.
The first line is a pure XML opening header. The XML code that follows is arranged in blocks surrounded by opening and closing statements, so the entirety of the .loc file body is enclosed by the following statement.
<localizations>
This is paired with a corresponding closing statement at the end of the file.
</localizations>
The next block is delineated with <age> and </age>, but this time the opening statement includes the name of the age in question, so here we see the following.
<age name="BaronCityOffice">
Note that the closing statement does not have the appended age name.
</age>
This sits inside the localizations pair.
So far our structure is as follows.
<?xml version="1.0" encoding="utf-16"?>
<localizations>
  <age name="BaronCityOffice">
  </age>
</localizations>
This shows the pairing principle, where the opening statement contained in angle brackets is matched with the same statement but with a leading slash (/).
Inside the two age statements we have opening and closing set statements, <set name="Journals"> and </set>. Note again that the opening statement has a name qualifier that is not added to the closing one.
As mentioned earlier, the Sharper Journal has its content broken into three sections: the cover image, then two sections of text that appear as one continuous block inside the Sharper Journal.
The opening and closing set statements appear to group together all of the elements of a particular journal or document.
So we now have the following structure for our sample.
<?xml version="1.0" encoding="utf-16"?>
<localizations>
  <age name="BaronCityOffice">
    <set name="Journals">
    </set>
  </age>
</localizations>
As mentioned, the Sharper Journal text is broken into two parts. These are the elements of the set and have their own opening and closing statements; again the opener has a name qualifier that is dropped from the closing statement. In this case we have two pairs, one with the name "Sharper" and the second with the name "SharperPart2".
So here is our revised structure.
<?xml version="1.0" encoding="utf-16"?>
<localizations>
  <age name="BaronCityOffice">
    <set name="Journals">
      <element name="Sharper">
      </element>
      <element name="SharperPart2">
      </element>
    </set>
  </age>
</localizations>
The next level of pairing is the translation pair. It is unclear why this level exists, but if we are to conform to the structure Cyan have put in place then we need this pair. Again the opening statement has a qualifier that is not required for the closing. It should be noted that the opening statement is the same in both cases. NOTE: the sample being used is the English one, and I dare say other languages will use the corresponding openings.
<?xml version="1.0" encoding="utf-16"?>
<localizations>
  <age name="BaronCityOffice">
    <set name="Journals">
      <element name="Sharper">
        <translation language="English">
        </translation>
      </element>
      <element name="SharperPart2">
        <translation language="English">
        </translation>
      </element>
    </set>
  </age>
</localizations>
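The nesting worked out above (localizations, age, set, element, translation) can be built programmatically. What follows is a minimal sketch using Python's standard xml.etree.ElementTree; the function name build_skeleton and its arguments are my own invention, and the tag and attribute names are simply the ones seen in the sample file.

```python
import xml.etree.ElementTree as ET


def build_skeleton(age, sets, language="English"):
    """Build the nested structure; `sets` maps a set name to element names."""
    root = ET.Element("localizations")
    age_node = ET.SubElement(root, "age", name=age)
    for set_name, element_names in sets.items():
        set_node = ET.SubElement(age_node, "set", name=set_name)
        for element_name in element_names:
            element_node = ET.SubElement(set_node, "element", name=element_name)
            # Each element gets one translation child for the chosen language.
            ET.SubElement(element_node, "translation", language=language)
    return root


root = build_skeleton("BaronCityOffice", {"Journals": ["Sharper", "SharperPart2"]})

# Serialise to text; the output is unindented, which XML parsers do not mind.
print(ET.tostring(root, encoding="unicode"))
```

Only the opening statements carry the name qualifiers; the library writes the matching closing statements itself, which mirrors the pairing principle described earlier.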
Now we come to the actual body of the document.
We have the following short line: <cover src="xSharperJournalCover*1#0.hsm">
This is obviously a reference to the cover image. It will bear further scrutiny, and perhaps someone can shed some light on it. For now I am looking at the text component.
Next comes a header for the text that establishes the font, color, and margins. It is not altogether clear where this header starts, as some of it may be formatting for the cover image. It does appear that every qualifier starts with < and ends with >, which allows us to break the following into two lines.
<font size=18 face=Sharper color=982A2A>
<margin left=62 right=62 top=48>
Here we can see a font size of 18 points and a font face with the name Sharper (this appears to be a nice handwritten font), with a color of 982A2A. Read as a hexadecimal RGB value, this is a dark red, and it has the appearance of red ballpoint pen.
The next line establishes a left margin of 62 points and a right margin also of 62 points, which would suggest that the margins are measured from their respective edges and not from some common left datum. There is also a top margin of 48 points, which appears sensible.
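The observation that every qualifier starts with < and ends with > is enough to pull the settings out with a regular expression. This is a rough sketch, not a description of how Uru itself reads the line: it assumes each setting is a simple name=value pair without spaces in the value, and parse_format_tags is a name I made up.

```python
import re

# A tag is "<name attr=value attr=value ...>"; values run to whitespace or ">".
TAG = re.compile(r"<(\w+)((?:\s+\w+=[^\s>]+)*)\s*>")
ATTR = re.compile(r"(\w+)=([^\s>]+)")


def parse_format_tags(text):
    """Return [(tag_name, {attribute: value})] for each inline tag in text."""
    result = []
    for match in TAG.finditer(text):
        # Strip surrounding double quotes, as used by the cover src value.
        attrs = {name: value.strip('"') for name, value in ATTR.findall(match.group(2))}
        result.append((match.group(1), attrs))
    return result


header = "<font size=18 face=Sharper color=982A2A><margin left=62 right=62 top=48>"
for tag_name, attrs in parse_format_tags(header):
    print(tag_name, attrs)
```

All values come back as strings; whether the numbers are really points, pixels, or something else is not settled by the file, so the sketch makes no attempt to convert them.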
At this point we dive into the text, where blank lines appear as blank lines in the displayed document, and word wrapping and page splitting occur automatically.
Part way into the second page we see the statement <font color=000000>, which changes the ink to a standard black. This continues until a later point where we see the statement <font color=323853>; this is actually a slightly different shade of red ink. Later on this changes to yet another color with the setting <font color=323853>
Clearly this is intended to give the appearance of the same writer using different pens, and as such it is a nice level of illusion.
It can be seen that the body of the text is plain text, with only the insertion of codes to change the ink color.
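To see which stretch of text is written in which ink, the body can be cut at each font color code. This sketch assumes the codes always take the exact form <font color=RRGGBB> with six hexadecimal digits, and that a color stays in force until the next code; color_runs and hex_to_rgb are hypothetical helper names.

```python
import re

FONT_COLOR = re.compile(r"<font color=([0-9A-Fa-f]{6})>")


def color_runs(body, start_color):
    """Split body into (color, text) runs, starting in start_color."""
    runs = []
    color = start_color
    position = 0
    for match in FONT_COLOR.finditer(body):
        if match.start() > position:
            runs.append((color, body[position:match.start()]))
        color = match.group(1)  # the new ink applies from here on
        position = match.end()
    if position < len(body):
        runs.append((color, body[position:]))
    return runs


def hex_to_rgb(code):
    """Convert a six-digit hex string such as 982A2A to an (r, g, b) tuple."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


text = "written in red<font color=000000>written in black"
for color, run in color_runs(text, "982A2A"):
    print(color, hex_to_rgb(color), repr(run))
```

Converting each code to its red, green, and blue parts is a quick way to check what shade a given ink actually is.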
Here is the corresponding closing section that divides the first and second part of the Sharper Journal.
11.28.03 - What is it with this Phil character? The man is disturbing, yet intriguing. I was able to talk to him last night for a while in Kemo. He's probably just crazy but if they will listen to him, he could be very useful. More people need to hear him. He's agreed to meet with me again. This could be just what I needed.
</translation>
- </element>
<element name="SharperPart2">
- <translation language="English">
Here we see the closing elements of the first section followed by the opening of the second. It would appear that the staggered indentation is an artifact rather than a required element; at this point I am not sure whether the indentation is required at all, or whether it is simply there to support readability.
When we come to the end of the .loc file we find a smaller section of code: this is the content of the Wrinkled Note. The first three lines are the closing statements of the previous block of text.
</translation>
</element>
</set>
<set name="TextObjects">
<element name="WrinkledNote">
<translation language="English">
Dr. Watson -
Big problems. The house on Noloben is NOT empty.
I met someone there today. My D'ni isn't great, but
I spoke with him for a while. Yeah, he's D'ni and,
as we figured, he knows a lot about the creatures.
A WHOLE lot.
We obviously need a meeting ASAP.
- Marie</translation>
</element>
</set>
</age>
</localizations>
This is quite nice because it is small and self-contained, and it obviously carries over the formatting established earlier: it uses the margins, font, and ink color from the previous section of text. It would therefore appear that these settings, even though they sit inside <set>, <element>, and <translation> pairs, are some global change that stays in place beyond those boundaries.
Obviously the last two lines are the closing statements </age> and </localizations> for the complete file.
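A file with this shape can be walked with a standard XML parser to list what it contains. The sketch below uses an abridged copy of the Wrinkled Note block as its input. It assumes the file is well-formed XML, which in turn assumes that any inline formatting codes are stored in escaped form (for example &lt;font ...&gt;); if they are stored raw, a strict parser like this one will reject the file. The name list_translations is my own.

```python
import xml.etree.ElementTree as ET

# Abridged from the Wrinkled Note block shown above.
SAMPLE = """<?xml version="1.0" encoding="utf-16"?>
<localizations>
<age name="BaronCityOffice">
<set name="TextObjects">
<element name="WrinkledNote">
<translation language="English">Dr. Watson -
We obviously need a meeting ASAP.
- Marie</translation>
</element>
</set>
</age>
</localizations>
"""


def list_translations(data):
    """Return (age, set, element, language, text) for every translation."""
    root = ET.fromstring(data)
    found = []
    for age in root.findall("age"):
        for set_node in age.findall("set"):
            for element in set_node.findall("element"):
                for translation in element.findall("translation"):
                    found.append((
                        age.get("name"),
                        set_node.get("name"),
                        element.get("name"),
                        translation.get("language"),
                        translation.text or "",
                    ))
    return found


# Encode as UTF-16 bytes so the data matches the encoding named in the header.
for entry in list_translations(SAMPLE.encode("utf-16")):
    print(entry[:4], len(entry[4]), "characters")
```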
While it may be possible to build some Python code to produce this file, it may be easier just to produce a template. I am at this time unaware of whether there is some way of adding comments to that template; that could well be a subject for experimentation.
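The two ideas can be combined: a template held in Python and filled in with the names and body text. This is an untested-in-game sketch. LOC_TEMPLATE and render_loc are names I made up, the age name in the usage line is hypothetical, and the body is XML-escaped on the assumption that the game expects a well-formed XML file. On comments: XML syntax itself allows <!-- ... --> comments, but whether Uru's loader tolerates them is exactly the sort of thing that would need experimenting with.

```python
from string import Template
from xml.sax.saxutils import escape, quoteattr

# Structure copied from the sample file: one age, one set, one element.
LOC_TEMPLATE = Template("""<?xml version="1.0" encoding="utf-16"?>
<localizations>
<age name=$age>
<set name=$set_name>
<element name=$element>
<translation language=$language>$body</translation>
</element>
</set>
</age>
</localizations>
""")


def render_loc(age, set_name, element, body, language="English"):
    """Fill the template; names are quoted and the body is XML-escaped."""
    return LOC_TEMPLATE.substitute(
        age=quoteattr(age),
        set_name=quoteattr(set_name),
        element=quoteattr(element),
        language=quoteattr(language),
        body=escape(body),
    )


# "MyAge" and "Note" are placeholder names for illustration only.
text = render_loc("MyAge", "TextObjects", "Note", "Dear reader,\nhello.")
print(text)
```

To save the result in the encoding named in the header, the text can be written out with something like open(path, "w", encoding="utf-16").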
I am going to experiment with this and will add further notes if I find out anything. Feel free to add comments and I would be grateful if anyone can shed some light on the bit about the cover.
[Edit]
I was unsure whether you can make changes to a .loc file. In my local edition of Destiny I opened "CleftEnglish.loc" and made a couple of changes.
Specifically, I changed the opening to "Dear Bubba" and added the line "What again? What's with all this dreaming?", saved the result, and opened my local incarnation of Destiny.
I linked to the Cleft, opened the document, and had no issue reading my changes, so it would appear there is no checksum attached to the .loc files.
[/Edit]
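The same kind of edit can be scripted rather than done by hand in an editor. A minimal sketch follows, assuming the file is UTF-16 as its header states; replace_in_loc is a made-up helper, and it keeps a .bak copy before writing so the original can be restored.

```python
from pathlib import Path


def replace_in_loc(path, old, new, encoding="utf-16"):
    """Replace the first occurrence of `old` with `new`, keeping a .bak copy."""
    path = Path(path)
    # newline="" keeps the file's own line endings untouched on read and write.
    with path.open("r", encoding=encoding, newline="") as handle:
        text = handle.read()
    if old not in text:
        raise ValueError(f"{old!r} not found in {path.name}")
    backup = path.with_name(path.name + ".bak")
    backup.write_bytes(path.read_bytes())
    with path.open("w", encoding=encoding, newline="") as handle:
        handle.write(text.replace(old, new, 1))
    return backup


# Example call (the replaced text here is a placeholder, not the real opening):
# replace_in_loc("CleftEnglish.loc", "Dear reader", "Dear Bubba")
```

Since the game showed the edited text without complaint, a plain text replacement like this should be all that is needed, though that rests on the single experiment described above.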