In my last entry I talked about the dialog files and how they are stored in the Cory in the House DS game. In this entry, I'll be talking about what I call conversation files. In the dialog folder there are two files pertaining to each day, a bst file and a tdf file. The bst file contains all the dialog, while the tdf file tells the game what to do when you talk to someone. It contains things such as who says what or what kind of laugh track to use (did I mention there are laugh tracks yet?). There are other things too, such as setting an event, closing the dialog, or even altering the amount of money you have. Reverse engineering the conversation files took a lot of time, as there's no clear indication of what each byte does when first looking at them. Nearly all of the file is just a bunch of numbers and, to be honest, I had to look through the program I made to remember how this file is even structured.
To delve straight into the file, here's what it looks like with some highlighting to show what groups of bytes are related to each other.
Makes sense, right?
So, let's take each part one bit (no pun intended) at a time.
This part is the easiest, as it's the only part that has readable characters: the location of the dialog file. The 12 in red is the length of the filename in hexadecimal (18 in decimal). The 3 bytes in yellow are just zeros. I'm not sure what they're for, but as all files had them, I just let it go. The next 18 bytes are the filename of the dialog file, so the game knows which dialog to spew out when this file asks for the 12th or whatever line of dialog.
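A minimal sketch of reading that header, in Python. This is not the editor's actual code: the function name and the sample filename are made up for illustration, and the three unexplained bytes are simply returned as-is rather than interpreted.

```python
def read_header(data: bytes) -> tuple[str, bytes, int]:
    """Return (dialog filename, the 3 unexplained bytes, offset just past the header)."""
    name_len = data[0]                 # e.g. 0x12 = 18 characters
    unknown = data[1:4]                # always zero in the files described above
    name = data[4:4 + name_len].decode("ascii")
    return name, unknown, 4 + name_len


# Hypothetical 18-character filename, only for demonstration.
sample_name = b"dialog/day01_a.bst"
sample = bytes([len(sample_name), 0, 0, 0]) + sample_name
print(read_header(sample))
```

Whatever follows the filename starts at the returned offset, which is 22 for an 18-character name.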
This part of the file is a bit more confusing, but there is somewhat of a pattern to it. Notice that, as with the beginning of the dialog file, these are increasing values (0000, 0200, 0258, 0279, 0317, etc). And similar to the dialog file, this is the index of the file. I presume the game is programmed so that when you talk to such and such, it'll start conversation 6. Conversation six has an offset of 033D, so the game reads the file starting at that offset until the end of the conversation is reached. It's also worth noting that every file has room for a maximum of 256 conversations, which is why some of these values are 00 00 and why the first offset is 0200 (hexadecimal 0200 is 512 in decimal, or 256 groups of 2 bytes).
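The index could be read along these lines. Two things here are assumptions rather than facts from the game: that the 2-byte values are stored in the order they appear in the hex view (big-endian), and that 00 00 marks an unused slot. The function name is invented for the example.

```python
import struct

MAX_CONVERSATIONS = 256  # 256 slots * 2 bytes = 512 bytes (0x200)


def read_index(table: bytes) -> dict[int, int]:
    """Map conversation number -> offset, skipping slots that hold 00 00."""
    offsets = struct.unpack(">256H", table[:MAX_CONVERSATIONS * 2])
    return {slot: off for slot, off in enumerate(offsets) if off != 0}


# Sample table built from the values quoted above, padded with empty slots.
values = [0x0000, 0x0200, 0x0258, 0x0279, 0x0317]
table = struct.pack(">256H", *(values + [0] * (MAX_CONVERSATIONS - len(values))))
index = read_index(table)
print({slot: hex(off) for slot, off in index.items()})
```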
Now the fun part: this is where all the conversations are stored. Every conversation starts with the value 86 followed by 00 or 01. What the 00 or 01 does, I have no clue, but there's something special to take care of when 01 appears, which I'll get to later. I have this part highlighted in red. The next byte, in blue, is the number of "commands", as we'll call them. In this case the first one is 0A, or 10, commands. We can then read 10 commands and be sure that whatever comes after is either the next conversation or the end of the file. The next group of bytes, in orange, is the offset of the next command. We can use that together with the current offset to determine the length of the command. For example, we know the conversation starts at 0200, and the red, blue, and orange sections take up 5 bytes, so the command itself starts at 0205. The orange value is 020D, which means the length of the command is 020D - 0205 = 8. So, we have 00 07 00 00 00 00 00 21 as the command. What does that mean? It took a bit of trial and error, but this is the dialog command. 00 07 means it's a dialog. The next 00 means Cory is speaking (or so I assume; it was always 00 when Cory was talking, but changing it to something else never did anything that I could see). The 00 after that picks the kind of dialog box, the next 00 00 means to use the 0th line from the dialog file, and the 00 21 means to use character sprite 21 in hexadecimal (the happy Cory sprite).
This pattern continues until we reach the 10th command. After that we're back at an 86 value, and the whole thing repeats until the end of the file. The first conversation looks something like this when split apart.
The xx xx is the offset value I mentioned earlier, but it's not necessary to store these values as they are calculated.
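Because the offsets can be recalculated, writing commands back out only needs the command bodies and the position where the first one starts. Here is a small sketch of that idea. The function name is hypothetical, big-endian byte order is assumed from how the values read in the hex view, and `start` is taken to be the offset of the first command's 2-byte offset field.

```python
import struct


def write_commands(commands: list[bytes], start: int) -> bytes:
    """Prefix each command body with the offset at which that command ends."""
    out = bytearray()
    pos = start
    for body in commands:
        end = pos + 2 + len(body)          # 2 bytes for the offset field itself
        out += struct.pack(">H", end) + body
        pos = end
    return bytes(out)


# The first conversation starts at 0200; 86 00 0A occupy 0200-0202,
# so the first offset field sits at 0203.
dialog = bytes.fromhex("0007000000000021")
chuckle = bytes.fromhex("008602")
print(write_commands([dialog, chuckle], 0x0203).hex(" "))
```

With the 8-byte dialog command from the example, the first stored offset comes out as 02 0D, matching the value worked out by hand earlier.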
I mentioned that the value after 86 could be 00 or 01. Well, this is what it looks like if it's a 01:
There are three extra bytes after the 01 and before the number-of-commands value. I have no idea what these do, but storing them as is and writing them back unchanged didn't cause any problems, so that's what I did.
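Putting the pieces together, a conversation reader might look roughly like this. It is a sketch under assumptions, not the real editor: offsets are assumed to be big-endian and measured from the start of the buffer passed in, and the `Conversation` class and function name are invented for the example. The three extra bytes are kept untouched since their meaning is unknown.

```python
import struct
from dataclasses import dataclass


@dataclass
class Conversation:
    flag: int              # the 00 or 01 that follows 86
    extra: bytes           # the 3 unexplained bytes, present only when flag is 01
    commands: list[bytes]  # raw command bodies, without their offset fields


def read_conversation(buf: bytes, start: int) -> tuple[Conversation, int]:
    """Parse one conversation at `start`; return it and the offset just past it."""
    pos = start
    if buf[pos] != 0x86:
        raise ValueError(f"expected 86 at {pos:#06x}, found {buf[pos]:02x}")
    flag = buf[pos + 1]
    pos += 2
    extra = b""
    if flag == 0x01:
        extra = buf[pos:pos + 3]
        pos += 3
    count = buf[pos]
    pos += 1
    commands = []
    for _ in range(count):
        (end,) = struct.unpack_from(">H", buf, pos)  # where this command ends
        pos += 2
        commands.append(buf[pos:end])
        pos = end
    return Conversation(flag, extra, commands), pos


# 0x200 bytes of filler so the sample offsets line up with the ones in the text.
body = bytes.fromhex("86 00 02  02 0D 00 07 00 00 00 00 00 21  02 12 00 86 02")
conv, end = read_conversation(bytes(0x200) + body, 0x200)
print(conv, hex(end))
```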
Now that we know how to read the file, the next part is to work out what all these commands are. What does it mean when a command starts with 00 86? Or 00 27? The only way to find out is to see what happens. Dialog was the easy one to figure out, as it appears most often, so looking at whatever sits between two lines of dialog can help determine what the other commands do. Take 00 86 02 in the above image, for example. The dialog before that was Cory saying "And they're sending me a boatload of them so I can sell them to tourists. Hello money!" to which the audience has a light chuckle. There we go: 00 86 tells the game to play the laugh track. There's also a 00 86 01 in there, which is a louder laugh after Victor says "That's what I'm afraid of." So the value after 00 86 tells the game which laugh track to play. There's also a 00 86 00 laugh, which is the equivalent of a Big Bang Theory Windows 7 joke.
I kept on doing this for other commands and was able to guess a good number of them, which are:
00 00 xx - Set Event
    xx = 8-bit integer representing the event
00 07 aa bb cc cc 00 dd - Dialog
    aa = 00 Cory is talking
         01 Someone else is talking
    bb = 00 Normal dialog box
         01 Thought dialog box
         02 Phone dialog box
    cc cc = 16-bit integer representing the line of dialog from the dialog file
    dd = 8-bit integer representing the character speaking
00 09 xx - Close Dialog
00 27 xx xx 00 00 - Add/Subtract Money
    xx xx = 16-bit signed integer
00 86 xx - Sound effect
    xx = 00 - Uproarious Laughter And Applause
         01 - Loud Laughter
         02 - Chuckle
         07 - Phone Ring
         08 - Phone hang up

Once I got all these figured out along with the dialog file, I could code up an application to change these things and display them in a fashion that's more readable to a human, which ended up looking like this:
Written by: Larry David
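The command table above can be turned into a small decoder that prints commands in a more human-readable form. This is only a sketch: the function and dictionary names are made up, multi-byte values are assumed to be big-endian, and anything not in the table is reported as unknown rather than guessed at.

```python
import struct

SOUNDS = {
    0x00: "Uproarious laughter and applause",
    0x01: "Loud laughter",
    0x02: "Chuckle",
    0x07: "Phone ring",
    0x08: "Phone hang up",
}
BOXES = {0x00: "normal", 0x01: "thought", 0x02: "phone"}


def describe(cmd: bytes) -> str:
    """Turn one raw command body into a readable description."""
    opcode = cmd[:2]
    if opcode == b"\x00\x00" and len(cmd) == 3:
        return f"Set event {cmd[2]}"
    if opcode == b"\x00\x07" and len(cmd) == 8:
        speaker = "Cory" if cmd[2] == 0x00 else "someone else"
        box = BOXES.get(cmd[3], f"unknown ({cmd[3]:#04x})")
        (line,) = struct.unpack(">H", cmd[4:6])
        return f"Dialog: {speaker}, {box} box, line {line}, sprite {cmd[7]:#04x}"
    if opcode == b"\x00\x09" and len(cmd) == 3:
        return "Close dialog"
    if opcode == b"\x00\x27" and len(cmd) == 6:
        (amount,) = struct.unpack(">h", cmd[2:4])  # signed 16-bit
        return f"Money {amount:+d}"
    if opcode == b"\x00\x86" and len(cmd) == 3:
        return "Sound: " + SOUNDS.get(cmd[2], f"unknown ({cmd[2]:#04x})")
    return "Unknown: " + cmd.hex(" ")


for raw in ("0007000000000021", "008602", "0027ff9c0000", "000900"):
    print(describe(bytes.fromhex(raw)))
```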
So, I guess the next question is how was I even able to figure all this stuff out? A lot of this was trial and error, but a good portion of it was trying to find patterns. 86 00 appeared a lot, same with 02 (which then became a lot of 03's, then 04's). Seeing those increase meant looking at the value next to them, which also increased, so I knew these had to be related to something with an index, counter, or something similar. 00 07 appeared a lot, too. But once I found the patterns, the hard part was finding out how they all relate to each other, and that really was just a bunch of trial and error with editing the file, recompiling the ROM, and playing the first scene.
After I got a decent understanding, I wrote a file reader/writer and created this masterpiece:
And this, to see all the dialog sprites and laugh tracks:
There's still a bit I don't know about this file, but getting a good enough understanding to write an editor has to be some kind of accomplishment.
The next and final entry will be about the dialog head sprite files. I leave you with an early version of the editor in action set to very inappropriate music.