SquareSoft/Square Enix - .dat(/.sz) demuxer by AnonRunzes at 6:30 PM EDT on November 4, 2017
Okay, so I'm about to show you something unbelievable. It's an untouched demuxer for those strange movie formats used in SquareSoft/Square Enix games. Here it is: squaresoft-square.enix_sz_dat.bms
Basically, it does exactly what I mentioned: it demuxes the .dat/.sz movie format seen in Driving Emotion Type-S and a few other SquareSoft/Square Enix games, if you come across one. The script can handle individual movie files such as .dat too! Most of the demuxing happens on the video/audio chunks, which have to be set manually (read the script for more info).
However, it does not do any kind of post-processing on these chunks, which is why I call it an untouched demuxer for these movie formats.
Okay, so I updated this whole script to flesh out the "single .DAT movie type" part, alongside a few things which I'll get into later.
So I just found out (through FFXI's OPN.DAT file, which is exclusive to the PS2 version) that in those .DAT movie files, "ID" value 4, where audio chunks are stored, can hold several different audio formats. Depending on how often they show up (not to mention how much the size of these chunks tends to vary throughout the three 32-bit fields after the first one), these formats are indicated through a single value made of three -1 bytes, followed by a single byte that says which format is in use. You can thank quickBMS's "endian" detection for that.
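To make the "three -1 bytes plus a format byte" idea a bit more concrete, here's a rough Python sketch (not the actual quickBMS code). The function name is made up for illustration, and I'm assuming the three 0xFF bytes come first in the order the field is read; the real byte order in the file may differ depending on endianness.

```python
from typing import Optional


def audio_format_tag(field: bytes) -> Optional[int]:
    """Return the format byte of a 4-byte field, or None if it is not a tag.

    Assumption: a tag is three 0xFF (-1) bytes followed by one format byte,
    in the order the bytes are handed to this function.
    """
    if len(field) != 4:
        raise ValueError("expected exactly 4 bytes")
    if field[:3] == b"\xff\xff\xff":
        return field[3]
    return None


if __name__ == "__main__":
    print(audio_format_tag(b"\xff\xff\xff\x02"))  # tagged field
    print(audio_format_tag(b"\x00\x10\x00\x00"))  # ordinary field
```

Anything that doesn't start with three 0xFF bytes is treated as a normal field here, which is only a guess at how such a check might look.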
To let these audio files be extracted, I made another variable (EXTRACT_AUDIO2) that can be set to 1 just for that. If it is set to 0, the extraction will happen as-is.
I've also made a few changes related to the .SZ/.DAT format. The EXTRACT_AUDIO variable has been restructured, meaning that the audio extraction for The Bouncer's .SZ/.DAT files has been moved to value 2 of that variable. If EXTRACT_AUDIO is set to 1, the script extracts SquareSoft's tweaked PlayStation ADPCM chunks as one whole file while omitting the first 0x10 bytes of each chunk.
And finally, the FINAL_EXTRACTION variable has also been restructured. A new value, 3, which extracts the audio files stored in single .DAT movie files, is now available, although value 2 of that variable is currently unused until further notice.
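For the "omit the first 0x10 bytes of each chunk" part, a minimal Python sketch of the idea could look like this. It assumes the audio chunks have already been pulled out of the container as separate byte strings; the names `HEADER_SIZE` and `strip_chunk_headers` are mine and don't appear in the script.

```python
from typing import Iterable

HEADER_SIZE = 0x10  # bytes dropped from the start of every chunk


def strip_chunk_headers(chunks: Iterable[bytes]) -> bytes:
    """Join all chunks into one stream, skipping each chunk's first 0x10 bytes."""
    return b"".join(chunk[HEADER_SIZE:] for chunk in chunks)


if __name__ == "__main__":
    demo = [bytes(HEADER_SIZE) + b"AB", bytes(HEADER_SIZE) + b"CD"]
    print(strip_chunk_headers(demo))
```

A chunk shorter than 0x10 bytes simply contributes nothing in this sketch.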
Today, this script has been updated to its third version. I will try to explain all the changes that went into this update.
For starters, this script can now "split" .pcm and .ac3 files into two layers each (assuming the chunks that contain two separate codecs merged into one were already saved as two separate files to begin with). That splitting is based on how big the audio chunk size actually is, which sadly means I also had to guess everything based on what I saw in these audio chunks.
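The general idea behind layer-splitting by a fixed block size can be sketched in Python like this. It's a plain round-robin deinterleave, not a port of the quickBMS code; `split_layers`, `block_size` and `layer_count` are hypothetical names, and the real script works both values out from the audio chunk size instead of taking them as arguments.

```python
from typing import List


def split_layers(data: bytes, block_size: int, layer_count: int) -> List[bytes]:
    """Deal fixed-size blocks of `data` out to `layer_count` layers in turn."""
    if block_size <= 0 or layer_count <= 0:
        raise ValueError("block_size and layer_count must be positive")
    layers = [bytearray() for _ in range(layer_count)]
    for index, offset in enumerate(range(0, len(data), block_size)):
        # Block 0 goes to layer 0, block 1 to layer 1, and so on, wrapping around.
        layers[index % layer_count] += data[offset:offset + block_size]
    return [bytes(layer) for layer in layers]


if __name__ == "__main__":
    first, second = split_layers(b"AABBAABB", block_size=2, layer_count=2)
    print(first, second)
```

A trailing block shorter than `block_size` is kept as-is and goes to whichever layer is next in line.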
Most of the code used for the layer-splitting part was "based on" (read: copy-pasted from) the deinterleaver.bms script I wrote at least 5 months and 3 days ago. The best part is that I made half of the script into a poorly-optimized piece of shit quickBMS code, so if you don't know how to program in quickBMS like me, it's time to get yourself prepared 'cuz your PC's goin' to get ya' memory-anus handled in no time!
This is where the whole extraction process gets interesting. Not only did I manage to think about this, I also thought about byte-swapping the resulting .ac3 files in 16-bit units to get a playable result, and the script already does this before said .ac3 files are even split in the first place! Again, how the output .ac3 files get split depends on the entire size of the audio chunk, which unfortunately is also the "chunk sign" of said chunk. Also, whether the script byte-swaps an .ac3 file depends on the first two bytes of that file: if it already looks like a "normal" .ac3 file, the script handles the extraction/layer-splitting process without much problem. I left the .pcm files untouched though, since you might want to write some .TXTH files for those. I haven't had much luck with .TXTH stuff ever since I tried making one.
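Here's roughly what the 16-bit byte swap plus the "first two bytes" check could look like in Python. A standard AC-3 frame starts with the syncword 0B 77, so a file starting with 77 0B is treated as byte-swapped. The function names are mine, and this is only a sketch of the idea rather than what the script literally does.

```python
AC3_SYNCWORD = b"\x0b\x77"  # first two bytes of a standard AC-3 frame


def swap16(data: bytes) -> bytes:
    """Swap every pair of bytes; a trailing odd byte is left where it is."""
    even = len(data) - (len(data) % 2)
    swapped = bytearray(data)
    swapped[0:even:2] = data[1:even:2]
    swapped[1:even:2] = data[0:even:2]
    return bytes(swapped)


def normalize_ac3(data: bytes) -> bytes:
    """Byte-swap the stream only if it starts with a reversed syncword."""
    if data[:2] == AC3_SYNCWORD[::-1]:
        return swap16(data)
    # Already "normal" (or not recognizable as AC-3): leave untouched.
    return data


if __name__ == "__main__":
    print(normalize_ac3(b"\x77\x0b\x34\x12").hex())
```

Data that starts with neither byte order is passed through unchanged, which is a choice made for this sketch.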
Anyways, time for a tl;dr changelog:
v01 - Initial version.
v02 - (Driving Emotion Type-S, Gekikukan Pro Baseball: The End of the Century 1999, Final Fantasy XI (PS2), Front Mission 5) The extraction process has been expanded. The script can now extract raw ADPCM-like audio chunks if EXTRACT_AUDIO is set to 1 (.SZ/.DAT). EXTRACT_AUDIO2 is now added, and can already support three audio chunk types (.DAT).
v03 - (The Bouncer) The script can now handle PCM/AC-3 chunks successfully through way of splitting by (fixed) chunksize, and will also handle .ac3 files by itself to get a playable result - if EXTRACT_AUDIO is set to 2. New chunk types have been added for this purpose within the EXTRACT_AUDIO variable (.SZ/.DAT).
Okay, so the script has now been updated to its fourth version. It can now support layer-splitting extraction of individual PCM chunks through a fixed chunksize. However, just like the .ac3 splitting case described above, how many layers get split out of these .pcm files depends on how big the chunksize for these chunks tends to be.
In other news, I also integrated a changelog into the header of this script for historical reasons.
You can find it here through the usual pastebin link.
EDIT1: Here's the .TXTH needed to play headerless .pcm files after the entire audio-demuxing process: .pcm.txth