Metal Gear Solid V RIFX streams by GirianSeed at 11:24 PM EDT on April 25, 2014
I've been digging around the game data of Metal Gear Solid V: Ground Zeroes lately, and I've noticed that most of the FSM files in the demo_stream directory contain Wwise RIFX streams.
My problem is that ww2ogg can't seem to convert them. It'll spit out files a mere fraction of the size that end up sounding like junk data.
There were also a few sound effects in the sound banks that couldn't be converted either (some with the extension .wav but with a RIFX header), but the cutscene audio is my main concern.
I tried testing this again. I extracted one of the RIFX streams from an FSM file using Ravioli Game Tools. After that, I ran the file through ww2ogg.exe with the updated codebook switch and it gave me this error.
Input: File0001.wwise
RIFX WAVE 6 channels 48000 Hz 440648 bps
9347200 samples
- 6 byte packet headers, no granule
- external codebooks
- shortened Vorbis packets
Output: File0001.ogg
Parse error: ran out of space in an Ogg packet
I should mention that this time the outputted file was actually listenable, but it only lasted for about five seconds.
I'm confused as to what exactly you're trying to do. If you need the game score, check here: http://forums.ffshrine.org/f72/metal-gear-solid-ground-zeroes-gamerip-score-169667/
I've downloaded that gamerip, but I have some issues with it (tracks only loop once, some tracks are unidentified, all the music was mixed down to stereo from 6/5/4 channels, etc.), so I've decided to make my own. Plus, it's missing all of the music from the game's cutscenes.
I've already converted all of the background music and most of the sound effects, but I've yet to properly decode the audio streams packaged with the cutscene data. Even if MGSV handles the sound like it did with previous games (sound effects, BGM, and voices all mixed into one), I still want to get my hands on a clean rip.
Also, if I can manage this now, chances are that it'll also work for The Phantom Pain. I don't know about you, but I think it'd be great if we could already figure the game's idiosyncrasies out by the time it's released.
I can try to help you if you want. Can you send me the files you have (bgm, sfx, voices, cutscene music, etc.) on mediafire or mega, in one zip/rar file? I will see what I can do. Thanks a lot man! :D
Sure. The download link to the game's demo_stream directory is in the first post. As for the rest, I'll try to make some space in my Dropbox to upload it.
You could also just download the whole game and use a tool like PSARC or Xbox Image Browser to extract all the assets like I did. The entirety of Ground Zeroes is only 2GB in total.
OK, but you left me confused here and I'm worried that we might be missing something from the start.
Your filename is demo_stream.zip and you mention the entirety of the game is 2GB?
I saw somewhere that the PS3 ISOs are indeed around 2GB while the 360 ones are around 7GB, so what's going on there? Is the PS3 one just a demo version while the 360 one is the full game?
I'm not sure why the 360 ISO floating around on the net is an extra 5GB in size, but I've extracted the data and it only adds up to about 1.51GB. Even the PS4 and Xbox One versions are only roughly 2GB as well.
I did some research, and it seems it's because the PS3 handles cutscenes in real-time. The 360 cannot do that, hence why cutscenes take more space. Anyway, as long as we have the full thing, it is fine.
I'm pretty sure that all versions of the game render the cutscenes (or polygon demos, if you will) in real-time and that the 360 ISO's increased size is due to the way the file format fills in blank data, or something along those lines.
Either way, I digress. What I really need to know is how to fix the parse error when converting the streams with ww2ogg. Here's the sample from the test I mentioned in the second post.
OK I got the entire game extracted and all files (ps3 version). I see tape files (.wem) and Wwise sound bank files (.sbp) and also fsm files seem to have sound files in them. Question is how you know which ones are the cutscene music?
I don't specifically know what each stream is, actually. (The audible part of the conversion test file I uploaded was helicopter noise, but I don't know what comes after that.) I'm just trying to properly decode them first and then see what I can find.
I haven't tried it, but the tools here can create Wwise files out of WAV files, and maybe they can also edit already-created Wwise files with load/import? I don't know; it's a huge download and install, but you could try it and see if it opens Wwise files and lets you export or change the WAV files within.
The source files in that archive were last updated in 2009. I doubt that the compiled tool would be able to handle the newer RIFX streams that Wwise started using after 2011 -- Ground Zeroes included, since I had to use ww2ogg's newer codebook just to get anything listenable.
Again, if anyone has any information regarding this error, I'm all ears and I'd be very grateful.
Input: File0001.wwise
RIFX WAVE 6 channels 48000 Hz 440648 bps
9347200 samples
- 6 byte packet headers, no granule
- external codebooks
- shortened Vorbis packets
Output: File0001.ogg
Parse error: ran out of space in an Ogg packet
I wish I had the knowledge to rectify this problem myself, but sadly I don't and have to hope that there's someone else who does that's also willing to help.
I think I figured out what the problem is, it's the codebooks.
From readme file: "If you are getting totally invalid files as output, for a game produced after mid-2011, try using the --pcb packed_codebooks_aoTuV_603.bin switch"
If that doesn't work, we somehow need the latest codebooks from the Wwise SDK or something, maybe a .bin file? That is the only way you'll get this to work.
Yeah, that's the same thing I got, but the stream should be -much- longer than that. It's an incomplete conversion. The original RIFX Vorbis file was 10MB, whereas the OGG file ww2ogg spits out is roughly 5% the size of that.
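As a rough sanity check on "much longer", the numbers in the ww2ogg output above can be turned into an expected duration and size. This is only a sketch: it assumes the "bps" figure is the stream's average bits per second, and the function names are made up for illustration.

```python
# Values copied from the ww2ogg output quoted earlier in the thread.
SAMPLE_RATE_HZ = 48000
TOTAL_SAMPLES = 9347200
AVG_BITS_PER_SECOND = 440648  # assumption: "bps" = average bits per second


def expected_duration_seconds(total_samples, sample_rate_hz):
    """Length of the full stream if every sample were decoded."""
    return total_samples / sample_rate_hz


def expected_size_bytes(duration_seconds, avg_bits_per_second):
    """Approximate size of the encoded audio, ignoring header overhead."""
    return duration_seconds * avg_bits_per_second / 8


duration = expected_duration_seconds(TOTAL_SAMPLES, SAMPLE_RATE_HZ)
size = expected_size_bytes(duration, AVG_BITS_PER_SECOND)
print(f"expected duration: {duration:.1f} s")
print(f"expected size: {size / 1e6:.1f} MB")
```

Under those assumptions the stream should run a bit over three minutes and weigh in at roughly 10.7 MB, which lines up with the 10MB source file and makes a five-second output look like a cut-off conversion rather than a short clip.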
It's because the original files are surround stereo and this converter sadly only gives you mono, it seems. It's hard to believe the original WAV files encoded into Wwise format were mono...
As for the complete file, to prove it, send me another file on Dropbox, or a couple of other Wwise files.
Actually, most of these -- if not all -- are 6-channel. I went back and tried to extract as many files as I could with Ravioli. (Some streams weren't detected properly, which makes me suspicious.)
Almost every file either gave me an error saying that it ran out of space in an Ogg packet, or that the file was truncated. There was just one audio stream I managed to convert properly, and it's from p12_050070_000.fsm.
Also, the music is mixed in with the rest of the sound effects like the previous games. There are just a lot of near-duplicate scenes to account for different variables. For example, if the player were to move other prisoners around before he opened Chico's cage, they wouldn't be present in the cutscene.
Open the source file Bit_stream.h from the ww2ogg folder in Notepad++ or another editor that shows line numbers. Go to line 247, where that parse error is mentioned; I found it using the AstroGrep utility.
In other words, the error appears because there seems to be something wrong with the bits or bit depth of the stream. But what...
Maybe we're getting somewhere now, with this source code helping us understand.
I tried to contact Wwise to ask them how to edit already created .wwise files in case I made a mistake or need to further add something (obviously a fake story just to know how to import those files :P ) but their main contact thing requires you to be a licensed customer (aka paying cash for that)...so no... But would be fun to know what they say to that. Maybe paying customers get additional tools to edit those files, idk, free license Wwise tools and SDK have no such option.
This is quite a strange problem. It's not like ww2ogg can't decode the files -- they're quite listenable -- it just gives out during the conversion process.
At 256K into this file (0x40000 in File0001.wwise) there is a " DNS" header (little endian "SND "). This is approximately where ww2ogg explodes when trying to parse this file. Allow me to explain a bit about game audio and memory subsystems as background for my understanding of the issue:
In general a game wants to load up all the crap for one stage at the same time; this includes any sound effects it'll be using, sequences if there's sequenced music, instrument samples, etc. Compared to dynamically loading stuff as needed, loading a whole chunk like this, even if some of it won't be used, has a lot of benefits, importantly:
1) predictable load time (no seeking, no occasionally things being much faster or slower if the expected item was/wasn't already loaded)
2) optimal use of space (memory fragmentation is a non-issue)
This data is called "in-memory" or "resident".
Of course in general it is not possible to get everything in memory at once. The next best thing is streamed data: if we know we'll be needing some constant bitrate of video, or audio, only for as long as it takes to play it, we can set aside a reasonably-sized buffer for a portion of the data and load/play/load/play/etc. This preserves the benefits mentioned above: load time happens but it can be amortized over the whole playback so as to be unnoticeable, and we only need to reserve as much space as we need to be confident that the buffer won't underrun.
Often the audio middleware handles in-memory and streamed samples with a lot of common code. The file formats are often similar or identical, though with in-memory stuff you'll find those files packed into larger "sound bank" files that are loaded at once. Most middleware will require streamed audio to be represented by header data in sound banks, even if the actual audio data is stored elsewhere on the disc and only streamed in on-demand.
Streaming has a big problem, though: starting up a stream is not instantaneous, as the data has to be prebuffered. If the engine wasn't expecting to play a streamed sound there could be a large delay before it starts while it initially fills the buffer from scratch.
A hybrid approach is sometimes used, and I think this is the difficulty you are encountering here: The in-memory sound bank contains the beginning of the audio file, the header data and a second or so of audio data, and the rest is located in another file elsewhere on disc to be streamed in. The resident beginning of the file can be played immediately as it is already in memory, and ideally this will give enough time for the streaming to get going.
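To put a number on how much audio such a resident beginning might hold in this case, here is a back-of-the-envelope sketch. It assumes everything before the 0x40000 offset mentioned above is audio for this stream, and that the "bps" value printed by ww2ogg is the average bits per second; the function name is hypothetical.

```python
RESIDENT_BYTES = 0x40000      # offset of the " DNS" marker in File0001.wwise
AVG_BITS_PER_SECOND = 440648  # assumption: "bps" = average bits per second


def resident_seconds(resident_bytes, avg_bits_per_second):
    """Approximate playback time covered by the resident part of the file."""
    return resident_bytes * 8 / avg_bits_per_second


seconds = resident_seconds(RESIDENT_BYTES, AVG_BITS_PER_SECOND)
print(f"resident audio: about {seconds:.1f} s")
```

That comes out a little under five seconds, which would fit with the converted file only being listenable for about five seconds before ww2ogg gives up.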
This is a fine engineering solution but it makes things a mess for us trying to read the audio! In the sound bank we can find the very beginning of the sound file but the rest may be somewhere entirely different on the disc. For systems like Wwise where everything is numbered with hash codes it is hard to figure this out by inspection. What you are seeing here is that the RIFF is actually truncated, the rest of it lives somewhere else, and there is something else stuffed in the sound bank next.
In particular this is frustrating because the rest of the audio data may be stuck somewhere with no header! If we knew more about how the whole Wwise bank system works it might be easier to locate this headless data.
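One way to check an extracted stream for this kind of truncation is to compare the size declared in the RIFF/RIFX header against the offset of the first " DNS" marker. The following is a minimal sketch, not a Wwise tool: the helper names are hypothetical, it assumes the size field follows the usual RIFF convention (total length minus 8 bytes), and treating " DNS" as the start of unrelated data is only the guess described above.

```python
import struct


def riff_declared_size(data):
    """Total stream size implied by a RIFF/RIFX header, or None if absent."""
    if len(data) < 8:
        return None
    tag = data[:4]
    if tag == b"RIFX":    # big-endian size field
        (chunk_size,) = struct.unpack(">I", data[4:8])
    elif tag == b"RIFF":  # little-endian size field
        (chunk_size,) = struct.unpack("<I", data[4:8])
    else:
        return None
    return chunk_size + 8  # the size field excludes the 8-byte tag+size


def inspect_stream(data, marker=b" DNS"):
    """Report how much of the declared stream is present before the marker."""
    declared = riff_declared_size(data)
    if declared is None:
        raise ValueError("not a RIFF/RIFX stream")
    marker_offset = data.find(marker)  # -1 if the marker never appears
    usable = len(data) if marker_offset < 0 else marker_offset
    return {
        "declared_size": declared,
        "marker_offset": marker_offset,
        "usable_bytes": usable,
        "missing_bytes": max(0, declared - usable),
    }


# Synthetic example: a header that claims 1008 bytes, cut off after 64.
fake = (b"RIFX" + struct.pack(">I", 1000) + b"WAVE"
        + b"\x00" * 52 + b" DNS" + b"\x00" * 20)
report = inspect_stream(fake)
print(report)
```

On a real file the same call would be made on the bytes read from disk, e.g. the contents of File0001.wwise. A large missing_bytes value together with a marker at 0x40000 would support the idea that only the resident beginning of the stream is in the FSM file.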