Sonic Lost World Sound Files by Kurausukun at 12:22 PM EST on January 14, 2015
In Lost World Wii U, all of the streams are in a file called "bgm_streamfiles.awb," which is ~200MB. Is this a known archive type/can I extract it? The rest of the files are .acb (or also .awb). Same question for those.
Thanks, I didn't know VGMToolBox had that function. Turns out they're all just .adx files, which is weird because I thought for sure they would be HCA. But the real kicker is that ALL OF THEM ARE 32000Hz
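For anyone who wants to check the sample rate of an extracted file themselves, here is a minimal sketch that reads the fixed fields at the start of an ADX stream. It assumes the commonly documented ADX layout (0x80 0x00 signature, big-endian fields, sample rate at offset 0x08), which is not spelled out in this thread; `read_adx_header` is just an illustrative name.

```python
import struct

def read_adx_header(data: bytes) -> dict:
    """Parse the fixed big-endian fields at the start of an ADX stream.

    Assumed layout (community documentation, not from this thread):
    0x00 signature 0x8000, 0x02 offset field, 0x04 encoding type,
    0x05 block size, 0x06 bit depth, 0x07 channel count,
    0x08 sample rate (uint32), 0x0C total samples (uint32).
    """
    if len(data) < 0x10 or data[0:2] != b"\x80\x00":
        raise ValueError("does not look like an ADX header")
    offset_field, encoding, block_size, bit_depth, channels = struct.unpack_from(">HBBBB", data, 2)
    sample_rate, total_samples = struct.unpack_from(">II", data, 8)
    return {
        "offset_field": offset_field,
        "encoding": encoding,
        "block_size": block_size,
        "bit_depth": bit_depth,
        "channels": channels,
        "sample_rate": sample_rate,
        "total_samples": total_samples,
    }
```

Passing it the first 16 bytes of an extracted .adx should be enough to see whether the file really is 32000Hz.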
by RebeccaSugar at 4:36 PM EST on January 14, 2015
only ninty does that sample rate, oh well, care to upload?
Uploading as I type, will update this post with the link once it's done. Also, no DLC files, unfortunately. I didn't dump this myself, someone over at Sonic Retro did because the common key was leaked. The reason I was surprised at the sample rate is because Sega almost always samples their in-game files at 48000, in fact this is the first (console) game I can think of since maybe Shadow the Hedgehog that had 32000.
EDIT: Here https://mega.co.nz/#!gllTAapA!kx9uJpjnNJTPiu0Kt_b6jZ2ZG-KCWrjc-RGFn38APFY edited 10:23 PM EST January 14, 2015
edited 11:12 PM EST January 14, 2015
by RebeccaSugar at 5:39 PM EST on January 14, 2015
Shadow the hedgehog had 48khz actually, I just checked, unless they're upsampled.
Shadow the Hedgehog had 48kHz on XBOX, but on GC it was 32. We're talking Nintendo consoles here, but if you want to include them all, then we'd have to go all the way back to SA2(B).
There isn't a specific tool AFAIK; like soneek said earlier in the thread, use VGMToolBox's adx finder. Also, I don't think the voices would be in the .acb file (unless you're sure they are), I would expect them to be part of a larger .awb archive.
edited 2:57 PM EST January 29, 2015
by TabuuAkugun at 10:20 AM EST on January 29, 2015
The reason I'm sure they're in the ACB itself is because there's no AWB with her name on it AFAIK.
EDIT: All that's there is the music and the echoed voices that play before her boss battles.
EDIT 2: I cannot for the life of me unpack those accursed ACBs.
Thanks. I'm adding it to VGMToolbox, so you'll definitely see it when it's ready for public tests.
Regarding name matching yes, it uses the CueNameTable and WaveformTable entries in the ACB's UTF Table to match the cue IDs located in the AWB (AFS2) file. Hcs's utfview.exe has been invaluable in helping with this analysis.
Still trying to figure out how to map the ACB to AWB. The previous samples I researched had the same number of file names in the ACB as music files in the AWB. This one is different.
Finally figured out the CueName to AWB mapping. I'm throwing this up for historical purposes and since Sourceforge SVN is still down. It's been a long time since I worked with pointers, so bear with me if the notation is off:
I've uploaded a beta VGMToolbox with the WIP ACB/AWB extractor here. It does not yet extract data from within the ACBs themselves, although I think I understand things enough to add it soon. I've only had samples for 3 file types (AT3, ADX, and HCA), so if you get an output file extension with ".EncodeType-XX.bin", please report it so I can review the file and update the code.
This version of VGMT also has a CPK extractor added. The ISO Extractor also has a CPK browser. I got sick of waiting for thousands of voice files to extract for 30 BGMs. I haven't added support for encrypted CPKs yet.
Thanks so much to hcs for the UTF/CPK code and posts. The base UTF info came in handy for both CPK and ACB/AWB stuff.
Error processing <C:\Users\Infernus Animositas\Desktop\ACB\strm.acb>. Error received: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: startIndex
strm.acb is from Digimon All-Star Rumble on Xbox 360.
BIRD_BGM.acb is from Dragon Ball Xenoverse on PC.
Here are the ACB files for further investigation http://www90.zippyshare.com/v/P6aqG4pI/file.html
Thanks. A quick check with hcs's utfview threw unexpected end of file errors. Are you sure these are the complete files? If so, I'll take a closer look to see if this is a new format or something.
I figured out the problem, and uploaded a fixed version here.
For anyone else familiar with UTF, the samples provided seem to indicate that the offset to row information is actually a ushort at offset 0xA rather than a uint at 0x8.
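To make the difference concrete, here is a rough sketch that reads the rows offset both ways from the start of a table. It assumes the table begins with the "@UTF" signature and that the fields are big-endian; the function name is made up for illustration and this is not VGMToolbox's actual code.

```python
import struct
from typing import Tuple

def rows_offset_candidates(table: bytes) -> Tuple[int, int]:
    """Return (uint32 read at 0x8, uint16 read at 0xA) from a UTF table header.

    Assumption: "@UTF" signature at 0x0 and big-endian fields.
    """
    if len(table) < 0xC or table[0:4] != b"@UTF":
        raise ValueError("not a UTF table")
    (as_uint_at_8,) = struct.unpack_from(">I", table, 0x8)
    (as_ushort_at_a,) = struct.unpack_from(">H", table, 0xA)
    return as_uint_at_8, as_ushort_at_a
```

The two readings agree whenever the bytes at 0x8 and 0x9 are zero, which would explain why the uint interpretation worked on earlier samples. If those two bytes are non-zero, the uint reading gives a bogus offset, which would fit the out of range errors above.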
The very first extracted Cue ID is 00000_BRD_BBGM_00_GOK and according to the ACF as shown here and the waveform's number of samples for that file it should be 4794827
But the converted HCA has 5266364 samples which is the exact number of samples for the second Cue ID 00001_BRD_BBGM_01_VGT as seen here and here
EDIT: Here's the ACF file for further inquiry http://www51.zippyshare.com/v/kxnRRZ2G/file.html
Thanks a lot for the testing, I really appreciate it. It turns out this ACB had a different ReferenceType in the CueTable, 3. This resulted in a different offset correction for the Waveform ID, making the formula above:
Please keep the bug reports coming. I also realized I hadn't added ITOC support to the CPK functions yet. I'm adding that and a brute force decryptor as well for CPK in case the keys change.
Here's an updated copy. It looks like maybe extra data is left in for files that are not actually there. _Or_, I need to do more analysis, since CRI Viewer seems to think some of these files exist, but I cannot test without the AWB.
The interesting part is that it adds silence samples to the track and the number of samples can range anywhere from 100 to 5000 additional samples.
Not sure why that is though but it works just fine.
Since it's Japanese, I've roughly translated the batch decode and decryption files as well.
The decoder can also show you the file header information of an HCA too :)
In Xenoverse there are 81 HCA files in the BGM AWB archive but there are only 72 Cue IDs in the corresponding ACB.
I'd imagine the same thing happens in other games that use this middleware as well.
Is there a way to have those additional HCAs extracted but without filenames (maybe just their offsets?) without using the HCA Extractor and manually sorting which ones weren't included from the ACB/AWB Extractor?
Here's the ACF, ACB and AWB archive from Xenoverse to test on. Xenoverse Audio
I'll take a look. I was focusing on getting the CPK stuff (encryption, ITOC support) completed. Now just need to test the heck out of it.
There are a number of ACB variations that I've seen:
(1) ACB with no AWB: SFX inside the ACB
(2) ACB with more cues than AWB: usually uses a song more than once with different names (probably applies different effects to the waveform)
(3) ACB with fewer cues than AWB: your example above
(4) ACB with internal AWB and external AWB (Sonic Lost World)
I'm going to work on supporting all of those ACB types and incorporating ACF categories (maybe as subfolders to separate SE, VOICE, and BGM when possible) and any other helpful ACF things (if any).
A nice thing is the ACB has internal MD5 checksums for the AWB and ACF too, which helps verify the integrity of the source data.
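As a sketch of how such a check could work: hash the AWB (or ACF) on disk and compare it against the 16-byte digest read out of the ACB. How the digest is located inside the ACB is not covered here, so `expected` is simply passed in, and `md5_matches` is a hypothetical helper name.

```python
import hashlib

def md5_matches(path: str, expected: bytes, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's MD5 digest equals the expected 16 raw bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so a ~200MB AWB does not have to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest() == expected
```

A mismatch would suggest the AWB is truncated, modified, or simply belongs to a different ACB.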
Good luck. I hope it works out so I can extract Lost World's files. It's not completely necessary since the .adx extractor works well enough, but it would be nice to have proper names. Also IIRC some SFX archives need this. Right now the files just throw errors at me when I try to use it, so I'm either doing something wrong, or you just haven't figured the format out enough yet--not sure, sorry.
Here's an updated build of VGMToolbox. I've got all of the ACB/AWB samples working well, I think. Some of the items don't always seem to match CRIAtomViewer's columns, but many do match.
Alright, I just tested with Lost World's bgm.acb file.
It properly spit out all of the .adx files, but it's not perfect; there were some duplicates (seemingly), and a few of the files still didn't have proper names. But this is great, excellent work.
I know it's meant for sonic lost world acb and awb files, but it doesn't work on pokepark2 acb and awb files. (Since they share the same file extensions and they contain similar data, they should be similar files and might not be too much trouble for you to add support for should you decide to do so.)
There are acb/awb archives that may have more waveforms (music/sfx files) than cue names. Any waveform that doesn't have a corresponding cue name is extracted using the base file name and the file's id within the AWB (AFS2) file.
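Roughly, the naming fallback looks like the sketch below. The exact output pattern (base name, underscore, zero-padded id) and the function name are assumptions for illustration only, not necessarily what VGMToolbox writes.

```python
import os

def output_names(awb_path, wave_ids, cue_names_by_id, extension=".adx"):
    """Map each waveform id in the AWB to the file name(s) it is written as.

    cue_names_by_id: dict of waveform id -> list of cue names (may be missing).
    The fallback pattern "<base>_<id>" is hypothetical.
    """
    base = os.path.splitext(os.path.basename(awb_path))[0]
    names = {}
    for wave_id in wave_ids:
        cues = cue_names_by_id.get(wave_id, [])
        if cues:
            # One output file per cue name that points at this waveform.
            names[wave_id] = [cue + extension for cue in cues]
        else:
            # No cue name: fall back to the AWB base name plus the id.
            names[wave_id] = [f"{base}_{wave_id:05d}{extension}"]
    return names
```

For example, `output_names("bgm_streamfiles.awb", [0, 1], {0: ["intro"]})` names waveform 0 after its cue and waveform 1 after the archive and its id.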
Also, one waveform may have 0 or more cue names, causing the duplicates. You can use VGMToolbox's checksum calculator/duplicate finder if you want to remove duplicates (or maybe I'll add an option to only use the first cue name).
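If you would rather script the duplicate check than use the GUI, a minimal sketch is to group the extracted files by checksum. This is a generic approach, not VGMToolbox's implementation, and `find_duplicates` is an illustrative name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(folder):
    """Return lists of file names in `folder` whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            digest = hashlib.sha1(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.name)
    # Keep only the groups that contain more than one file.
    return [names for names in by_digest.values() if len(names) > 1]
```

Each returned group is one waveform that was written out under several cue names; keep whichever name you prefer and delete the rest.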
As for pokepark, it should work. This tool is supposed to work for all CRI acb/awb packs.
It says that extraction is complete, but it does that regardless of what files I drop into it. It doesn't actually extract any files. Here is an example awb from pokepark2 if you want to see for yourself. http://www.filedropper.com/spikavoicestreamfiles
This is a CPK file, not an AWB. Drop it on the ISO/Archive Extractor tool to view the contents. Since it uses an ITOC file system, the files will not have "nice" names.