HCS Forum - My personal VGMSequence concept

My personal VGMSequence concept by Katsur at 12:21 AM EDT on October 14, 2025: So I was brainstorming the equivalent of VGMStream, but for sequenced music: something capable of playing raw, unadulterated, original sequenced game music files with soundfonts. Yeah, it sounds unlikely, but I gave it my best thinking, so here we go...

1) TXTP-like file - basics:
-Soundfont with samples: specify the location first, then choose the suitable sample bank. It can also scan the sample bank file for the samples used by the raw VGM sequence file.
-MIDI-like sequence file: which file to play (not PSF, PSF2, USF, etc). No actual header? A TXTH file may help.
-Header: when specified correctly, it may work straight out of VGMSequence's internal compatibility database, such as the Squaresoft AKAO header.
-Subsongs (optional; if available and necessary)
-Tempo (in case of sequence file lacking specified BPM)
2) Module trackers are also welcome, including UMX (Unreal Music Header), MOD (Protracker MOD Header), XM (FastTracker II Header), MO3 (Unseen MO3 Header), etc. Besides, old PSF/miniPSF alternative format (and its derivatives, such as 2SF/mini2SF) is also supported, with no TXTP needed thanks to miniPSF being smart enough to find psflib
3) The obscure STY+DLS sequence format (Hitman: Codename 47, Gothic, Midnight Club II, etc) is welcome too!
4) It may have its own sequence instruction pattern visualization screen, in the vein of XMPlay and OpenMPT. It supports every compatible format, either MIDI sequences with just instructions but without integrated samples, or module trackers with built-in samples and instructions.
5) Bit depth and sample rate are adjustable, up to 32-bit and 96000 Hz respectively, and the encoding type shows as synthesized in Foobar2000.
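
A minimal sketch of how the TXTP-like descriptor from point 1 might be parsed. The `key = value` syntax and the key names (`bank`, `sequence`, `header`, `subsong`, `tempo`) are assumptions made for illustration; no such format exists yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceDescriptor:
    bank: str                      # path to the soundfont / sample bank
    sequence: str                  # path to the raw sequence file
    header: Optional[str] = None   # known format name, e.g. "akao" (hypothetical)
    subsong: Optional[int] = None  # optional subsong index
    tempo: Optional[float] = None  # fallback BPM if the sequence has none


def parse_descriptor(text: str) -> SequenceDescriptor:
    fields = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {raw!r}")
        fields[key.strip().lower()] = value.strip()

    for required in ("bank", "sequence"):
        if required not in fields:
            raise ValueError(f"missing required field: {required}")

    return SequenceDescriptor(
        bank=fields["bank"],
        sequence=fields["sequence"],
        header=fields.get("header"),
        subsong=int(fields["subsong"]) if "subsong" in fields else None,
        tempo=float(fields["tempo"]) if "tempo" in fields else None,
    )
```

Only the bank and the sequence are required here; the header, subsong and tempo stay optional, matching the list above.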

Yes, it sounds crazy, I admit it. It needs some research, reverse engineering, sweat and tears, etc. Feedback for improvements is appreciated; this is just a rough concept I made up.

Motivation? To properly play raw, unconverted video game sequenced music files straight from Sony PlayStation, GameCube, Nintendo DS, etc. with little-to-no hiccups.

edited 12:38 AM EDT October 14, 2025

edited 10:39 AM EDT October 14, 2025
by WDLmaster at 7:16 AM EDT on October 14, 2025: I had the same idea decades ago and a few years back I started developing something like this, originally designed with N64 sequences in mind, but it can handle anything as long as the implementation details of the given format are exactly known. My concept is as follows:

I have a RIFF-like container format (.YMF) that contains the isolated soundbanks, wavetables and sequences. The data segment is the first chunk in the file ("YMF!") and has all the extracted/isolated data in continuous form. Then comes the format chunk ("FRMT") that has 8-byte sub chunks that tell the actual format and offset of the soundbank, wavetable and sequence, with additional info if necessary. It can have other chunks as well that are necessary if some formats have game-specific quirks or need additional info that's not present in the soundbanks themselves.
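
To make the container idea concrete, here is a rough sketch of a reader for such a file. The post only says the format is RIFF-like, with a "YMF!" data chunk followed by a "FRMT" chunk made of 8-byte sub chunks. Everything else is assumed: that each chunk is a 4-byte ID plus a 4-byte little-endian size plus payload (as in RIFF, but without padding), that each 8-byte entry is a 4-byte format tag plus a 4-byte offset, and the tag `CMID` used in the usage note.

```python
import struct
from typing import Dict, List, Tuple


def read_chunks(blob: bytes) -> Dict[bytes, bytes]:
    """Split a RIFF-like blob into {chunk_id: payload} (assumed layout)."""
    chunks = {}
    pos = 0
    while pos + 8 <= len(blob):
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        pos += 8
        if pos + size > len(blob):
            raise ValueError(f"chunk {chunk_id!r} runs past end of file")
        chunks[chunk_id] = blob[pos:pos + size]
        pos += size
    return chunks


def read_format_entries(frmt: bytes) -> List[Tuple[bytes, int]]:
    """Decode 8-byte entries as (format tag, offset into the data chunk)."""
    if len(frmt) % 8:
        raise ValueError("FRMT payload is not a multiple of 8 bytes")
    return [struct.unpack_from("<4sI", frmt, i) for i in range(0, len(frmt), 8)]
```

With this layout, `read_format_entries(read_chunks(blob)[b"FRMT"])` would yield pairs such as `(b"CMID", 8)`, telling the engine which loader to run at which offset of the "YMF!" data.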

The actual hardcore part is writing the playback engine and all the loaders for every format that exists. That's exactly what I did and I can tell you, it's a LOT of work. I'm not sure if the work put into this is actually worth it in the end but the results in some cases are remarkable, especially when you hear N64 music with stereo reverb and without glitches, in particular without the ugly upsample limit of the N64, which cuts every note that is pitched up more than 200%. But you have to implement *everything* from scratch. The first thing I did was the standard compact MIDI sequence format of the N64 along with the Pointer Table V1 and VADPCM wavetables. But it wasn't long until I hit the first incompatibilities and quirks of the standard implementation. A few examples: HAL games have messed up vibrato settings in the soundbank and non-standard MIDI CCs for different things. Some games set instrument properties in-game at runtime. Stuff like this needs special handling. Two solutions for this would be a lightweight scripting language (TXTH) or, as it is currently, another chunk ("MODS") with modifications and stuff, but that's not flexible enough. A scripting language would be better. It could be embedded or external if needed.
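
As a small illustration of the upsample limit mentioned above, this sketch computes the playback-rate ratio for a note and checks it against a limit. It assumes equal temperament and that "200%" means a rate ratio of 2.0 (one octave above the sample's root note); the function names are made up for this example.

```python
def pitch_ratio(note: int, root_note: int, cents: float = 0.0) -> float:
    """Playback-rate ratio for a sample rooted at root_note played at note."""
    semitones = (note - root_note) + cents / 100.0
    return 2.0 ** (semitones / 12.0)


def exceeds_upsample_limit(note: int, root_note: int, limit: float = 2.0) -> bool:
    """True if the note needs a ratio above the limit (2.0 == 200%)."""
    return pitch_ratio(note, root_note) > limit
```

A software engine can simply skip this check and play the note at whatever ratio the sequence asks for.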

The reason why the entire project exists at all was Mischief Makers, which has messed up soundbanks. I wanted to know what the hell was going on with this game so I decided to examine the ROM for additional stuff stored near the beginning of the ROM. It has per-song settings for each key-region in the soundbank and per-song reverb settings. That's something not included in the standard soundbank data, which makes the handling of additional info mandatory in the first place. But after I had the CMID playback engine and loaders finished, I could suddenly play most of the games with this format just fine. Ripping standard games was now just a matter of 2 minutes: you just isolate the data using a HEX editor, put the offsets in my tool and click "compile YMF". Creating a USF is a nightmare in comparison.

I haven't worked on it in a while because even if it's damn awesome and exciting the first time around when you hear *actual* real time sequences playing flawlessly inside your own tool, it quickly becomes routine and gets boring. Remember, you have to implement every aspect for every format that you wish to support. And there are some nasty sequence formats out there! AudioSeq, BMS, AKAO… AudioSeq and BMS kinda work but only the early ones. I would love to implement AKAO, just to hear Chrono Cross in my tool but I have no motivation at the moment.

edit: features so far, problems, incompatibilities (experimental, unstable, work-in-progress, user-unfriendly)

* Soundbank viewer, shows entire structure in tree-form, instruments, key regions, properties, visual display of key regions. Instruments can be previewed (played) with either PC keyboard or even MIDI-in device.

* Wavetable viewer, shows the waveforms, loop points, same preview (play) as in soundbank viewer possible

* Sequence viewer/player, shows the sequence in tree-form (tracks), can mute tracks, graphically shows note events in table-like viewer, can optionally print all (including low-level) events on a debug graph (timeline), WAV exporter (interleaved or one-file-per-track, can write additional stuff like events as cue lists), shows live track status for the usual stuff like program, bank, pitch bend, reverb send level, SVF state etc, rudimentary (AKA barely functioning) register operations for formats which need crap like this (BMS, AudioSEQ, CSEQ)

* Reverb adjustable, currently FreeVerb but could be anything

* some standard SVF implementation for games with LPF-feature, more like one-fits-all-workaround because it's basically impossible to recreate all the different algorithms of the many formats/games with just one SVF model

* ADSR falloffs adjustable, more like a workaround because every game/format is different

* sequences are not converted to common format but are processed on-the-fly

* soundbank format originally based on N64 SDK's layout but got complex over time, needs fundamental refactoring/abstraction to account for more formats, can currently not represent all necessary features for every game/format

* polyphony of 256 voices (hardcoded), sample rate of 48000 (hardcoded), floating point processing throughout, only linear interpolation at the moment (not a big problem)

* includes a few hard-to-use tools to help extract N64 stuff and search for byte patterns and what not, can compile a YMF from isolated files, but most of the work will always be done with a good old HEX editor, no way around it.

* sequence playback (the heart of the entire thing) already supports a few different formats, *none* of them 100% accurately (not possible, forget about it). CMID is currently the "most compatible", ranging from "playable with quirks" to "nearly perfect". SNG format is playable but has problems (ADSR not yet implemented), AudioSeq has problems, register OPs messed up, BMS has even more problems, some rudimentary register OPs work but, yeah. DCM1 (Tetrisphere, New Tetris) worked perfectly but broke after some core-rewrite, LPF worked but broke after some core-rewrite, CMID's copy-block does not support variable repeat-amounts, many game-specific note-handling behaviors are not implemented, like new-note-actions, parameter changes after note-off, note lengths after tempo change, the list goes on and on…
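
For the "only linear interpolation" item in the list above, a minimal sketch of what reading a sample at an arbitrary playback ratio with floating point processing can look like. This is an illustration under simplifying assumptions (mono, no loop points, no envelope), not the tool's actual voice code.

```python
from typing import List


def render_linear(sample: List[float], ratio: float, frames: int) -> List[float]:
    """Read `frames` output values from `sample` at playback-rate `ratio`."""
    out = []
    pos = 0.0
    last = len(sample) - 1
    for _ in range(frames):
        i = int(pos)
        if i >= last:
            # At the final sample emit it as-is; past the end emit silence
            # (no loop handling in this sketch).
            out.append(sample[last] if i == last else 0.0)
        else:
            frac = pos - i
            out.append(sample[i] * (1.0 - frac) + sample[i + 1] * frac)
        pos += ratio
    return out
```

A ratio of 0.5 plays the sample an octave down, so every second output value falls halfway between two source samples.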

As I said, implementing something like this is a monstrous undertaking!

edited 10:32 AM EDT October 14, 2025

edited 8:36 PM EDT October 14, 2025
by Katsur at 10:30 PM EDT on October 14, 2025: Thanks for your insightful comment, WDLmaster. I hope this can inspire others toward a better tomorrow. As you implied, it's obviously neither easy nor pretty. Even I implied that it would be difficult.

edited 10:36 PM EDT October 14, 2025
My suggestion if anyone wants to begin development and alpha-test by Katsur at 8:33 PM EDT on October 15, 2025: My estimations:
* Gran Turismo (1997, PlayStation 1) has its own proprietary sequenced music format used for frontend menu, including garage and dealerships, using some kind of custom header.

* Super Mario 64 (1996, Nintendo 64) has its whole soundtrack in sequenced form, using the same on-the-fly scheme WDLmaster described.

* Hitman: Codename 47 (2000, Windows) uses STY+DLS for sequenced music, composed by Jesper Kyd.

* Wolfenstein 3D (1992, MS-DOS) uses IMF format for FM synthesis music.

* Luigi's Mansion (2001, GameCube) uses JaiSeq for sequenced music during some ghostbusting in the mansion and getting a break.

* Metroid Prime: Hunters (2006, Nintendo DS) uses NDS standard MIDI format with low-quality sample library.

* Jak & Daxter (2001, PlayStation 2) uses Naughty Dog's custom MUS module tracker-like music format, with MIDI instructions and audio samples in one file.

* Super Metroid (1994, Super Famicom/SNES) needs no introduction, but its BGM was ripped and converted into the good old 64-kilobyte SPC format. I don't even know if SNES game data files have their own extensions and filenames.

* Gran Turismo 4 uses its own custom sequenced music format for frontend menus, garage, GT Auto, etc. INS is a soundbank, and SQT is a MIDI file.

* NiGHTS Into Dreams... (1997, Sega Saturn)

* Shenmue (1999, Sega Dreamcast)

I suggest writing different, flexible dynamic libraries with their own databases for specific formats, such as n64audioseq.dll (may also include code for specific games), ps1seq.dll, ps1granturismo.dll, idmusicformat.dll, ps1akao.dll, jaiseq.dll, saturnseq.dll, dreamcastmidi.dll, gt4sqt.dll, libopenmpt.dll (for module tracker music), openmpt123.dll (for UnSeen MO3), jakdaxtermus.dll, etc... That's a lot!
Or maybe code it just like VGMStream, but that's not a possibility.

Overall, this concept is very experimental.

edited 12:16 AM EDT October 16, 2025
by WDLmaster at 5:15 PM EDT on October 16, 2025: Writing different DLLs for each format is not a good idea to be honest. Your initial concept of "VGMSequence" is much better because there can be a certain core audio engine which handles all the different soundbank, wavetable and especially sequence formats. You said it's impossible but I can definitely say it's not. Yes, it's ridiculous work but not impossible. When I wrote my wall of text above, it was not meant to discourage the development of such a playback engine, quite the contrary. I showed that a project like this is very possible but you need tons of patience and time. Yes, I love sequenced VGM, that's the reason I even started a project like this. And hearing original, unaltered CMID sequences live within my own player/tool was *very* exciting but the realization of what tremendous amounts of work lie ahead of us is a bit depressing at times, especially when you're dealing with playback inconsistencies that should not happen according to format specifications.
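
One way to picture the "single core engine plus format loaders" idea, as opposed to one DLL per format, is a registry of loader functions that all produce events for the same engine. This is only a sketch: the `register` decorator, the `(tick, description)` event tuples and the toy "DEMO" format are invented for illustration and do not correspond to any real sequence format.

```python
from typing import Callable, Dict, List, Optional

# A loader inspects raw bytes and returns decoded events, or None if the
# data is not in its format. Events here are (tick, description) tuples.
Loader = Callable[[bytes], Optional[List[tuple]]]

_LOADERS: Dict[str, Loader] = {}


def register(name: str) -> Callable[[Loader], Loader]:
    def decorator(fn: Loader) -> Loader:
        _LOADERS[name] = fn
        return fn
    return decorator


@register("demo")
def load_demo(data: bytes):
    # Hypothetical toy format: magic "DEMO", then one note number per byte.
    if not data.startswith(b"DEMO"):
        return None
    return [(tick, f"note_on {note}") for tick, note in enumerate(data[4:])]


def load_sequence(data: bytes):
    """Try each registered loader; the shared engine consumes the result."""
    for name, loader in _LOADERS.items():
        events = loader(data)
        if events is not None:
            return name, events
    raise ValueError("no registered loader recognizes this data")
```

Adding a format then means adding one more registered loader, while voice allocation, mixing and reverb stay in one place.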

You listed a few games and their formats but what is so special about these? All sequences are proprietary, not just the few ones listed there. Basically every sequence that's not MIDI is proprietary. Super Mario 64 is AudioBank + AudioTable + AudioSeq, just like Mario Kart, Ocarina, Majora, Yoshi's Story, Starfox […]

Luigi's Mansion is IBNK + WSYS + JaiSeq, just like Windwaker, Twilight Princess, Mario Sunshine […] just (vastly) different versions.

The situation with SNES games is very different. SPC format is *not* a sequence format. I never looked into SNES music but I think the actual sequence formats are mostly unknown and basically differ from dev to dev. SPC is just a sound chip snapshot, run by an emulator.

The same with USF, PSF and all xSF files. Bringing xSF files to "VGMSequence" makes zero sense and is not going to happen because their underlying concepts are completely different. You wanted an actual sequence player, not an emulated sound chip snapshot. And xSF is not an audio format despite looking like one from a user's point of view. xSF is just a convenience layer that almost literally fast forwards an emulator to the point where the music starts.

Tracker music, if it's standard format (MOD, S3M, XM, IT) already has tons of applications for playback and it's not VGMSequence's responsibility to handle those.

Before going into the exotic formats, it makes more sense to flesh out the well-documented ones because they cover way more games. The typical HD, BD, SQ combination and AKAO are good candidates.
by DarkOK at 6:12 PM EDT on October 16, 2025: Being able to import xSFs for convenience could be possible if you employ heuristics to determine which sound driver & version is in an xSF's program.

For example, with DSFs, you can tell if something is using the Manatee driver because there is version information near the beginning of the program. With that determined, you can then look at the location where the area map should reside, and from there feed each unit (sound bank, sequence) to the respective playback engine. You can also further check validity, as each unit should begin with a FourCC and version.
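
A rough sketch of that kind of heuristic: look for a driver signature near the start of the program, then sanity-check candidate units by their FourCC. The signature string and the FourCC tags below are placeholders invented for this example, not the real Manatee values, and the 0x1000-byte search window is an arbitrary assumption.

```python
from typing import Dict, Optional

# Placeholder signatures only; real drivers need real byte patterns.
DRIVER_SIGNATURES: Dict[str, bytes] = {
    "example-driver": b"EXAMPLE DRIVER ver.",
}
KNOWN_UNIT_TAGS = {b"BNK0", b"SEQ0"}  # hypothetical FourCCs


def detect_driver(program: bytes, window: int = 0x1000) -> Optional[str]:
    """Return the driver whose signature appears near the start, if any."""
    head = program[:window]
    for name, signature in DRIVER_SIGNATURES.items():
        if signature in head:
            return name
    return None


def looks_like_unit(program: bytes, offset: int) -> bool:
    """Check that a candidate unit starts with a known FourCC."""
    return program[offset:offset + 4] in KNOWN_UNIT_TAGS
```

Locating the area map itself is driver-specific and is left out here.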

I believe vgmtrans is able to do this to an extent for 2SFs and PSFs.
You're right, WDLmaster by Katsur at 8:42 PM EDT on October 16, 2025: Starting with most documented ones is a good idea. Much better than what I thought about DLLs, xSF and module tracker.
