Ways to improve 24 khz or lower sound files using audacity/audition by Sephirothkefka at 2:08 AM EDT on April 29, 2018
Is there an easy way to improve music that's noticeably lower quality than normal (like Super Paper Mario, Kid Icarus: Uprising, etc.)? I'm fine with 32 kHz files most of the time, especially if they're synth in origin (as most synths sample at that rate anyway). I haven't found an easy way to do so.
@kode54 - I disagree. You can even make 16 kHz 4-bit ADPCM recordings sound like a CD-quality track. For example, you can use an exciter, which interpolates the low-to-mid range frequencies into the high range spectrum, or simply resample it with no interpolation, and manually filter out the aliased frequency bands.
But that involves processing and alteration, which leads to the argument of "If it's coming from a low-quality source, is it really HQ?".
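For anyone who wants to experiment with the "resample with no interpolation, then filter the aliased bands" approach, here's a minimal Python sketch using numpy/scipy. The file names and the 14 kHz cutoff are placeholders I made up for illustration, and it assumes a mono 16-bit source:

```python
import numpy as np
from scipy import signal
from scipy.io import wavfile

# Zero-order-hold upsampling ("no interpolation"): each sample is repeated,
# which mirrors the source spectrum above the original Nyquist frequency.
src_rate, x = wavfile.read("input_16khz.wav")   # hypothetical mono 16-bit 16 kHz file
x = x.astype(np.float64)

factor = 3                                      # 16 kHz -> 48 kHz
y = np.repeat(x, factor)
new_rate = src_rate * factor

# Keep some of the mirrored "treble", but tame the harshest images with a
# gentle low-pass; the 14 kHz cutoff is an arbitrary starting point to tweak.
sos = signal.butter(4, 14000, btype="low", fs=new_rate, output="sos")
y = signal.sosfilt(sos, y)

wavfile.write("output_48khz.wav", new_rate, np.clip(y, -32768, 32767).astype(np.int16))
```

Whether the result counts as an "improvement" is exactly the argument here, of course.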
Personally I believe kode54 is right. As someone who obsesses over having the highest quality possible, it really bugs me to side with kode, but I'd rather have the original low-quality version than one I know has been processed (because who's to say their version is 100% accurate?).
Considering digital audio isn't 100% accurate anyway, it's kind of a moot point, to be honest. That's not just me defending my objectively brilliant remasters at all, nope, not at all.
simonmkwii is actually hitting on something that's not as widely known or understood as it should be. Game consoles resample audio for mixing and apply interpolation that generates high frequencies through aliasing.
A great demonstration of this is with a game like Goldeneye 007, where the USF files play at about 22 kHz, yet if you play the game on a real Nintendo 64 it sounds better and has frequencies well into the ultrasonic range. It doesn't sound as good as if the music were natively at a higher sampling rate, of course, but it's still an improvement.
I find that if you use the MultiResampler DSP in foobar and set it to cubic interpolation, you get a good approximation of N64, Wii (I assume GameCube as well, but I haven't specifically tested), and 3DS resampling. PS2 games with 22 or 24 kHz streams come out a little more muffled on real hardware than with any of MultiResampler's interpolation options, because the PS2 uses an interpolation method it doesn't offer.
Note that the Wii and 3DS appear to use cubic interpolation to get everything to 32 kHz for their software mixing stages, and then evidently output with sinc interpolation applied, which lets alias frequencies through up to about 18 kHz on both systems. This is true even for Wii games played on a Wii U and captured digitally from the HDMI output, so it has to be done in software rather than with a filter in the audio output circuit. But when you're listening to ripped files, you might as well just set MultiResampler to do one pass with cubic interpolation to your target output rate.
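If anyone wants to replicate that kind of one-pass cubic upsample outside of foobar, here's a rough Python sketch. I'm using a generic cubic spline for simplicity; it is not necessarily the exact cubic kernel MultiResampler or any of these consoles use, and the file names are placeholders:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.io import wavfile

src_rate, x = wavfile.read("stream_22khz.wav")      # hypothetical mono 22.05 kHz stream
x = x.astype(np.float64)

target_rate = 48000
t_src = np.arange(len(x)) / src_rate
t_out = np.arange(int(len(x) * target_rate / src_rate)) / target_rate

# One cubic pass to the target rate; the imaging it leaves above the old
# Nyquist is the "extra treble" being discussed in this thread.
y = CubicSpline(t_src, x)(t_out)

wavfile.write("stream_48khz.wav", target_rate, np.clip(y, -32768, 32767).astype(np.int16))
```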
@nothingtosay - Don't forget the fact that the GBA and DS weren't capable of interpolation of any kind, so PC music rips using a decent interpolation method can actually sound significantly better than on actual hardware.
Lanczos resampling tends to sound (and look) the best, but if you want those high frequencies back, you'll need to give up interpolation altogether.
Most people don't understand much about the Fourier domain, so I won't bore people with the details.
@simonmkwii: Yeah, adding interpolation does a lot to clear up the nastiness of those systems. Audiophiles sometimes criticize digital audio for supposedly reducing smooth, round sound waves to stair-step shapes, which is not what happens, because oversampling combined with interpolation/low-pass filtering restores the smooth shape. Well, that criticism actually is basically valid in the case of the GBA and DS!
Lanczos interpolation is a form of sinc interpolation and can tend to make things like drum and cymbal samples on those systems too muffled. If you want high frequencies that sound closer to what was intended without distortion, cubic interpolation is a good idea. Linear interpolation is only subtly different from cubic as well.
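For anyone curious, Lanczos really is just sinc windowed by a wider sinc. A small Python sketch of the textbook kernel and a naive resampler built on it (slow, purely for illustration; a = 3 is a common choice):

```python
import numpy as np

def lanczos_kernel(x, a=3):
    """Lanczos-a kernel: sinc(x) * sinc(x / a) for |x| < a, else 0.
    np.sinc is the normalized sinc, sin(pi*x) / (pi*x)."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < a, np.sinc(x) * np.sinc(x / a), 0.0)

def lanczos_interpolate(samples, positions, a=3):
    """Evaluate uniformly spaced `samples` at fractional `positions`."""
    samples = np.asarray(samples, dtype=np.float64)
    out = np.zeros(len(positions))
    for j, p in enumerate(positions):
        start = int(np.floor(p)) - a + 1          # 2a neighbouring taps
        for i in range(start, start + 2 * a):
            if 0 <= i < len(samples):
                out[j] += samples[i] * lanczos_kernel(p - i, a)
    return out
```

Being windowed sinc, it cuts off most of the content above the source's Nyquist, which is exactly why it sounds more muffled than cubic on those drum and cymbal samples.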
@MoldyPond: Indeed. But I have a pedantic note about that. kode54 added that output option to foobar's USF decoder after I discussed this stuff with him a while back. He said at the time that it applies the same cubic interpolation scheme the N64 uses in its mixing stage when dealing with samples, but I found the spectrogram isn't quite as similar to a real N64's as it is when using MultiResampler's cubic interpolation. ¯\_(ツ)_/¯
I at first completely disregarded foobar and game rips a few years ago, because the first thing I did was listen to Rare game music and immediately noticed that it sounded infinitely worse than my line-in 320 kbps MP3 recordings :3
You are right, though, about it still not being right at 48 kHz. Before Diddy Kong Racing's USF set was fixed a few months ago I had made a new line-in recording, and listening back to it, it still sounds slightly better quality. Not quite as clean and raw as the USF, but more... right.
Still though, having the ability to export a 50-song soundtrack as WAV files in 2 minutes, as opposed to having to record every song individually, is something I will never complain about :D
If you're using the default-enabled resampler in foo_input_usf, you don't even need a post-processing step, as it already upsamples using a cubic interpolation algorithm.
Did you mean me rather than simonmkwii about using MultiResampler? The reason is that it allows you to control the interpolation method.
@MoldyPond: It would be really awesome if somebody could somehow figure out exactly what a real Nintendo 64 does so it could be implemented. I'll say though that I abandoned my idea of recording Goldeneye's music from hardware when I A/B tested it versus the USF upsampled with MultiResampler and found a difference so tiny I decided I'd be doing all that work for truly no actual gain. With cubic interpolation, it really is extremely similar to what the real thing sounds like.
@simonmkwii: GBA and DS music can use instrument samples with different sampling rates, making individual instruments alias at different frequencies, so there's not really a one-size-fits-all-instruments point to start filtering. But if you're filtering out all the aliasing artifacts, isn't that functionally the same as sinc interpolation?
Additionally, I would argue that while linear interpolation, and cubic for that matter, have audible aliasing artifacts, that's the point when the samples are that band-limited in the first place: the interpolation is used to synthesize treble frequencies that aren't there, improving the sound while saving on memory or CPU usage. When a cymbal sample is only 8 kHz or something, I think the composer or sound designer is counting on aliasing to create the treble associated with a cymbal crash. But of course you can do whatever you think sounds best; that's what I'm doing too, and I don't think we have to stick to what the actual hardware does. Nintendo doesn't either. The EQ applied to GBA audio by the Game Boy Player isn't at all like what a real GBA sounds like, but I think it's quite good. Also, their releases of Famicom music never sound like they were recorded from a standard Famicom's only output method: RF.
I don't mean to try to contradict you on your Diddy Kong Racing work, but that track has about 3 dB more dynamic range if you turn the music volume slider down in the game. It has to be all the way down to about 25% for it not to clip! I also think the reverb at the end could be dialed back or even eliminated entirely and it would sound fine. But I do think EQing the treble to boost it is a good idea and makes a big difference.
@nothingtosay DKR being one of the loudest games I know of was the first thing I found out about when recording it nearly 10 years ago :D
That being said, I've actually asked about that in the USF request thread. I let JoshW know about the issues with the old DKR USF set and he fixed and re-uploaded it, so now it's perfect except that it's really, really loud and you can easily hear the clipping if you don't upsample it. When creating USFs, is there a volume setting of any sort?
It actually doesn't. kode54 is the one who maintains the foobar USF player and added the upsampling to it. It's probably been a very long time since the Winamp plugin has been updated.
Quote: "@kode54 - I disagree. You can even make 16 kHz 4-bit ADPCM recordings sound like a CD-quality track. For example, you can use an exciter, which interpolates the low-to-mid range frequencies into the high range spectrum, or simply resample it with no interpolation, and manually filter out the aliased frequency bands."
Purely snake oil, nothing more.
You cannot improve sound quality by re-sampling or interpolating. The information is not there, it's lost. kode54 is absolutely correct.
The Ancient Lake "remastered" track is not of higher quality either. One can easily prove this by examining its spectrogram. Everything above 11 kHz is mirrored; this is the aliasing effect. What we hear as high frequencies are ringing artifacts of the interpolation method used (cubic?). Using Whittaker–Shannon interpolation would avoid those artifacts and give you the same result as the console output, just at a higher sample rate.
The only chance to get higher-quality tunes is to re-synthesize them. Only possible for games that use sequenced audio, ofc. Since the instrument samples are compressed themselves, the gain may be marginal. Doing this for DKR's Ancient Lake looks/sounds like the following (using FluidSynth with 7th-order sinc interpolation for all instrument samples): https://drive.google.com/open?id=14XMTot2eI61DB4QiGk-D70Rs1TiW8CIG
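For reference, Whittaker–Shannon interpolation is just ideal sinc reconstruction: every output point is a weighted sum over all input samples. Here's a brute-force Python sketch (far too slow for a full track, but fine for a short clip if you want to compare spectrograms):

```python
import numpy as np

def whittaker_shannon_resample(x, src_rate, target_rate):
    """Ideal (band-limited) resampling of a mono signal, brute force."""
    x = np.asarray(x, dtype=np.float64)
    n = np.arange(len(x))                                   # input sample indices
    t = np.arange(int(len(x) * target_rate / src_rate)) / target_rate
    # x(t) = sum_n x[n] * sinc(src_rate * t - n); nothing above src_rate / 2
    # can appear, so no content gets mirrored above ~11 kHz for a 22 kHz source.
    return np.array([np.sum(x * np.sinc(src_rate * ti - n)) for ti in t])
```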
@derselbst: I think I get where you're coming from, which I believe is a scientific perspective on audio that prioritizes theoretical best practices and treats distortions as inherently undesirable. I fully understand the "rule" that you can't improve quality by resampling; the best you can do is minimize artifacts.
But I can't agree that quality can't be improved when, even though sound quality is almost entirely subjective and there isn't any one correct answer, I think most people would agree that Diddy Kong Racing sounds better cubic-interpolated (which is the way it sounds on real hardware, because that's what real hardware does, contra the last sentence of your second paragraph) than with Whittaker–Shannon interpolation and no frequencies above 11 kHz. Yes, they're aliasing artifacts, but that doesn't mean they're unpleasant or that it sounds worse.
I like to add some distortion to my electric guitar sometimes, which creates overtones that aren't present in the clean signal to achieve a desired sound; that's functionally equivalent to a Nintendo 64, GameCube, Wii, or PlayStation 2 upsampling with interpolation to create treble frequencies that wouldn't otherwise exist.
Evidently console designers have decided that resampling with interpolation methods (other than sinc) does improve sound quality because they've made the conscious decision to do just that.
Like I said in a previous post, sure, it's not as good as natively using a higher sampling rate in the first place, so re-synthesizing should sound better as a general rule. But Star Fox 64's and Ocarina of Time's soundtrack albums seem to have been recorded pre-conversion to N64, and I really kinda prefer how they sound in the game.
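To make that guitar-distortion analogy concrete, here's a tiny Python sketch: soft-clipping a pure 1 kHz sine manufactures odd harmonics (3 kHz, 5 kHz, ...) that were never in the clean signal, much like non-sinc resampling manufactures treble above the source's Nyquist:

```python
import numpy as np

rate = 48000
t = np.arange(rate) / rate                      # exactly one second
clean = np.sin(2 * np.pi * 1000 * t)            # pure 1 kHz tone
distorted = np.tanh(5.0 * clean)                # soft clipping, like a drive pedal

spectrum = np.abs(np.fft.rfft(distorted))
freqs = np.fft.rfftfreq(len(distorted), 1 / rate)
strongest = sorted(freqs[np.argsort(spectrum)[-5:]])
print(strongest)                                # ~[1000, 3000, 5000, 7000, 9000] Hz
```

Whether you call those new frequencies "distortion" or "enhancement" is, again, a matter of taste.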
Don't even get me started on OoT and MM's CDs sounding worse than in-game.
I think the main problem with the arguments here is that the title asks to "improve" low-quality audio (which in and of itself is completely subjective). Just because you've taken a low-quality source and processed it to sound "better" doesn't mean it actually is. Data lost is data lost, no exceptions, but that also doesn't mean you can't make it sound better than the low-quality version.
I'd say the better argument here is Purist Audio Vs. Enhanced Audio.
In response to the original question, yes there are (many) ways to improve low-quality audio, but it raises the extra question of "Knowing that it's been processed externally by a random person with no affiliation to the original source material and this edit is not official in any way whatsoever, is that a concern to you at all?"
@nothingtosay: You're correct. I was trying to state that there is no way to improve the quality of any sound. You can make it sound different of course, as we've already heard. Whether that sounds better or not lies in the ear of the listener.
However, I don't see evidence that console designers consciously use resampling to improve sound quality. I only see that software synthesizers need resampling at the instrument-sample level. Why haven't they chosen sinc? Because it's too expensive. So they've chosen something cheaper, at the cost of artifacts arising... whether intended or not, idk.
MoldyPond is right: Purist Audio lovers should just be happy with what a console natively outputs. Everybody else may subjectively "enhance" audio as they please.
So much misinformation in this thread. No interpolation method does anything to restore the missing frequencies from the sample/track. All it does is remove the aliasing artifacts from the harmonics generated when samples go unfiltered. Any interpolation method other than linear leaves aliasing artifacts on the track. There's this common tendency for people to mistakenly call any non-linear audio interpolation method (sinc, Catmull, cosine, etc.) higher quality. The only real high quality is linear. We are not dealing with image files here. Any artificially generated audio harmonics from those algorithms result in a fake and metallic sound that sounds identical to what you hear from nearest neighbor (no interpolation). The frequencies that are gone are gone. Exciters and other stuff do nothing proper to restore those frequencies; they just add more artifacts that result in what we associate with a metallic/cheap-electronics sound. You don't want any of that in your quality audio tracks.
Linear interpolation isn't perfect, either. Far from it.
Here's an alternative fact: No resampling (interpolation) method is 100% objectively perfect.
Tidbit example: We can say that SNES samples were designed for the Gaussian interpolation of the system. Regardless of whether Mr. Sound Man consciously knew about the ass-end (for flavor; I like SNES) output of the system, because that was the target destination, Gaussian is correct in terms of accurate reproduction of SNES. If you want mathematically accurate, then Sinc. Both create artifacts AKA something there that wasn't there before.
Just be happy with what a console natively outputs? What the fuck? How much piss are you taking? GBA and DS' native output sounds like utter horseshit, and I would rather shove a fork in my dick than listen to them without interpolation, how it was "originally intended". Fuuuuuuuuck off.
Utter garbage on all sides. Also, I am apparently deaf, so I actually enjoy GBA and DS soundtracks in their native quality, and don't need someone to spend hours stripping MIDI files and re-rendering them with ripped sample banks.
I was considering posting a reply to that call for resequencing along the lines of "I would suggest quoting a high price per hour for all that work, but that would imply you would do it at all, much less for a price."
I enjoy the GBA's quality as is too, but only because we don't really have any other option. I mean, if the GBA DID have an option for high-quality sound, would you be complaining? (Also why is this thread suddenly so toxic?)
SimonMKWii - Today at 12:13 AM @bxaimc My god, the "Ways to improve 24 khz or lower sound files using audacity/audition" thread is a total nightmare! It's basically just people flaming each other
bxaimc - Today at 12:18 AM You expected something different? I didn't. Everyone that responded in that thread has their own quirky ego
@derselbst This is not linear interpolation. Either there is some fault, or no pre-filtering was applied. I don't know where you got the samples from, but there is some kind of fault in the implementation. Try instead the linear interpolation and sinc interpolation options on NDS tracks. That implementation is done correctly.
@MoldyPond There are ways to render GBA audio in higher quality than the GBA's DSP. One is to convert tracks to BASS MIDI, and the other is this: Agbplay
Also, for the record, here's what happens with and without linear interpolation. Anything other than linear interpolation will introduce some level of the artifacts shown on the right side of the spectrogram, since they all rely on creating those fake, metallic-sounding harmonics: Image
@VIRGIN KLM: I've made it myself using FluidSynth. I can reproduce the same behaviour with libsamplerate. Which is totally understandable, because linear interpolation is not band-limited, while sinc interpolation is. Reconstructing a sample point without introducing artifacts cannot be done by taking only two samples into account, as linear does. In fact, all existing samples contribute to the sample being reconstructed. In other words: you must use sinc interpolation in the time domain to get the rectangular frequency response you want, rather than the sinc² frequency response that linear interpolation yields. You're mixing up sinc and linear interpolation.
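A quick way to see that sinc² point for yourself: the impulse response of linear interpolation is a triangle, and the spectrum of a triangle is sinc². A small Python sketch (the oversampling factor and FFT size are arbitrary choices):

```python
import numpy as np

oversample = 64
# Linear interpolation's impulse response is a triangle (a box convolved with itself).
tri = np.convolve(np.ones(oversample), np.ones(oversample)) / oversample

spectrum = np.abs(np.fft.rfft(tri, n=8192))
spectrum_db = 20 * np.log10(spectrum / spectrum.max() + 1e-12)

# The response droops in the passband and only falls off as sinc^2 (roughly
# 12 dB/octave in the sidelobe envelope), so images of the original spectrum
# leak through instead of being cut off at the old Nyquist like ideal sinc.
print(spectrum_db[:8].round(1))
```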
Talking past each other is fun. As was my sarcasm. Right kode54?
But yes, kode54 did get it right in post #2.
There's nothing objective about subjectively enhancing source material. There's nothing wrong with it, but it's up to personal taste, and shouldn't be advertised as something more right. There's already enough confusion without that (see this thread).
I have a quirky ego? Meh, I'll take it. There are certainly worse things to have. Surely a natural consequence of being very interested in/passionate about almost anything, but especially things as niche as video game music and hardware behavior.
By the way, I've gone all this time without saying yo bx. It's been a while, but I still remember you as a kickass dude. I noticed arbingordon also popped in here some pages back. I enjoyed chatting with both of you and hcs back half a decade ago or something, hope y'all are doing well.
You're going to love the pending QSound emulation that's coming to MAME and VGMPlay and is already in libVGM in a fast HLE form. It's hard-clocked to 24038 Hz, or thereabouts, and there's no way to change that. It's HLE code based on a 60 MHz DSP16A, which runs specific code, with specific FIR filter coefficients and specific delay line lengths in game-specified sample counts. Making it produce a higher sample rate is not really feasible without actually harming the sound quality.
How disgusting, mirrored aliasing. When it's in foo_input_qsf, you can do what you want, but when it ends up in foo_gep or foo_input_vgm, it will use the plugin's own linear interpolation, with no other option.
Also, 24 kHz isn't much lower than my threshold of hearing anyway: 12 kHz vs. 14.5 kHz.
I forgot that thing was still being worked on. I've only been keeping up with MAME's QSound emulation. I hope this doesn't mean I have to re-log all my QSound rips, since when I tested them in MAME, notes were missed.