Fandubbing Sound Engineers Course 2016

The course is an introduction to mixing and mastering vocal projects in the context of fandubbed songs. It is aimed mainly at sound engineers, audio technicians and song directors. Anyone hoping to get into the world of post-production will find a good introduction to basic mixing techniques and to working with fandubbed songs. Experienced sound engineers may also find something of interest. Of course, I encourage anyone interested in singing, recording and post-production of vocal recordings to look through the course.

The presentations are divided into 5 main parts.

What to do before you start mixing?



And additional materials that build on the previously presented concepts and techniques of the post-production process.

Post-production – mix

Post-production – master

Here I am posting the slides from all parts of the course.



Have you ever wanted your computer to understand you?

Well, I exaggerated the title a little bit. I did teach a computer something though. I taught it to distinguish between ‘warm’ and ‘bright’ equalised guitar recordings. Yay, so exciting!

Have you ever talked about a song with your friend and said that the drums in it were really punchy or that the vocals were warm and crispy? People use these descriptors intuitively, but how could we teach a computer to understand them too? For my BSc dissertation I successfully trained kNN and SVM machine learning algorithms to tell warm and bright equalised guitar recordings apart using Mel Frequency Cepstral Coefficients (MFCC).
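To give a rough idea of what a kNN classifier does, here is a minimal sketch. It is not the code from the dissertation: the two-number feature vectors are made-up stand-ins for real MFCC summaries, and the function name `knn_predict` is just my own illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features standing in for MFCC-based descriptors.
train = [
    ((1.0, 0.2), "warm"), ((1.2, 0.1), "warm"), ((0.9, 0.3), "warm"),
    ((0.2, 1.1), "bright"), ((0.1, 0.9), "bright"), ((0.3, 1.0), "bright"),
]

guess_a = knn_predict(train, (1.1, 0.25))
guess_b = knn_predict(train, (0.15, 1.0))
```

With real recordings each vector would be much longer, but the voting idea stays the same.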

You can read about it here:

Aligning Vocals… Drudgery or a walk in the park?

This time I want to show you a vocal aligning plugin I’ve recently found and started using like a madman… Since I often work with tracks that need to be aligned pretty tightly, this discovery was just what I needed. When I think of all the long ‘late’ hours spent on aligning vocals in software such as Melodyne I get these uncomfortable chills down my back, ughh. Oh man, why didn’t I know about VocAlign before?! Well, there is a good reason for it. I just wasn’t looking for it! To be honest, I would probably still be working with software like Melodyne to this day if my friend hadn’t told me about it, but more about that in a second. I wanted to think that the software just wasn’t around until recently, but here came a surprise… Apparently VocAlign was first released in 2001 and has since grown into a new version, which is available here.

One day I talked with my friend about timing vocals on one of the songs I was working on and (of course) I went on and on about how annoying working with Melodyne can be. It can very easily destroy (or at least affect) the signal, making it sound like, hmm, well, like Melodyne… Well, from time to time the vocals I work on need to sound alien-like, and hey, welcome to the 21st century! Sometimes alien-like vocals may be the thing that the song calls for, right? With many popular songs nowadays you simply cannot tell if the vocals were actually sung by an actual human being…

But back to the story… My friend answered my complaining with a question: ‘why don’t you just use VocAlign for your tracks?’. I wasn’t completely aware at the time of how that conversation would change the way I handle vocal aligning… but now I know, and I am really excited to share some of my results.

So what is VocAlign®? It is an audio plugin that automatically aligns two audio signals so that the timing of one matches the other. It works at 16- and 24-bit resolutions and sampling frequencies from 44.1 to 192kHz. But what truly matters is the simplicity and speed of the software. Work that used to take me an hour in Melodyne, for example, now takes only a few minutes. Don’t get me wrong, I still think Melodyne is a great piece of software and I still use it quite often, but so far VocAlign has made vocal aligning feel like a walk in the park.

Aligning with VocAlign: Examples

I will present 3 different examples, each consisting of an .mp3 containing just the raw vocal tracks, an .mp3 containing the VocAligned® vocal tracks, and a final track with what that part sounded like in the master recording.

Example 1 No Effects

Example 1 VocAlign

Example 1 Master

The differences between the unaligned and aligned tracks may be subtle in this example, but I wanted to show it anyway because many things mastering engineers do are subtle. I learned that from Dave Pensado and his amazing series on YT called Pensado’s Place. Definitely worth checking out if you are into music production, sound engineering, mastering, or anything else that has something to do with studio work.

In the next example we will see how VocAlign deals with slightly more noticeable timing variations.

Example 2 No Effects

Example 2 VocAlign

Example 2 Master

In this example you can hear the differences much better (if you still can’t hear them, listen out for the word ‘hearts’).

Now I will just quickly write down the process I go through to align one track to another: open VocAlign, side chain the track I’m working on to the track that I want to align it to, click ‘Capture Audio’, play back the part that I want to align, press ‘Analyse’ > ‘Align’ > ‘Edit’ (this literally takes me 2 seconds) and voilà… I have my aligned track.

Excluding the playback time the whole process takes me a few seconds. How amazing is that?! Definitely much better than moving each syllable manually.

But of course nothing is perfect (that’s a good thing 🙂).

Where does a subtle (=’good’) aligning end and where does a hardcore (=’not so good’) aligning start?

Example 3 No Effects

Example 3 VocAlign

What was wrong? You can probably tell yourself already.

Well, VocAlign uses sophisticated pattern matching algorithms that compare digital data. VocAlign moves AND stretches our recording so that it creates as close a match as possible to the side-chained track. The obvious problem that arises is this ‘unnaturally-stretchy-signal’ (probably the best way to describe it 🙂).
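I don’t know which algorithm VocAlign actually uses, so treat the following as an illustration of the general ‘move and stretch’ idea only. It is a minimal sketch of dynamic time warping (DTW), a textbook technique I am swapping in here, run on made-up loudness values rather than real audio. The names `dtw` and `warp_to_guide` are my own.

```python
def dtw(guide, dub):
    """Return (total cost, path) of the cheapest monotonic matching of two sequences."""
    n, m = len(guide), len(dub)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(guide[i - 1] - dub[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Walk back from the end to recover which dub sample matches which guide sample.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def warp_to_guide(guide, dub):
    """Squeeze or stretch `dub` onto the guide's timeline using the DTW path."""
    _, path = dtw(guide, dub)
    warped = {}
    for gi, dj in path:
        warped.setdefault(gi, dub[dj])  # keep the first dub sample matched to each guide slot
    return [warped[gi] for gi in range(len(guide))]

guide = [0, 1, 3, 1, 0]        # made-up envelope of the lead vocal
dub = [0, 0, 1, 3, 3, 1, 0]    # same phrase, sung later and held longer
total_cost, path = dtw(guide, dub)
aligned = warp_to_guide(guide, dub)
```

In this toy case the dub is just a stretched copy of the guide, so the warp lines it up perfectly. When a section has to be stretched or squeezed a lot, that is where the ‘stretchy’ sound would come from.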

Do we want it? Maybe yes, maybe not. In the end, I worked on it until it sounded like this: Example 3 Master

Furthermore, in example 3 you should also notice the word ‘timing’ differences. These may cause more problems if one is not careful.

The Bottom Line

  • I no longer have to waste my time aligning each syllable of my layered vocals.
  • Working with backing vocals has never been easier.
  • Yes, VocAlign sometimes does this weird stretchy thing to the signal, but so what?  I’m ready to forgive these little flaws if the process of aligning, as a whole, has been reduced to just a few mouse clicks.

This is the band that I used for the above examples. Make sure you check out their YT channel: JetfaceGroup – YouTube

Maciek (Razjel)

Listening Tests

So in the previous article we managed to rip our CD and compress it into really good quality Mp3s. So now it’s the right time to ask questions… Just how good are these Mp3s compared to the sound on a CD? Have we set up the encoder correctly? The most obvious way to find out is to conduct a listening test, but…

But there is a problem.

It turns out that it is quite challenging to compare two audio sources of different quality, timing, and volume. The traditional method of audio listening tests is to play one song followed by another and ask which one sounded better. Interestingly enough, our brains interpret a slight increase in volume as an increase in clarity, which is critical in this situation, as we want to avoid the possibility of confirmation bias. We need a way of switching between two audio sources of different quality at any point in a song, with no delays or changes in timing and volume.

I’ve observed many fierce debates in real life and online, where people insisted that they can hear the difference between 320kbps Mp3s and ones of even slightly lower bit rate. Some say that Mp3s are not good enough for serious audio equipment. Let’s hear if this is true.

I ripped one song from the same CD three times: as an uncompressed wave file, as a 320kbps Mp3 and as a VBR Mp3.


I imported the raw wave file and the 320kbps Mp3 into Audacity, one after the other.


Since the Mp3 compression process had padded the start and end of the file, it was necessary to remove the silence so that both tracks were aligned.


Aligning both tracks is quite slow and tedious, as Audacity does not offer a user-friendly zoom in/out option. However, once you have managed to zoom in far enough to see the individual samples, I suggest picking one that stands out and using it as a guide sample.
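The guide-sample trick can be sketched in a few lines. This is only an illustration with made-up sample values, not what Audacity does internally: it assumes the loudest sample is the same one in both tracks and that the Mp3 only has extra padding (nothing is missing at the start). The names `peak_index` and `trim_padding` are mine.

```python
def peak_index(samples):
    """Index of the sample with the largest absolute value (the 'guide sample')."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))

def trim_padding(padded, reference):
    """Cut `padded` so that its guide sample lines up with the one in `reference`."""
    offset = peak_index(padded) - peak_index(reference)
    if offset < 0:
        raise ValueError("padded track starts later than the reference")
    return padded[offset:offset + len(reference)], offset

wave = [0, 3, 9, -2, 1]                    # made-up uncompressed samples
mp3 = [0, 0, 0, 0, 3, 9, -2, 1, 0, 0]      # same audio with encoder padding
trimmed, offset = trim_padding(mp3, wave)
```

Here the Mp3 turns out to be padded by 3 samples, and after trimming both tracks line up sample for sample.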


Now the interesting part kicks in. I selected the whole region of the 320kbps Mp3 and inverted it. It means that whenever the waveform had previously gone up it now is going down, and vice versa.


Then I selected the uncompressed wave file together with the inverted Mp3, and clicked Mix and Render.


This action created a mix file that represents the difference between our uncompressed and compressed files. I think it’s really interesting to see this pretty digital representation of the chunk of data that we lose completely when compressing our CD files. You can hear what this sounds like here. Cool, right? Does it make a difference?


Then I imported a 320kbps Mp3 into Audacity. In the screenshot above you can see very clearly (at the same zoom setting on both tracks) the difference in size between the two files. After this, I did the sample alignment again and hit the play button. Both tracks played simultaneously and sounded like the source CD. Mathematically, it was the source CD: the difference file plus the compressed file equaled the source file. By muting the difference file at any point in the song, I could hear, without distortion or changes in volume, what the difference in quality is after compressing a wave file into a 320kbps Mp3.
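The invert, mix and recombine steps boil down to simple sample-by-sample arithmetic. Here is a minimal sketch with made-up integer samples (real tracks would have millions of them). The helper names `invert` and `mix` are mine and only mimic what the Audacity commands do to the numbers.

```python
def invert(samples):
    """Flip the waveform: every positive sample becomes negative and vice versa."""
    return [-s for s in samples]

def mix(track_a, track_b):
    """Sum two aligned tracks sample by sample."""
    return [a + b for a, b in zip(track_a, track_b)]

original = [0, 1200, -800, 300, -50]       # made-up uncompressed samples
compressed = [0, 1190, -790, 310, -60]     # made-up decoded Mp3 samples

# Original plus inverted Mp3 leaves only what the compression changed.
difference = mix(original, invert(compressed))

# Playing the Mp3 together with the difference track gives the original back.
restored = mix(compressed, difference)
```

Muting the difference track is the same as listening to `compressed` alone, which is why the switch happens with no change in timing or volume.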

I listened, and listened, but there was no difference to be heard. Don’t believe me? See if you can spot the points where the quality changes.

Listen here.

Did you hear any difference? No? Yes? Maybe? Try it again and listen for an increase in quality at 5 seconds which disappears at 16. Using modern Mp3 encoding methods, a 320kbps Mp3 sounds very much like the original (at least to my ears).


I repeated the same test using a VBR Mp3 instead of the normal 320kbps one. As you can see, the difference file is smaller than the one for the normal 320kbps Mp3. You can listen to it and see if there is any detectable change in playback quality.

Listen here.

The quality should increase slightly after 5 seconds, then drop off at 16. I cannot hear any difference here either. So does it mean that the 320kbps Mp3 sounds as good as an uncompressed CD .wav format?

In order to make a more obvious comparison, I did the same test using a 56kbps Mp3 and the differences were, as expected, humongous. The difference track was huge. You could really see how much audio was lost in compression. The 56kbps audio sounded horrible, but as soon as I unmuted the difference track it suddenly sounded like the CD once again. You can hear it for yourself here.

Listen here.

In the end, I have to say that it is (almost) impossible to tell the difference between a song as an uncompressed wave file and a song that has been compressed as 320 VBR. I used DT 770s and my M-Audio interface for these tests, and of course all results were confirmed using my teenage ears. Furthermore, the 320 VBR file I used weighed 8.2 MB, the normal CBR Mp3 weighed 10.3 MB, and the uncompressed wave weighed 45.4 MB. This shows that the VBR method provides great results while taking about 20% less space than the normal CBR method in this example (put the other way round, the CBR file is 25.6% bigger).
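The percentage depends on which file you treat as the baseline, so here is the arithmetic spelled out using only the three file sizes quoted above. The variable names are mine.

```python
vbr_mb = 8.2     # 320 VBR Mp3
cbr_mb = 10.3    # 320kbps CBR Mp3
wav_mb = 45.4    # uncompressed wave

# How much smaller the VBR file is, relative to the CBR file.
vbr_saving_pct = (cbr_mb - vbr_mb) / cbr_mb * 100

# How much bigger the CBR file is, relative to the VBR file.
cbr_extra_pct = (cbr_mb / vbr_mb - 1) * 100

# How many times smaller the VBR file is than the wave file.
wav_to_vbr_ratio = wav_mb / vbr_mb
```

So the VBR file is about 20.4% smaller than the CBR one, the CBR file is about 25.6% bigger than the VBR one, and the wave file is roughly 5.5 times the size of the VBR file.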

I think many people are not 100% honest with themselves when they say that all Mp3s sound worse than CDs. This goes especially for people with worn out ears who speak such heresies. On the other hand, I’ve read a few interesting posts where audiophiles argued that the average person doesn’t have the knowledge or experience to verbalize the fact that what they hear is an MP3. I don’t think that is the case in the kind of tests I conducted, as I tried it on a few other recordings, and all the findings were the same. However, it may vary in more surgical audio situations, where a trained ear that knows exactly what to look for in the high frequencies will be able to distinguish between a wave file and an Mp3. Often, it will be the ‘air’ of the high frequencies that may provide some small differences between a good quality MP3 and a .wav file.

Speaking of recording, mixing and mastering though… I would never include good quality Mp3 files in my mixes because I believe there are certain situations that ask for certain tools. I’d try to keep my recordings the highest quality possible, but that also may change depending on a final, envisioned version of a song I’ll mix.

Again, Mp3 algorithms have improved drastically over the past decade. We no longer suffer from tons of audio errors if an Mp3 is ‘only’ 192kbps. In fact, I’m positive that dropping to 128kbps would still be acceptable for many people, especially if played on poor playback platforms such as iPods or similar, where listening conditions aren’t great anyway. I think we should admit, however, that correctly compressed, high quality Mp3s are good enough to satisfy the needs of serious music listeners: listeners who strive for really good quality sound that can be reproduced through some serious equipment.

I’m also attaching this little section here, as I find it really fascinating how the same file compression comparison test can be shown visually.

So here is an alternative way of looking at the subject of compression.

Lossy Audio Codec’s Comparison

Another interesting visual example.

How to check quality of Mp3 file

And last, but definitely not least… for those who are interested in Recordingreview.com, I’m posting this link to a really interesting debate on this subject.

The endless mp3 vs wav debate: Randomized Blind Listening Test

I’d like to thank BugbrainNobax, Walter, and Jax for inspiring me to conduct this experiment.


Audio Ripping and Compression

Recently I’ve been really interested in the whole talk about audio quality. I really value ‘good’ quality recordings and cherish the moments when I have access to these, but I was never too obsessed about it. Everyone will have their own interpretation of what sounds good anyways. Ha! And that is what’s really interesting.

While I do not lament when I sometimes listen to 128kbps Mp3 files, I believe it is important to be able to compress audio files correctly because the poorly processed ones can ruin our ears forever… Or, for example, negate the advantages of an expensive speaker set that we may use.

So I will present a method of audio ripping that will leave anyone with really good quality compressed files, as well as a little experiment which will hopefully, once and for all, save me from wasting my time thinking about differences in quality between an uncompressed audio file and a ‘correctly’ compressed Mp3 file.

The first step is to find a CD that you would like to rip and then check it for any possible scratches. CDs are made from polycarbonate plastic (Makrolon), which is a sturdy and strong material that sadly scratches quite easily. The designers saw that coming and planned for it. The Red Book standard for audio Compact Discs specifies an error correction system that allows CD players to reconstruct a surprising amount of data based on the data which comes before and after a scratch.

Furthermore, the data is not stored on the shiny part of the CD, but is actually stored on the label side (the program area). The CD player’s laser has to make its way through a layer of plastic till it reaches the disc’s data. So after all, a small scratch on the back surface of a CD is not as dangerous as it may seem.

OK, so if you got this far and think that a few facts about the program area of a CD will spoil your reading, then just skip to the next paragraph… The program area of a CD also conforms to the Red Book standard. Data is stored in a series of tiny indentations known as “pits” (which look like bumps from the side the laser reads). These pits are located in a single spiral track of TDM (time-division multiplexing) data. The pits come in 9 different lengths, from 0.833μm to 3.56μm. The varying length of, and distance between, the pits is how the digital audio information is stored in NRZI (non-return-to-zero inverted, a line code in which every pit edge marks a 1 and everything in between is a 0; the audio itself is Pulse Code Modulation (PCM) data).
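To make the NRZI idea a bit more concrete, here is a minimal sketch. It is a generic illustration of the line code, not a model of a real CD channel (which adds further coding on top): a 1 flips the level, think of it as a pit edge, and a 0 keeps it. The function names and the bit pattern are made up.

```python
def nrzi_encode(bits, level=0):
    """Turn bits into levels: a 1 toggles the level (an edge), a 0 holds it."""
    levels = []
    for bit in bits:
        if bit == 1:
            level ^= 1
        levels.append(level)
    return levels

def nrzi_decode(levels, start_level=0):
    """Recover bits: any change of level is a 1, no change is a 0."""
    bits, previous = [], start_level
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits

bits = [1, 0, 0, 1, 0, 0, 0, 1]
levels = nrzi_encode(bits)       # runs of equal levels play the role of pits and lands
decoded = nrzi_decode(levels)
```

Notice that the information sits in where the level changes, not in the level itself, which is why the lengths of the pits and the gaps between them matter.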

So, once you’ve chosen a CD that you are satisfied with, you will need a piece of software that will allow you to rip and compress it. I’m using Switch, which works on both PC and Mac. Once you have opened any reasonable audio file converter, I would take a moment to poke around the encoding tab. Mine looks like this.


These are the settings I’d use for creating high quality Mp3s. I’ll write a little bit about the details after.

The encoder itself is the application that encodes audio into Mp3 files so that they take far less storage space. This is called lossy compression. On that note, many people think compression always equals a reduction in data and quality, as with, say, Mp3, MPEG, DTS and JPEG. Well, this is not always the case. A few examples of lossless compression: ZIP, RAR, TIFF, MLP. But back to the encoder. The encoder offers two different bit rate modes, namely CBR and VBR (see the settings above).

CBR stands for constant bit rate, which basically means that you tell the encoder to express every second of audio with a certain number of bits, and it does just that. It will give every single second of audio the same amount of space. There is nothing really bad about this method, it is simply very wasteful and produces bigger file sizes than VBR. Hence, somebody smart came up with another method called VBR.
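Because CBR gives every second the same number of bits, the file size follows directly from the bit rate and the length of the track. A quick sketch, assuming a hypothetical 4-minute song and ignoring tags and other small overheads (the function name is mine):

```python
def cbr_size_bytes(bitrate_kbps, duration_s):
    """Approximate CBR file size: bits per second times seconds, divided by 8."""
    return bitrate_kbps * 1000 * duration_s // 8

duration_s = 4 * 60                            # hypothetical 4-minute track
size_320 = cbr_size_bytes(320, duration_s)     # bytes at 320kbps
size_128 = cbr_size_bytes(128, duration_s)     # bytes at 128kbps
```

That comes to 9.6 MB at 320kbps and 3.84 MB at 128kbps, no matter whether the song is four minutes of silence or four minutes of thrash metal.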

VBR stands for variable bit rate and is the most commonly used encoding type. With VBR the encoder considers every section of audio very carefully and decides how many bits it would take to encode it with the desired level of quality. This means that the bit rate can actually drop to almost nothing in the parts where there is silence, shoot up to high hundreds when the song starts, and peak at the limits of the encoder during particularly delicate passages.

In the stereo encoding section, I would leave the mode on joint stereo, as it offers better quality. In joint stereo mode the encoder can spend the available bits more efficiently than in normal stereo mode. There is a technique involved known as joint frequency encoding, which works on the principle of sound localization. It stores one channel as the difference from the other, as opposed to an entire second audio stream.
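The ‘store the difference’ idea is usually explained as mid/side coding. Below is a minimal sketch of that arithmetic with made-up samples. I am assuming this is the flavour of joint stereo meant here, and a real encoder applies it to frequency data rather than raw samples.

```python
def to_mid_side(left, right):
    """Mid is the average of both channels, side is half of their difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def to_left_right(mid, side):
    """Undo the transform: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [4, 6, -2]     # made-up samples, nearly identical in both channels
right = [4, 5, -2]
mid, side = to_mid_side(left, right)
left_back, right_back = to_left_right(mid, side)
```

When both channels are almost the same, as they often are in music, the side channel is mostly zeros and needs very few bits, which leaves more bits for the mid channel.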

Quality is the deceptive one. It basically tells the encoder how many shortcuts it can take during the analysis and compression process. When q=0 the encoder does everything super accurately. When you set it to 9 it will do everything in a more ‘clumsy’ manner. The technology has moved on quite a lot since 1993, so nobody would really use low quality encoding today.

CRC stands for cyclic redundancy check and is an error-detecting code commonly used in digital networks and storage devices. I don’t know what real advantage it would bring to this compression process, so leaving it alone and unchecked should be completely acceptable.
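For the curious, this is roughly how a CRC catches errors. The sketch uses the CRC-32 from Python’s standard `zlib` module purely to show the principle. Mp3 frames use a shorter 16-bit checksum, so this is not the exact check the encoder would write, and the ‘frame’ bytes are made up.

```python
import zlib

def is_intact(data, expected_crc):
    """Recompute the checksum and compare it with the stored one."""
    return zlib.crc32(data) == expected_crc

frame = b"123456789"                  # stand-in for a chunk of audio data
stored_crc = zlib.crc32(frame)        # checksum saved alongside the data

# Flip a single bit in the first byte to simulate a read error.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

ok_before = is_intact(frame, stored_crc)
ok_after = is_intact(corrupted, stored_crc)
```

The check can tell you that something went wrong, but it cannot repair the data, which is probably why switching it off makes no audible difference.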

If I’m using VBR then I want the encoder to have access to the widest range of bit rates possible. The minimum should be at 32 and the maximum should be at 320.

The output sample rate is not shown in this window, but I know it is going to be the original CD sample rate which, again thanks to the Red Book standard, is 44.1kHz (44,100 samples per second).
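It is worth seeing what that sample rate means in bits per second. A CD holds 44,100 samples per second (44.1kHz), each 16 bits, in two channels. The 16-bit stereo part is standard for audio CDs but is not shown in the settings window. The variable names are mine.

```python
sample_rate_hz = 44_100    # CD sample rate
bit_depth = 16             # bits per sample on an audio CD
channels = 2               # stereo

# Bits needed for every second of uncompressed CD audio.
pcm_bitrate_bps = sample_rate_hz * bit_depth * channels
pcm_bitrate_kbps = pcm_bitrate_bps / 1000

# How many times more data that is than the highest Mp3 bit rate.
ratio_to_320 = pcm_bitrate_kbps / 320
```

So uncompressed CD audio runs at 1411.2kbps, roughly 4.4 times the bit rate of even a 320kbps Mp3.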

OK, let’s get down to business.


Above you can see all my loaded tracks in .aiff file format. All that’s left to do is to click convert and all the magic will be done for us.


The conversion of all the files will take a while. After the conversion finishes, you are done. You’ve just converted lossless CD files into ‘very good’ quality lossy files. OK, but that’s still just talk. So in the next post I’ll test how good they actually are when compared.