Everything is Out of Tune

What does it mean for an instrument to be in tune? The simple answer is that it produces the correct notes when played. So how do we decide what notes are the correct ones?

I have never learned to play an instrument and have no background in music theory whatsoever. What I do have is enough hubris to make Stockton Rush raise an eyebrow.

So who better than me to explain the string theory underlying all of music?

I'll learn about it the worst possible way, by starting from first principles. I'll build out my own set of musical notes from scratch!

Warning: the following contains math

What is a note?

A note is a consistent, repeating pattern of vibration that our ears hear as a sound. How quickly the pattern repeats is known as the frequency (or pitch in music speak).

Frequency is usually expressed in units of Hertz or Hz for short. 1 Hz simply means that the pattern repeats 1 time every second. 100 Hz means that the pattern repeats 100 times every second. Human beings can hear frequencies from about 20 Hz up to around 20 kHz.

The core note

There's a single note that most modern musicians can agree on.* The correct frequency of the A4 note is 440 Hz as officially determined by ISO 16. I'm sticking with that, because I ain't gonna argue with the guys who write the standards for literally everything you could possibly imagine. These folks decide everything from the official method for determining how rubbery rubber is, to the shape of a wine glass, to how to wire up an airplane. They're a whole other post.

This 440 Hz note serves as the basis for all other notes and is referred to as the tuning note. I'll designate it as note zero or N0 for short.

Here's my list of notes so far:

N0 = 440 Hz

*footnote: there are those who wish to replicate the musical performances of the past as precisely as possible. They set A4 to whichever frequency is deemed most historically accurate.

Get ratioed

Now I've got a core note, but I'll need more if I want to compose anything interesting.

I've got to find some more frequencies that sound good together. How do I decide that? I can look at what notes humans naturally settle on when singing together, or when playing with instruments that don't have fixed notes such as trombones or fretless stringed instruments like the violin.

When multiple notes are played together, what matters is the ratio between their frequencies. Humans tend to like it when overlapping notes have frequencies related by small whole number ratios like 3/2 or 5/4.

For example, if I played a tone at 200 Hz, then playing a tone at 300 Hz at the same time would sound nice because 200 * 3 / 2 = 300.
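To make the ratio idea concrete, here's a minimal Python sketch that reduces two frequencies to their simplest whole number ratio. The helper name `interval_ratio` is made up for illustration.

```python
from fractions import Fraction

def interval_ratio(low_hz, high_hz):
    """Return the frequency ratio high/low as a reduced fraction."""
    return Fraction(high_hz) / Fraction(low_hz)

print(interval_ratio(200, 300))  # 3/2
print(interval_ratio(440, 550))  # 5/4
```

Using exact fractions instead of floats keeps the ratios readable as the small whole numbers our ears supposedly care about.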

Doubles and halves

The whole number ratio with the smallest numbers that produces two different notes is 2/1.

This gives me an elegant way to get started. I can double or halve the core frequency N0, giving me a new note with that optimal 2/1 ratio. I'll repeat this process to fill out the whole range of human hearing!

Here we simply multiply 440 by 2 to the power of the designator number. So N-3 = 440 Hz * 2^-3 = 55 Hz.

N-4 = 27.5 Hz

x2/1
N-3 = 55 Hz

x2/1
N-2 = 110 Hz

x2/1
N-1 = 220 Hz

x2/1
N0 = 440 Hz

x2/1
N1 = 880 Hz

x2/1
N2 = 1,760 Hz

x2/1
N3 = 3,520 Hz

x2/1
N4 = 7,040 Hz

x2/1
N5 = 14,080 Hz
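The list above can be reproduced with a few lines of Python. This is just a sketch of the doubling rule, and `octave_note` is a hypothetical helper name.

```python
A4_HZ = 440.0  # the tuning note, N0

def octave_note(n):
    """Frequency of note N<n>: the tuning note doubled (or halved) n times."""
    return A4_HZ * 2.0 ** n

# Walk from N-4 up to N5, the same range as the list.
for n in range(-4, 6):
    print(f"N{n} = {octave_note(n):,.1f} Hz")
```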

Musical people call these "octaves" and I'm wondering what the number 8 has to do with anything. I assume I'll get there eventually.

Two is company but three is a scale

This still doesn't give me enough notes to compose anything worth hearing. I've got to come up with a method to add more notes in between each of these octaves.

The logical place to start is the next smallest ratio between 1/1 (1) and 2/1 (2), the seemingly straightforward 3/2 (1.5). Music theory people call this a fifth, despite a lack of fives in the ratio. Their naming conventions continue to baffle me.

I've just included two "octaves" for simplicity. The ratio in parentheses shows the note's ratio to the octave note below it.

N-1 = 220 Hz (1/1)

x3/2
N-1.1 = 330 Hz (3/2)

x4/3
N0 = 440 Hz (1/1)

x3/2
N0.1 = 660 Hz (3/2)

x4/3
N1 = 880 Hz (1/1)

Now my list of notes has a note in between each octave. But you may have note-iced that something interesting happened. The 3/2 ratio decided to bring their partner 4/3 to this party.

4/3 can be thought of as 2/3rds of the octave note above it. A sort of matching pair to the 3/2 ratio. Another way to think about 4/3 is that it is 3/2 flipped and doubled (3/2 -> 2/3 -> 4/3).
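The flip-and-double trick is easy to sketch in Python. The function name `partner` is my own label here, not a music theory term.

```python
from fractions import Fraction

def partner(ratio):
    """Flip a ratio and double it, so a ratio between 1 and 2 stays between 1 and 2."""
    return 2 / Fraction(ratio)

print(partner(Fraction(3, 2)))  # 4/3
print(partner(Fraction(4, 3)))  # 3/2, the pairing works in both directions
```

A ratio multiplied by its partner always comes out to exactly 2, which is why stepping by one and then the other lands on the next octave note.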

May as well invite her to the club!

I'll call 3/2 NX.1a and 4/3 NX.1b

N-1 = 220 Hz (1/1)

x4/3
N-1.1b = 293.33 Hz (4/3)

x9/8
N-1.1a = 330 Hz (3/2)

x4/3
N0 = 440 Hz (1/1)

x4/3
N0.1b = 586.67 Hz (4/3)

x9/8
N0.1a = 660 Hz (3/2)

x4/3
N1 = 880 Hz (1/1)

Would you look at that! We've got another guest at our party. The congenial fellow 9/8. Let's give him a plus one so he can bring his partner, 16/9. (AKA 9/8 -> 8/9 -> 16/9).

The more the merrier and musical...er? Welcome to the club bud!

N-1 = 220 Hz (1/1)

x9/8
N-1.2a = 247.5 Hz (9/8)

x32/27
N-1.1b = 293.33 Hz (4/3)

x9/8
N-1.1a = 330 Hz (3/2)

x32/27
N-1.2b = 391.11 Hz (16/9)

x9/8
N0 = 440 Hz (1/1)

x9/8
N0.2a = 495 Hz (9/8)

x32/27
N0.1b = 586.67 Hz (4/3)

x9/8
N0.1a = 660 Hz (3/2)

x32/27
N0.2b = 782.22 Hz (16/9)

x9/8
N1 = 880 Hz (1/1)

One more pair of guests, 32/27 and 27/16 and then I think that's enough. Once I add them I'll have a nice symmetrical scale with 6 notes between every octave.

N-1 = 220 Hz (1/1)

x9/8
N-1.2a = 247.5 Hz (9/8)

x256/243
N-1.3a = 260.74 Hz (32/27)

x9/8
N-1.1b = 293.33 Hz (4/3)

x9/8
N-1.1a = 330 Hz (3/2)

x9/8
N-1.3b = 371.25 Hz (27/16)

x256/243
N-1.2b = 391.11 Hz (16/9)

x9/8
N0 = 440 Hz (1/1)

x9/8
N0.2a = 495 Hz (9/8)

x256/243
N0.3a = 521.48 Hz (32/27)

x9/8
N0.1b = 586.67 Hz (4/3)

x9/8
N0.1a = 660 Hz (3/2)

x9/8
N0.3b = 742.5 Hz (27/16)

x256/243
N0.2b = 782.22 Hz (16/9)

x9/8
N1 = 880 Hz (1/1)
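As a sanity check on the table above, here's a small Python sketch that recomputes each frequency and each step between neighbors from the seven ratios. The names `SCALE`, `frequencies`, and `steps` are made up for this example.

```python
from fractions import Fraction

# The seven ratios collected so far, relative to the octave note below them.
SCALE = [Fraction(1), Fraction(9, 8), Fraction(32, 27), Fraction(4, 3),
         Fraction(3, 2), Fraction(27, 16), Fraction(16, 9)]

def frequencies(octave_hz, ratios=SCALE):
    """Frequency of every note in one octave, starting at octave_hz."""
    return [octave_hz * float(r) for r in ratios]

def steps(ratios=SCALE):
    """Ratio between each note and the next, ending with the step up to 2/1."""
    upper = ratios[1:] + [Fraction(2)]
    return [hi / lo for lo, hi in zip(ratios, upper)]

for ratio, hz in zip(SCALE, frequencies(220)):
    print(f"{str(ratio):>5} -> {hz:.2f} Hz")
print([str(s) for s in steps()])
```

Only two step sizes show up, 9/8 and 256/243, matching the multipliers written between the notes.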

I did it! I'm a genius! I created a musical scale from first principles! I must let all the musicians know about this tuning method! - What's that? They already know? Pythagoras wrote about it 2,500 years ago and they named it after him even though he didn't invent it?

Filling in the gaps

Okay so maybe I didn't invent it. But I may as well see it through. Looking at the size of the gaps between notes, the pace is wildly uneven. Just look at the massive holes between notes on a logarithmic plot. I've marked the original octave notes in red and the notes that were inserted between them in blue.

Hey!

I've finally figured out where the names "octave" and "fifth" come from!

Looking at these ratios with fresh eyes, the mathematical rules become clear. Every single note can be described as 2^N * 3^M where N and M are whole numbers (negative ones included). The whole party is made entirely of the relatives of 3/2, otherwise known as 2^-1 * 3^1.

I warned you there would be math.

For example 9/8 = 2^-3 * 3^2, 32/27 = 2^5 * 3^-3, etc. Even our octave notes can be accessed this way. To get double the frequency, simply use the ratio 2^1 * 3^0!
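Here's a rough Python check of that claim. The hypothetical helper `two_three_exponents` strips the factors of 3 and 2 out of a ratio and reports the exponents, or `None` if anything else is left over.

```python
from fractions import Fraction

def two_three_exponents(ratio):
    """Return (N, M) with ratio == 2**N * 3**M, or None if no such pair exists."""
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    m = 0
    # Pull factors of 3 out of the numerator, then out of the denominator.
    while ratio.numerator % 3 == 0:
        ratio /= 3
        m += 1
    while ratio.denominator % 3 == 0:
        ratio *= 3
        m -= 1
    n = 0
    # Same again for factors of 2.
    while ratio.numerator % 2 == 0:
        ratio /= 2
        n += 1
    while ratio.denominator % 2 == 0:
        ratio *= 2
        n -= 1
    return (n, m) if ratio == 1 else None

print(two_three_exponents(Fraction(9, 8)))    # (-3, 2)
print(two_three_exponents(Fraction(32, 27)))  # (5, -3)
print(two_three_exponents(Fraction(2, 1)))    # (1, 0)
```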

So all I've got to do is continue the same pattern to fill in the gaps and everything will be perfect!

I'll invite 256/243 (who we've already met) and their partner 243/128. I'll also bring in 81/64 and her pair 128/81. That fills 4 out of the 5 gaps in the scale, just one gap left! But wait a minute. If every ratio comes with a partner, how do we get only one? The ratio pair that best fits the last remaining gap is 1024/729 and 729/512. Here's what happens when we put all the ratios I mentioned onto our scale:
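One way to list all of those ratios at once is to take 3^M for M from -6 to 6 and fold each result into a single octave. This Python sketch does that, and the helper names `reduce_to_octave` and `three_power_notes` are my own invention.

```python
from fractions import Fraction

def reduce_to_octave(ratio):
    """Double or halve a ratio until it falls in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def three_power_notes(max_power):
    """Ratios 2**N * 3**M for M in -max_power..max_power, folded into one octave."""
    return sorted(reduce_to_octave(Fraction(3) ** m)
                  for m in range(-max_power, max_power + 1))

notes = three_power_notes(6)
print(len(notes))  # 13 candidates for 12 slots
print(", ".join(str(r) for r in notes))
```

The sorted list has 1024/729 and 729/512 sitting right next to each other, which is the one-gap-two-guests problem.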

Now I've got to break up the couple? Just arbitrarily pick one that's allowed into the club and one that is kicked to the curb? That doesn't seem very polite or mathematical. Not to mention those ratios have pretty big numbers. The whole point was to keep them small.

For example, the ratio we get when we play a note using the 81/64 ratio with the 32/27 ratio is 2187/2048. Not exactly the sort of small whole numbers we are looking for.

Well at least the scale is starting to look like something familiar...

Let's try a slightly different approach.

Observant pianists may notice that our core N0 (A4) note is actually resting on the D note. Had we started with D instead of A, as the Pythagorean scale does, it would match your lettered expectations.

Pushing the limit

The scale that I, or should I say Pythagoras, or should I say some ancient Mesopotamians invented is one that is entirely based on the 3/2 ratio.

The 3/2 ratio is where the twelve notes in the octave come from. If we start at an octave note and step by a ratio of 3/2 twelve times, dividing our result by 2 whenever the overall ratio is more than 2, we'll end up pretty darn close to our starting point (off by about 1.4%).
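The near miss after twelve steps is quick to verify. Here's a minimal Python sketch, with `stack_fifths` as a made-up helper name.

```python
from fractions import Fraction

def stack_fifths(count):
    """Step up by 3/2 `count` times, halving whenever the ratio reaches 2 or more."""
    ratio = Fraction(1)
    for _ in range(count):
        ratio *= Fraction(3, 2)
        while ratio >= 2:
            ratio /= 2
    return ratio

leftover = stack_fifths(12)
print(leftover)         # 531441/524288
print(float(leftover))  # roughly 1.0136, so not quite back to 1
```

Since no power of 3 is ever exactly a power of 2, the walk can get close to the starting note but never land on it.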

But since 2 and 3 are the only numbers used, we miss out on all of the other prime numbers and their ratios. The next prime is 5, so how about instead of building a scale around just 2 and 3, we add 5 to the mix?

Now instead of our ratios all defined as 2^N * 3^M, we tack on 5^Q to make 2^N * 3^M * 5^Q determine our ratios.

Let's see what it looks like when we take every combination of M between -2 and 2 and Q between -1 and 1. I'll use N to divide or multiply by two (same as what I did earlier) to ensure the ratio falls into an octave, with a value between 1 and 2.

For example for M = -2 and Q = -1, we get 2^N * 3^-2 * 5^-1 = 2^N / (3 * 3 * 5) = 2^N / 45. The value of N that ensures 2^N / 45 is between 1 and 2 is 6. This gives us 2^6 / 45 = 64/45.
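The whole grid of M and Q combinations can be generated in Python along these lines. The helper names and the "too close" cutoff of 51/50 are assumptions picked just for this sketch.

```python
from fractions import Fraction

def reduce_to_octave(ratio):
    """Double or halve a ratio until it falls in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def five_limit_ratios(max_m=2, max_q=1):
    """Every 2**N * 3**M * 5**Q with |M| <= max_m and |Q| <= max_q, in one octave."""
    ratios = set()
    for m in range(-max_m, max_m + 1):
        for q in range(-max_q, max_q + 1):
            ratios.add(reduce_to_octave(Fraction(3) ** m * Fraction(5) ** q))
    return sorted(ratios)

ratios = five_limit_ratios()
print(len(ratios), [str(r) for r in ratios])

# Neighbors whose ratio to each other is under 51/50 (an arbitrary cutoff).
close_pairs = [(lo, hi) for lo, hi in zip(ratios, ratios[1:])
               if hi / lo < Fraction(51, 50)]
print([(str(lo), str(hi)) for lo, hi in close_pairs])
```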

It'll be nice to unlock a whole new set of ratios. It will open up a world of compositional possibility!

Here's what that looks like laid out on a log scale graph. This time instead of frequencies we'll just mark the octaves as 1 and 2.

Aw man... I'm right back to having to throw out notes that are too close together. Since that problem appears to be unavoidable, I've got to figure out a logical approach to choosing who gets in and who doesn't.

Pursuit of Justness

What happens if I make the decision based on whichever ratio is made up of smaller whole numbers? We know that the smaller the numbers in the ratio between two notes, the better it sounds to human ears. Music theory folks call this type of small ratio "just". 3/2 is juster than 5/3 which is juster than 7/4.

Let's see what happens when I pick the most just notes for my scale. The first two notes that are too close together are 10/9 and 9/8. So can I simply pick 9/8? Well yes, but it introduces another complexity. Another two notes that are too close are 16/9 and 9/5. 16/9 is the pair to 9/8, who we've already selected. However it is a larger numbered ratio than 9/5. If I choose 9/5 it will break the symmetry of my scale.

Screw it. Let's go all in. Not only will I break symmetry by choosing the purest ratios, but I will start introducing even higher prime numbers to get me even purer ratios! We'll throw in 7, 11, and even 13! Anything to get the purest possible notes. Instead of 9/8 I'll use 8/7, instead of 15/8 I'll use 13/7, etc.

Here's what I've got for the scale of justness:

1/1
14/13
8/7
6/5
5/4
4/3
7/5
3/2
8/5
5/3
7/4
13/7
2/1

And here is what it looks like on the logarithmic plot:

I finally found the most "correct" scale! If the goal is to have the smallest possible ratios, I've done it!

But something is bothering me. The gaps between the notes are all different and there's hardly any pattern to them. This seems to be a big problem. If I want to shift a lil' ditty up or down a few notes it will sound totally different due to the massively different gap sizes. Is there any way to solve this problem?
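To put numbers on the uneven gaps, here's a Python sketch that measures every step in the scale of justness and then tries shifting a four-note jump up by one note. The names `JUST_SCALE`, `step_ratios`, and `jump` are hypothetical.

```python
from fractions import Fraction

JUST_SCALE = [Fraction(n, d) for n, d in [
    (1, 1), (14, 13), (8, 7), (6, 5), (5, 4), (4, 3), (7, 5),
    (3, 2), (8, 5), (5, 3), (7, 4), (13, 7), (2, 1)]]

def step_ratios(scale):
    """Ratio between each note and the one after it."""
    return [hi / lo for lo, hi in zip(scale, scale[1:])]

def jump(scale, start, span):
    """Ratio heard when jumping `span` notes up from position `start`."""
    return scale[start + span] / scale[start]

for lo, step in zip(JUST_SCALE, step_ratios(JUST_SCALE)):
    print(f"{str(lo):>5} -> x{str(step):<6} ({float(step):.4f})")

print(jump(JUST_SCALE, 0, 4))  # 5/4
print(jump(JUST_SCALE, 1, 4))  # 26/21, same jump one note higher
```

The same four-note jump is a clean 5/4 from the octave note but a clunky 26/21 from its neighbor, which is exactly why a shifted tune sounds different.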

Rationalizing the irrational choice

There's a way to get nice even gaps, but it will mean giving up on those beautiful ratios.

To divide the octaves into logarithmically equal steps, I can simply use 2^(N/12) where N is the note number, counting up from 0 at the octave note.
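In Python the equal step rule is nearly a one-liner. This sketch assumes the 440 Hz tuning note as the starting point, and `equal_tempered` is a made-up helper name.

```python
A4_HZ = 440.0  # the tuning note used throughout this post

def equal_tempered(steps_from_base, base_hz=A4_HZ):
    """Frequency of the note `steps_from_base` equal steps above (or below) base_hz."""
    return base_hz * 2.0 ** (steps_from_base / 12)

# One full octave, from the tuning note up to its double.
for n in range(13):
    print(f"{n:2d}: ratio {2.0 ** (n / 12):.4f}, {equal_tempered(n):.2f} Hz")
```

Every neighboring pair of notes has the same ratio, 2^(1/12), so a jump of a given number of notes sounds the same no matter where it starts.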

The dots in blue show the maximally just scale that uses 12 divisions of the octave, while the dots in red show the equally spaced scale.

This new "equally tempered" (as the music theory people say) scale has a wonderful property. Any composition can be moved up or down any number of notes and still have the same ratio between each of the notes.

Justness of Equality?

However, it comes at great cost. Now all of my notes are related by irrational numbers. The beautiful sound of a perfect ratio will never be heard within an equally tempered scale (excepting those doubling octave steps).

When musicians want to move music around the scale, this is a fantastic system. It allows for songs to be shifted up and down and therefore played in many different ways on many different instruments. But for those of exceptional ear, it transforms masterful melodies into misaligned messes.

Take the 5/4 ratio as an example. It is 4 notes above the octave note in our 12 step scale. In just tuning, the note 5/4 above our 440 Hz root note would simply be 550 Hz. In equal temperament the note would be 554.37 Hz. For mere mortals such as myself, the difference is hardly noticeable. To a person with perfect pitch, however, it sounds annoyingly inharmonious.

Listen for yourself and see if you can tell them apart. It may seem subtle when playing them on their own, but the difference is obvious when playing both the just and equal notes at the same time. A 4.37 Hz "beat" frequency forms from the difference between the two notes. If two musicians using two different tuning systems try to play together, those beat frequencies will pop up everywhere and make everything sound wrong and warbly.
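The beat frequency is just the gap between the two versions of the note. A minimal Python sketch of that arithmetic, with variable names of my own choosing:

```python
ROOT_HZ = 440.0

# The "same" note in each system: 5/4 above the root, or 4 equal steps above it.
just_third = ROOT_HZ * 5 / 4
equal_third = ROOT_HZ * 2.0 ** (4 / 12)

# When both sound together, the volume wobbles at the difference frequency.
beat_hz = abs(equal_third - just_third)

print(f"just: {just_third:.2f} Hz")    # just: 550.00 Hz
print(f"equal: {equal_third:.2f} Hz")  # equal: 554.37 Hz
print(f"beat: {beat_hz:.2f} Hz")       # beat: 4.37 Hz
```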

So how do I choose a tuning method? Which one is best?

On one hand, just tuning provides the purest, most stable combinations of notes. On the other hand, equal temperament tuning gives greater compositional flexibility, even if it sounds a little off to the well-trained ear.

Turns out music theory dorks have been fighting over this exact question for literally hundreds of years. There's no way I'm going to be able to settle it!

My arduous quest to determine what it means for an instrument to be in tune has come to an end with an enormously unsatisfying answer:

"it depends"

Oh you're still here? You actually waded through all that math and want more? Then here's one additional tuning headache, as a treat.

Higher Harmonic Hassle

Physics itself provides one last twist to my tuning torture.

The reason a string or a pipe sounds so nice when it vibrates is because it creates overtones. The note we hear is not just a single frequency that corresponds to the length of the string or tube, but also every higher harmonic multiple. These are called overtones.

For example if I were to pluck a string whose length corresponded to a note at 220 Hz, it would also create sound at 440 Hz, 660 Hz, 880 Hz, etc. Creating a lovely sound full of lovely ratios.
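Here's a small Python sketch of that overtone series for an ideal string, along with the ratios between neighboring overtones. The helper names `harmonics` and `neighbor_ratios` are made up for illustration.

```python
from fractions import Fraction

def harmonics(fundamental_hz, count):
    """The first `count` harmonics of an ideal string: whole number multiples."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def neighbor_ratios(count):
    """Ratio between each harmonic and the one below it: 2/1, 3/2, 4/3, ..."""
    return [Fraction(n + 1, n) for n in range(1, count)]

print(harmonics(220, 4))                     # [220, 440, 660, 880]
print([str(r) for r in neighbor_ratios(4)])  # ['2', '3/2', '4/3']
```

The familiar 2/1, 3/2, and 4/3 ratios fall straight out of a single vibrating string.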

This seems great, except now I have to worry about making sure the harmonics are in tune with each other! Surely that will be as simple as keeping the primary note in tune right? Wrong!

As any Pink Floyd fan knows, light separates into many colors when passing through a prism. In a similar way different frequencies of sound separate when passing through a material such as a string. This phenomenon is known as inharmonicity. It makes the overtones a little bit higher in frequency than a simple whole number multiple would suggest. The result is that the overtones of the lowest notes on an instrument may sound out of tune with the highest notes.

The only solution is to slightly decrease the frequency of the lowest notes and/or slightly increase the frequency of the highest notes so that the overtones of the low notes line up with the high notes.
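For a rough feel of the effect, one commonly quoted model for a stiff string puts overtone number n at n * f0 * sqrt(1 + B * n^2), where B is a small inharmonicity coefficient. Both that formula and the value of B used here are assumptions brought in for illustration, so treat this Python sketch as a toy rather than a model of any real instrument.

```python
import math

def stiff_string_partial(f0_hz, n, b):
    """Frequency of overtone n under the assumed model n * f0 * sqrt(1 + b * n**2)."""
    return n * f0_hz * math.sqrt(1.0 + b * n * n)

B = 0.0004  # made-up inharmonicity coefficient, chosen only for illustration

for n in range(1, 5):
    ideal = 220.0 * n
    actual = stiff_string_partial(220.0, n, B)
    print(f"overtone {n}: ideal {ideal:.2f} Hz, stiff string {actual:.2f} Hz")
```

With B set to zero the overtones fall back onto exact whole number multiples. With any positive B they drift sharp, and the drift grows for higher overtones.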

So not only was I unable to find the one true tuning, but even if I had one there is no way to implement it on an analog instrument!

Shoutouts

Big thanks to Michigan Technological University and University of Waterloo for hosting wonderfully comprehensive pages on this topic. I shamelessly stole most of this content from here and also here. These pages stir my deep and abiding love for rustic homemade websites.