Consonant Intervals and Orthogonality

In this article I am going to explore some factors which are involved in the perception of consonance and dissonance of notes in the chromatic and major musical scales. The distance between two notes is called an interval, and there are 12 notes in the western chromatic scale. There are two common methods of calculating the frequencies of these notes.

Equal Temperament

The most common method used in modern times is called Equal Temperament. In this tuning, each note is related to the one before it by the ratio of the 12th root of 2, or:

$2^{1/12} = 1.0594630943592953...$

Starting from A at 440 Hz and calculating each of the 12 notes of the chromatic scale up to the next A looks like this:

Equal Tempered Note Frequencies

r = 2.**(1/12.0)
start_note = 440.0
1.upto(12).map {|interval| start_note *= r }
=> [466.1637615180899, 493.8833012561241, 523.2511306011974,
    554.3652619537443, 587.3295358348153, 622.253967444162,
    659.2551138257401, 698.456462866008, 739.988845423269,
    783.9908719634989, 830.6093951598906, 880.000000000000]

This tuning has the benefit of being able to switch musical keys without retuning your instrument, and allowing different types of instruments to play together. This is the way a guitar, MIDI synthesizer, or piano is usually tuned. The problems with it are that essentially every note frequency besides the Octave is slightly wrong, and when calculated this way even that is a bit wrong due to floating point error and cumulative error of multiplying irrational numbers.
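To put rough numbers on both complaints, here is a small sketch. The variable names are my own, and I am assuming the cent (1/1200 of an octave) as the unit of comparison. It computes each note directly from the tonic rather than by repeated multiplication, and measures how far the equal-tempered fifth sits from the pure 3:2 ratio.

```ruby
# Sketch: compute each equal-tempered note directly from the tonic,
# so rounding error does not accumulate from one note to the next.
tonic = 440.0
direct = (1..12).map { |n| tonic * 2.0**(n / 12.0) }

# Size of an interval in cents (1200 cents = 1 octave).
def cents(ratio)
  1200 * Math.log2(ratio)
end

tempered_fifth = direct[6] / tonic   # 7 semitones above the tonic
pure_fifth     = 3 / 2.0             # the 3:2 ratio of a pure fifth
gap = cents(pure_fifth) - cents(tempered_fifth)

puts direct.last     # the octave, computed in one step
puts gap.round(3)    # how flat the tempered fifth is, in cents
```

The gap comes out at roughly two cents, which is small enough that most listeners accept it in exchange for being able to change keys freely.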

The benefits usually outweigh its problems, but for the purposes of this article I will be using…

Just Intonation

The ratios in Just Intonation are not all equal, but are based on the harmonic series. You find each ratio by multiplying the root note or tonic by increasing whole numbers, and then dividing by a denominator that will bring the frequency back into the octave’s range. Here is a data structure we can use to look up the ratios and names for each degree in the chromatic scale.

Chromatic Scale Lookup Table

start_note = 440.0
Ratios = [
        {:ratio => 1,                   :name => 'Unison'},
        {:ratio => Rational(25,24),     :name => 'Minor Second'},
        {:ratio => Rational(9,8),       :name => 'Major Second'},
        {:ratio => Rational(6,5),       :name => 'Minor Third'},
        {:ratio => Rational(5,4),       :name => 'Major Third'},
        {:ratio => Rational(4,3),       :name => 'Perfect Fourth'},
        {:ratio => Rational(45,32),     :name => 'Diminished Fifth'},
        {:ratio => Rational(3,2),       :name => 'Perfect Fifth'},
        {:ratio => Rational(8,5),       :name => 'Minor Sixth'},
        {:ratio => Rational(5,3),       :name => 'Major Sixth'},
        {:ratio => Rational(9,5),       :name => 'Minor Seventh'},
        {:ratio => Rational(15,8),      :name => 'Major Seventh'},
        {:ratio => 2,                   :name => 'Octave'} ]
Ratios.map{|ratio| start_note * ratio[:ratio].to_f }
=> [440.0, 458.33333333333337, 495.0, 528.0, 550.0, 586.6666666666666,
    618.75, 660.0, 704.0, 733.3333333333334, 792.0, 825.0, 880.0]
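The "multiply by a whole number, then divide back into the octave" step can be sketched as follows. This is only an illustration: `octave_reduce` is a name I made up, and it assumes the denominator is always a power of two, which reproduces only some entries of the table (3/2, 5/4, 9/8, 15/8 and 45/32). Ratios such as 4/3, 6/5 or 25/24 are not produced this way.

```ruby
# Sketch: fold a harmonic (a whole-number multiple of the tonic) back into
# the octave range [1, 2) by halving it repeatedly.
# Assumption: the denominator is a power of two.
def octave_reduce(harmonic)
  ratio = Rational(harmonic, 1)
  ratio /= 2 while ratio >= 2
  ratio
end

[3, 5, 9, 15, 45].each do |h|
  puts "harmonic #{h} -> #{octave_reduce(h)}"
end
```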

This is already producing rational frequencies, which are numbers I can work with mathematically in this example. So, what is going on when an A Major chord (A, C♯, E) sounds consonant and pleasing to people, and what causes a dissonant sound?

A broad answer is that the smaller the whole numbers involved in the ratio, the more pleasing (or even boring) two notes will sound in relation to one another. You can see for yourself that a relatively dissonant interval like the Diminished Fifth, also called the Tritone or Devil’s note, has much higher whole numbers in its numerator and denominator, at 45:32.
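One crude way to look at "how small the whole numbers are" is to add the numerator and denominator of each ratio and sort by that sum. The measure is my own assumption for illustration, not an established rule, and the snippet repeats a few ratios from the lookup table so that it runs on its own.

```ruby
# A few ratios from the lookup table, repeated so the snippet is self-contained.
intervals = {
  'Octave'           => Rational(2, 1),
  'Perfect Fifth'    => Rational(3, 2),
  'Major Third'      => Rational(5, 4),
  'Major Second'     => Rational(9, 8),
  'Minor Second'     => Rational(25, 24),
  'Diminished Fifth' => Rational(45, 32)
}

# Crude "size" of a ratio: numerator plus denominator (an illustrative assumption).
ranked = intervals.sort_by { |_name, r| r.numerator + r.denominator }

ranked.each do |name, r|
  puts "#{name}: #{r.numerator}:#{r.denominator}"
end
```

With this measure the Octave sorts first and the Diminished Fifth last.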

This interval is so interesting sounding, hanging out on the verge of consonance and dissonance, that it is also called the Blue Note, and it plays a large part in the sound of Blues, Jazz, Rock, and Metal.

While working through the math described on the excellent DSP website A Trip on the Complex Plane, I started playing with dot products, and it was mentioned that a pure sine wave at frequency $f$ is orthogonal to a sine wave one octave higher at $2f$. I began to wonder what a dot product reveals about the orthogonality of other intervals besides the octave.

Orthogonality

One specific example of orthogonality that is easy to understand is on a 2D plane. Visually you can see it as two points that are rotated $90^{\circ}$, or equivalently $\pi/2$ radians, from one another about the origin, like the points at $(0, 0.5)$ and $(0.5, 0)$.

[Figure: Orthogonal points]

It’s easy to see in this image, but you can figure out if any two points are orthogonal using a dot product. This is because the dot product of two vectors is equal to the product of their lengths and the cosine of the angle between them. The cosine of $90^{\circ}$, or from now on using radians, $\pi/2$, is $0$, therefore if the dot product of two nonzero vectors is $0$ they are orthogonal.
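The relationship between the dot product and the angle can be checked numerically using $\vec{a} \cdot \vec{b} = |\vec{a}|\,|\vec{b}|\cos\theta$. The helper names `dot`, `magnitude` and `angle_between` are my own, and plain arrays stand in for vectors.

```ruby
# Dot product of two plain arrays treated as vectors.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Length of a vector.
def magnitude(v)
  Math.sqrt(dot(v, v))
end

# Angle in radians, from a . b = |a| |b| cos(theta).
def angle_between(a, b)
  Math.acos(dot(a, b) / (magnitude(a) * magnitude(b)))
end

puts angle_between([0, 0.5], [0.5, 0])   # about pi/2: orthogonal
puts angle_between([1, 0], [1, 1])       # about pi/4: not orthogonal
```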

We can extend Ruby’s Array class to add some methods for working with orthogonality and dot products. There is already a Vector class in Ruby which does this, but it will be easier to show what’s happening by adding methods to Array and using it as a vector.

To compute a dot product, multiply the two vectors element-wise, producing a third vector, then sum all the elements of that result.

$\vec{v} \cdot \vec{u} = \sum\limits_{i=1}^n v_i u_i$

Dot Product

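# Tolerance for floating point comparisons. The exact value here is an
# assumption; any very small number will do.
Epsilon = 1e-9

class Array
    # Multiply element-wise and sum the results. The trailing `rescue 0`
    # guards against pairs with a missing element (arrays of different lengths).
    def dot_product(ary)
        self.zip(ary).inject(0) do |sum, pairs|
            sum += pairs.first * pairs.last rescue 0
        end
    end

    def is_orthogonal?(other)
        self.dot_product(other).abs < Epsilon
    end
end
p1 = [1, 0]
p2 = [0, 1]
p1.is_orthogonal?(p2)
=> true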
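For comparison, the same check can be made with the Vector class mentioned earlier. This is a minimal sketch. As far as I know `Vector` lives in the `matrix` library, which ships as a bundled gem on Ruby 3.1 and later, so the `require` may depend on your installation.

```ruby
require 'matrix'   # provides Vector

v1 = Vector[1, 0]
v2 = Vector[0, 1]
v3 = Vector[1, 1]

puts v1.inner_product(v2)   # 0, so v1 and v2 are orthogonal
puts v1.inner_product(v3)   # 1, so v1 and v3 are not
```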

Audio as N-Dimensional Vectors

So it’s great to be able to calculate if 2D points are orthogonal and all, but what about audio and musical notes? How can we prove that an A note at 440 Hz is orthogonal to an A at 880 Hz, and what about the other notes in the chromatic scale?

A pure note with no overtones being played is just air pressure oscillating back and forth a specific number of times per second, which is its frequency. That can be modeled using a sinusoidal function like sine or cosine. Generating a cosine wave to represent A at 440 Hz can be done with the equation $\cos(2\pi \cdot 440 \cdot t)$ where $t$ is time in seconds.

In the computer we can represent this digitally by sampling the values that come out of the above equation at regular time intervals. With the following code we can specify a ratio, the number of cycles to generate, and the rate at which to sample the cosine function. This effectively makes our digital sampling the same as an N-dimensional vector, similar to the 2-dimensional vectors shown above.

def create_scale_degree(ratio, num_cycles = 1, sample_rate = 32)
    degree = 0.step(num_cycles, Rational(1,sample_rate)).map do |t|
        Math::cos(ratio * 2 * Math::PI * t)
    end
    degree.pop # One sample too many
    degree
end

Instead of using 440 Hz as the tonic, this code just uses 1 Hz to simplify things. It also drops the last sample, because that sample actually belongs to the beginning of the next period of the cosine. Now we should be able to prove what we already knew: two cosine waves that start in phase, one of them an octave higher (double the frequency), are orthogonal to each other.
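To see why the last sample has to go, here is a self-contained sketch. `sample_cosine` and its `include_endpoint:` option are hypothetical names of mine, not part of the code above. It compares the octave's dot product with and without the extra sample.

```ruby
# Hypothetical helper: sample cos(2*pi*ratio*t) at `sample_rate` samples per
# second for `num_cycles` seconds, optionally keeping the final endpoint.
def sample_cosine(ratio, num_cycles, sample_rate, include_endpoint: false)
  count = num_cycles * sample_rate
  count += 1 if include_endpoint
  Array.new(count) { |k| Math.cos(2 * Math::PI * ratio * k / sample_rate.to_f) }
end

def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

tonic  = sample_cosine(1, 1, 5)
octave = sample_cosine(2, 1, 5)
puts dot(tonic, octave)               # very close to 0

tonic_extra  = sample_cosine(1, 1, 5, include_endpoint: true)
octave_extra = sample_cosine(2, 1, 5, include_endpoint: true)
puts dot(tonic_extra, octave_extra)   # close to 1, the extra sample adds 1.0 * 1.0
```

The extra sample is the start of the next period, where both cosines are back at 1.0, so it adds a full 1.0 to a sum that would otherwise cancel out.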

Octaves are Orthogonal

p1 = create_scale_degree(1, 1, 5)
=> [1.0, 0.30901699437494745, -0.8090169943749473,
    -0.8090169943749475, 0.30901699437494723]
p2 = create_scale_degree(2, 1, 5)
=> [1.0, -0.8090169943749473, 0.30901699437494723,
    0.30901699437494773, -0.8090169943749477]
p1.is_orthogonal?(p2)
=> true
p1.is_orthogonal?([2,3,4,21,])
=> false

Above I am generating 1 cycle of a 1 Hz and a 2 Hz cosine wave at a sample rate of 5 samples per second. These results can sometimes be hard to see due to floating point error, but we can unwind the dot product method and show it working manually.

Manual Calculation showing orthogonality

1.0 * 1.0
=> 1.0
0.30901699437494745 * -0.8090169943749473
=> -0.25
-0.8090169943749473 * 0.30901699437494723
=> -0.2499999999999998
-0.8090169943749475 * 0.30901699437494773
=> -0.2500000000000003
0.30901699437494723 * -0.8090169943749477
=> -0.24999999999999992
1.0 - 0.25 - 0.2499999999999998 - 0.2500000000000003 - 0.24999999999999992
=> 2.7755575615628914e-17
#  Without floating point error: 1.0 - 0.25 - 0.25 - 0.25 - 0.25 == 0.0

The above floating point error is the reason for self.dot_product(other).abs < Epsilon in the method is_orthogonal? Epsilon is just set to some very small number to deal with floating point comparisons.
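The same idea in its simplest form, as a sketch with an assumed tolerance of `1e-9` (any suitably small number would do):

```ruby
tolerance = 1e-9   # assumed value, stands in for Epsilon

sum = 0.1 + 0.2

exactly_equal = (sum == 0.3)                  # false: the floats differ in the last bits
close_enough  = (sum - 0.3).abs < tolerance   # true: equal within the tolerance

puts exactly_equal
puts close_enough
```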

Are the other intervals orthogonal too?

Using all this, we should be able to answer the original question: are the other intervals in the chromatic scale orthogonal to the tonic as well? The answer I found was no, not after only one cycle of the cosine waves. But if you keep them running together for more cycles, there is eventually a point where the two waveforms match up in period, and the full waveforms up to that point are completely orthogonal to each other.
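Why the waveforms should eventually line up can be sketched with the product-to-sum identity. This is my own reasoning rather than a measurement, and it assumes the tonic is at 1 Hz, the other note is at $p/q$ Hz in lowest terms, and samples are taken at $t_k = k/S$ for a sample rate $S$:

$\cos(2\pi t)\,\cos\left(2\pi \tfrac{p}{q} t\right) = \tfrac{1}{2}\left[\cos\left(2\pi \tfrac{p-q}{q} t\right) + \cos\left(2\pi \tfrac{p+q}{q} t\right)\right]$

After $q$ seconds, which is $q$ cycles of the tonic, the two cosines on the right have completed exactly $p-q$ and $p+q$ periods. A cosine sampled evenly over a whole number of its periods sums to zero, as long as its frequency is not a multiple of $S$, so the dot product vanishes:

$\sum\limits_{k=0}^{qS-1} \cos(2\pi t_k)\,\cos\left(2\pi \tfrac{p}{q} t_k\right) = 0$

This only shows that $q$ cycles are enough. It does not by itself rule out an earlier match.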

Checking Each Interval for Orthogonality

MaxCycles = 200
Ratios.each_with_index do |ratio, i|
    next if i == 0   #  Not orthogonal to itself!
    1.upto(MaxCycles) do |num_cycles|
        tonic = create_scale_degree(Ratios.first[:ratio], num_cycles)
        note = create_scale_degree(ratio[:ratio], num_cycles)
        if tonic.is_orthogonal?(note)
            puts "#{ratio[:name]} #{ratio[:ratio].inspect} is orthogonal to Unison after #{num_cycles} cycles"
            break
        end
    end
end

Results

Below is a chart of the results sorted by how many cycles it takes for each interval to sync up and become orthogonal, which they all do after relatively few cycles.

Results

| Cycles   | Ratio    |  Decimal  | Interval Name    |
| -------- | -------- | --------- | ---------------- |
|  1       |  2:1     | 2.0       | Octave           |
|  2       |  3:2     | 1.5       | Perfect Fifth    |
|  3       |  4:3     | 1.3333333 | Perfect Fourth   |
|  3       |  5:3     | 1.6666667 | Major Sixth      |
|  4       |  5:4     | 1.25      | Major Third      |
|  5       |  6:5     | 1.2       | Minor Third      |
|  5       |  8:5     | 1.6       | Minor Sixth      |
|  5       |  9:5     | 1.8       | Minor Seventh    |
|  8       |  9:8     | 1.125     | Major Second     |
|  8       | 15:8     | 1.875     | Major Seventh    |
| 24       | 25:24    | 1.0417    | Minor Second     |
| 32       | 45:32    | 1.40625   | Diminished Fifth |

The first things I notice about the results are:

The number of cycles required is equal to the denominator
The decimal value of the interval’s ratio doesn’t follow the number of cycles
After the Octave, the Perfect Fifth is the quickest to sync up (Circle of fifths?)
The Perfect Fourth is next
When DJing you generally mix songs in keys which are perfect fourths and fifths
My two favourite intervals take the longest to sync
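The first observation can be checked directly for a few ratios. This is a self-contained sketch under a few assumptions: `sampled_cosine` and `first_orthogonal_cycle` are hypothetical names of mine, the sample rate of 32 matches the default used earlier, and the tolerance of `1e-9` is my own choice.

```ruby
SAMPLE_RATE = 32     # samples per second, matching the earlier default
TOLERANCE   = 1e-9   # assumed tolerance for floating point comparison

# Sample cos(2*pi*ratio*t) for `num_cycles` seconds, excluding the endpoint.
def sampled_cosine(ratio, num_cycles)
  Array.new(num_cycles * SAMPLE_RATE) do |k|
    Math.cos(2 * Math::PI * ratio * k / SAMPLE_RATE)
  end
end

# First cycle count at which the note is orthogonal to the 1 Hz tonic,
# or nil if none is found within max_cycles.
def first_orthogonal_cycle(ratio, max_cycles = 200)
  (1..max_cycles).find do |n|
    tonic = sampled_cosine(1, n)
    note  = sampled_cosine(ratio, n)
    tonic.zip(note).sum { |a, b| a * b }.abs < TOLERANCE
  end
end

[Rational(3, 2), Rational(4, 3), Rational(5, 4)].each do |ratio|
  cycles = first_orthogonal_cycle(ratio)
  puts "#{ratio}: #{cycles} cycles, denominator #{ratio.denominator}"
end
```

For these three ratios the cycle count and the denominator agree, which is consistent with the table.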

What I take away from this experiment is that the human brain may be basing its perception of consonance and dissonance on how long two frequencies or notes played together take to match up in period and become orthogonal. This probably gives those intervals a feeling of balance, but also of temporary dissonance which resolves after a short period of time. Two sounds played at once which never resolve to orthogonality are considered noisy or out of tune.