The purpose of this project is to encode an image to a sound that can be viewed with a spectrogram. For some time I have known that musical artists have encoded pictures into their music. Most notable of these is artists is Aphex Twin. Luckily I had a copy of Windolicker and a great visualization program Sonic Visualiser. After looking at the images I decided it would be cool to try and encode my own images. I saw a few programs available, but decided it would be a better challenge to write my own program from scratch using Perl.
Spectrograms
A spectrogram is a graph representing the intensity or a frequency with relation to time. Normally the frequencies are along the Y axis, with the time on the X axis. The intensity of the frequency is represented by the brightness of the color. The frequency and color can use either a linear scale or a logarithmic scale. Below is an spectrogram of a few piano chords. The audio file used can be found on Wikipedia here.
Image encoding
The idea I had to encode the image was to simply create a sine wave at a corresponding frequency to represent the Y axis, a corresponding time to represent the X axis and a corresponding amplitude to represent the pixel color intensity.
Creating Sound
The first step to encoding an image was to learn how audio formats work. At first I tried writing a script that plays a frequency to the ‘/dev/dsp’ (Which is the sound card on Linux). When writing straight to /dev/dsp you are limited by a sample rate of 8000hz and a sample size of 8bits. Below simple Perl script that plays a concert A 440hz. To execute run ‘./sin.pl > /dev/dsp’.
#!/usr/bin/perl
use Math::Trig;
use strict;
use POSIX;my $sample = 8000;
my $frequency = 440;
my $cycles = 6;
my $period = POSIX::floor($sample / $frequency * $cycles);while (1) {
for(my $i=1;$i<=$period;$i++)
{
my $x = 128 + sin($cycles * 2 * pi * $i / $period) * 128;
$x = POSIX::floor($x);
my $char = pack(“C“,$x);
print “$char color=”#ff00ff”>”;
}
}
The DSP defaults do not offer much fidelity I needed at least the fidelity of an audio CD, which is 16bits at 44.1khz. I did some of searching on CPAN to find a library that allowed me write wave files. Most of the audio libraries had a too much overhead for what I wanted to do. Instead I looked up the file format for a ‘.wav’ and coded my own library. This library is limited to only producing a 16bit 44.1khz mono wave.
#!/usr/bin/perl
#Author Evan Salazar
#——————————————–
#
#Generate a .wav file for 16 bit mono PCM
#
#——————————————-
use strict;
package SimpleWave;sub genWave {
#Get the reference to the data array
my ($audioData) = @_;
#This is the default sample rate
my $samplerate = 44100;
my $bits = 16;
my $samples = $#{$audioData} + 1;
my $channels = 1;
#Do Calculations for data wave headers
my $byterate = $samplerate * $channels * $bits / 8;
my $blockalign = $channels * $bits / 8;
my $filesize = $samples * ($bits/8) * $channels + 36;#RIFF Chunk;
my $riff = pack(‘a4Va4‘,‘RIFF‘,$filesize,‘WAVE‘);#Format Chunk
my $format = pack(‘a4VvvVVvv‘,
‘fmt ‘,
16,1,
$channels,
$samplerate,
$byterate,
$blockalign,
$bits);#Data Chunk
my $dataChunk = pack(‘a4V‘,‘data‘,$blockalign * $samples);#Read audoData array
my $data;
for(my $i=0;$i<$samples;$i++) {$data .= pack(‘v‘,$audioData->[$i]);
}#Return a byte string of the wave
return $riff . $format . $dataChunk. $data;
}
1;
Reading a Bitmap
Luckily I found a simple bitmap reader on CPAN called Image::BMP. This is a nice lightweight library that dose not depend on any external libraries or compiled code. Using this library I was able to easily load and read the bitmap data.
Encoding the Image
The first pass of my program disregarded the color data and only produced a frequency for the Y axis if the color intensity was less that half the sum of all colors. Below is an example. Note: I converted the WAV to an MP3 to conserve bandwidth, at 320kbps not much data is lost.
I was really shocked to fist see the image! The only tweaking I needed to do was to use a linear scale for the frequency. Also if I selected too high an amplitude for the sin wave, clipping occurred in areas with too much black. For image above I used an amplitude of about 1000 on a scale of 0 to 32768.
The next step was to add amplitude scaling to match the color intensity. For this I summed all the color channels for a given pixel and scaled it to represent the max amplitude ‘(R + G + B) / 768 * max_amplitude’. Below is a picture of me after using the scaling.
By selecting a color scheme that goes from black to white and using a linear scale for the volume I get a very good black and white image. To prevent clipping on very dark images I added an inverse option that will invert the color producing a negative image.
You can reverse the color scheme to go from white to black to produce the regular image
Full Program
Below you can view and/or download the full code to this program. Currently performance is not optimized. So don’t write me telling me its slow. I currently have a few idea to speed it up. Also for best results use a small image around 100px tall.