The purpose of this project is to encode an image into a sound that can be viewed with a spectrogram. For some time I have known that musical artists have encoded pictures into their music. The most notable of these artists is Aphex Twin. Luckily I had a copy of Windowlicker and a great visualization program, Sonic Visualiser. After looking at the images I decided it would be cool to try to encode my own images. I saw a few programs available, but decided it would be a better challenge to write my own program from scratch using Perl.
Spectrograms
A spectrogram is a graph representing the intensity of a frequency with relation to time. Normally the frequencies are along the Y axis, with the time on the X axis. The intensity of the frequency is represented by the brightness of the color. The frequency and color can use either a linear scale or a logarithmic scale. Below is a spectrogram of a few piano chords. The audio file used can be found on Wikipedia here.
Image encoding
My idea for encoding the image was simple: for each pixel, create a sine wave whose frequency represents the position on the Y axis, whose time represents the position on the X axis, and whose amplitude represents the pixel color intensity.
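The mapping described above can be sketched as a small Perl function. This is only an illustration: the frequency range, the time per column, the peak amplitude and the name `pixel_to_tone` are all assumptions, not values taken from the original program. The top image row is given the highest frequency so the picture appears upright in a spectrogram.

```perl
use strict;
use warnings;

# Hypothetical parameters, chosen only for illustration.
my $min_freq      = 200;     # Hz assigned to the bottom image row
my $max_freq      = 20000;   # Hz assigned to the top image row
my $column_time   = 0.05;    # seconds of audio per image column
my $max_amplitude = 1000;    # amplitude of a fully bright pixel

# Map a pixel at column $x, row $y (row 0 is the top of the image)
# with $intensity in the range 0..1 to (frequency, start time, amplitude).
sub pixel_to_tone {
    my ($x, $y, $intensity, $height) = @_;

    # Linear frequency scale: the bottom row gets $min_freq, the top row $max_freq
    my $row_from_bottom = $height - 1 - $y;
    my $frequency = $height > 1
        ? $min_freq + ($max_freq - $min_freq) * $row_from_bottom / ($height - 1)
        : $min_freq;

    my $start_time = $x * $column_time;             # X axis becomes time
    my $amplitude  = $intensity * $max_amplitude;   # brightness becomes volume

    return ($frequency, $start_time, $amplitude);
}

my ($freq, $time, $amp) = pixel_to_tone(10, 0, 0.5, 100);
printf "%.1f Hz starting at %.2f s with amplitude %.0f\n", $freq, $time, $amp;
```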
Creating Sound
The first step to encoding an image was to learn how audio formats work. At first I tried writing a script that plays a frequency to ‘/dev/dsp’ (which is the sound card device on Linux). When writing straight to /dev/dsp you are limited to its defaults: a sample rate of 8000 Hz and a sample size of 8 bits. Below is a simple Perl script that plays a concert A at 440 Hz. To execute it, run ‘./sin.pl > /dev/dsp’.
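#!/usr/bin/perl
use strict;
use Math::Trig qw(pi);   # provides the pi constant
use POSIX qw(floor);

my $sample    = 8000;    # /dev/dsp default sample rate in Hz
my $frequency = 440;     # concert A
my $cycles    = 6;       # whole sine cycles per pass of the loop
# Number of samples needed for $cycles cycles, rounded down
my $period = POSIX::floor($sample / $frequency * $cycles);

while (1) {
    for (my $i = 1; $i <= $period; $i++) {
        # Centre on 128 and scale by 127 so the value always fits in one unsigned byte
        my $x = 128 + sin($cycles * 2 * pi * $i / $period) * 127;
        $x = POSIX::floor($x);
        my $char = pack("C", $x);
        print $char;
    }
}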
The DSP defaults do not offer much fidelity. I needed at least the fidelity of an audio CD, which is 16 bits at 44.1 kHz. I did some searching on CPAN to find a library that would let me write wave files. Most of the audio libraries had too much overhead for what I wanted to do. Instead I looked up the file format for a ‘.wav’ and coded my own library. This library is limited to producing a 16-bit 44.1 kHz mono wave.
#!/usr/bin/perl
# Author Evan Salazar
#--------------------------------------------
#
# Generate a .wav file for 16 bit mono PCM
#
#--------------------------------------------
use strict;
package SimpleWave;

sub genWave {
    # Get the reference to the data array
    my ($audioData) = @_;

    # This is the default sample rate
    my $samplerate = 44100;
    my $bits       = 16;
    my $samples    = $#{$audioData} + 1;
    my $channels   = 1;

    # Do calculations for data wave headers
    my $byterate   = $samplerate * $channels * $bits / 8;
    my $blockalign = $channels * $bits / 8;
    my $filesize   = $samples * ($bits / 8) * $channels + 36;

    # RIFF chunk
    my $riff = pack('a4Va4', 'RIFF', $filesize, 'WAVE');

    # Format chunk
    my $format = pack('a4VvvVVvv',
        'fmt ',
        16, 1,
        $channels,
        $samplerate,
        $byterate,
        $blockalign,
        $bits);

    # Data chunk
    my $dataChunk = pack('a4V', 'data', $blockalign * $samples);

    # Read audioData array
    my $data;
    for (my $i = 0; $i < $samples; $i++) {
        $data .= pack('v', $audioData->[$i]);
    }

    # Return a byte string of the wave
    return $riff . $format . $dataChunk . $data;
}
1;
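As a rough usage sketch, the library above could be driven like this. It assumes the module is saved as SimpleWave.pm in the current directory. The helper `tone_samples` and the output name tone.wav are made up for this example, and the file is only written if the module can actually be loaded.

```perl
use strict;
use warnings;
use lib '.';   # assumption: SimpleWave.pm sits in the current directory

# Build an array reference of signed 16-bit samples for a pure sine tone.
# tone_samples is a hypothetical helper, not part of SimpleWave.
sub tone_samples {
    my ($freq, $secs, $amp, $rate) = @_;
    my $two_pi = 8 * atan2(1, 1);
    my $count  = int($secs * $rate);
    return [ map { int($amp * sin($two_pi * $freq * $_ / $rate)) } 0 .. $count - 1 ];
}

# One second of concert A at 44.1 kHz, the only rate genWave supports
my $samples = tone_samples(440, 1, 16000, 44100);

# Write the wave file only when the module is available
if (eval { require SimpleWave; 1 }) {
    open(my $out, '>', 'tone.wav') or die "Cannot open tone.wav: $!";
    binmode($out);   # the wave data is binary, so avoid newline translation
    print {$out} SimpleWave::genWave($samples);
    close($out);
}
```

The resulting file should open in Sonic Visualiser as a single horizontal line at 440 Hz.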
Reading a Bitmap
Luckily I found a simple bitmap reader on CPAN called Image::BMP. This is a nice lightweight library that does not depend on any external libraries or compiled code. Using this library I was able to easily load and read the bitmap data.
Encoding the Image
The first pass of my program disregarded the color data and only produced a frequency for the Y axis if the pixel's color intensity was less than half of the maximum sum of all color channels. Below is an example. Note: I converted the WAV to an MP3 to conserve bandwidth; at 320 kbps not much data is lost.
I was really shocked to first see the image! The only tweaking I needed to do was to use a linear scale for the frequency. Also, if I selected too high an amplitude for the sine wave, clipping occurred in areas with too much black. For the image above I used an amplitude of about 1000 on a scale of 0 to 32768.
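The clipping is easier to see in code. The sketch below sums one sine wave per active row for a single image column and clamps the result to the signed 16-bit range. It is an assumption about how such a mixer could look, not the original program: `column_samples` is a made-up name, and phase continuity between columns is ignored. In the worst case the peak of N summed waves is N times the per-wave amplitude, which is why a small amplitude per wave is needed.

```perl
use strict;
use warnings;

# Mix one sine wave per frequency in @$freqs, each with amplitude $amp,
# into $count samples at $rate samples per second.
sub column_samples {
    my ($freqs, $amp, $count, $rate) = @_;
    my $two_pi = 8 * atan2(1, 1);
    my @out;
    for my $n (0 .. $count - 1) {
        my $sum = 0;
        $sum += $amp * sin($two_pi * $_ * $n / $rate) for @$freqs;

        # Clamp to the signed 16-bit range; reaching these limits is the clipping
        $sum =  32767 if $sum >  32767;
        $sum = -32768 if $sum < -32768;
        push @out, int($sum);
    }
    return \@out;
}

# Three active rows at an amplitude of 1000 each stay far below the limit
my $column = column_samples([1000, 2000, 3000], 1000, 2205, 44100);
my $peak = 0;
for my $s (@$column) { $peak = abs($s) if abs($s) > $peak }
print "peak of 3 summed waves: $peak\n";
```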
The next step was to add amplitude scaling to match the color intensity. For this I summed all the color channels for a given pixel and scaled it to represent the max amplitude ‘(R + G + B) / 768 * max_amplitude’. Below is a picture of me after using the scaling.
Audio File: evan.mp3
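The scaling formula ‘(R + G + B) / 768 * max_amplitude’ translates directly into Perl. The function name `pixel_amplitude` is made up for this sketch, and the channels are assumed to be 8-bit values from 0 to 255.

```perl
use strict;
use warnings;

# Scale the summed color channels of one pixel to a sine wave amplitude.
# Each channel is assumed to be in the range 0..255.
sub pixel_amplitude {
    my ($red, $green, $blue, $max_amplitude) = @_;
    return ($red + $green + $blue) / 768 * $max_amplitude;
}

printf "white: %.2f\n", pixel_amplitude(255, 255, 255, 1000);
printf "grey:  %.2f\n", pixel_amplitude(128, 128, 128, 1000);
printf "black: %.2f\n", pixel_amplitude(0, 0, 0, 1000);
```

Because the channel sum tops out at 765 while the divisor is 768, a pure white pixel lands just under max_amplitude.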
By selecting a color scheme that goes from black to white and using a linear scale for the volume I get a very good black and white image. To prevent clipping on very dark images I added an inverse option that will invert the color producing a negative image.
You can reverse the color scheme to go from white to black to produce the regular image.
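One way an inverse option could work is to flip the channel sum before scaling it. This is a guess at the approach rather than the original implementation. The name `scaled_amplitude` and the `$invert` flag are made up for this sketch, and 8-bit channels are assumed.

```perl
use strict;
use warnings;

# Scale a pixel to an amplitude, optionally inverting it first so that
# the encoded picture becomes a negative of the source image.
sub scaled_amplitude {
    my ($red, $green, $blue, $max_amplitude, $invert) = @_;
    my $sum = $red + $green + $blue;
    $sum = 765 - $sum if $invert;   # 765 = 3 * 255, the largest possible sum
    return $sum / 768 * $max_amplitude;
}

printf "normal black:   %.2f\n", scaled_amplitude(0, 0, 0, 1000, 0);
printf "inverted black: %.2f\n", scaled_amplitude(0, 0, 0, 1000, 1);
```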
Full Program
Below you can view and/or download the full code to this program. Performance is currently not optimized, so don’t write to tell me it’s slow. I have a few ideas to speed it up. Also, for best results use a small image around 100px tall.