I'm trying to build a simple audio player for playing music from my iTunes library, a bit like the systems you see in a bar: very user-friendly, with simple cross-fading and a simple search function to add songs to the running playlist. I want to keep all management of music and playlists, importing of music files, and editing of properties, genre etc. in iTunes, and have a clean and simple player.
Why, you might ask? Partly because I don't like iTunes for playing music anymore (way too bloated, and it's hard to find music and manage a running playlist) and partly because I like to program and need a new project.
I have built a working prototype that plays mp3 and AAC files, but I still have one major problem to take care of before I can proceed with my project.
My iTunes library currently contains about 15,000 songs and has taken me 14 years to collect. As a result, the volume varies widely across all these music files. iTunes corrects this by changing the volume to a preset value while playing a file. This value is not stored in the iTunes library, so I presume it is stored in the ID3 tag? My problem is that without this correction, my music files are all over the place regarding volume. One song plays soft, the next is loud, etc.
I looked at different ways to overcome this problem, but I think the easiest way is to read the ID3 tag and see if there is volume info in there. I don't think it will be easy to read the music data and establish a volume level in my own program.
Hmm. I tried Sox and it seems to work, even getting the output into my program, but Sox comes without mp3 support (because of patents, it seems). You need to compile your own Sox with mp3 support, and I have no idea how to do that…
Once you have the wav file I would recommend building an amplitude histogram of the entire wav samples.
Find your mean and +/- standard deviations and use those values to control your volume before you play the wav.
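To make that suggestion a bit more concrete, here is a minimal Python sketch, assuming the song has already been converted to a 16-bit PCM wav file. The function names `wav_level_stats` and `amplitude_histogram` are made up for this example, and the number of histogram bins is an arbitrary choice. It computes the mean, standard deviation and peak of the absolute sample amplitudes, plus a coarse histogram:

```python
import math
import struct
import wave


def wav_level_stats(path):
    """Return (mean, std_dev, peak) of absolute amplitudes, scaled to 0..1.

    Assumption: the file is 16-bit little-endian PCM. All channels are
    pooled together, which is good enough for a rough loudness estimate.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())

    count = len(frames) // 2
    if count == 0:
        return 0.0, 0.0, 0.0

    samples = struct.unpack("<%dh" % count, frames[: count * 2])
    levels = [abs(s) / 32768.0 for s in samples]

    mean = sum(levels) / count
    variance = sum((x - mean) ** 2 for x in levels) / count
    return mean, math.sqrt(variance), max(levels)


def amplitude_histogram(levels, bins=10):
    """Count how many amplitude values (0..1) fall into each of `bins` buckets."""
    counts = [0] * bins
    for level in levels:
        index = min(int(level * bins), bins - 1)
        counts[index] += 1
    return counts
```

A quieter song should give a lower mean than a loud one, so the ratio between a target mean and the measured mean could serve as a rough volume correction. This is only one possible reading of the suggestion; reading a whole song into memory like this is also slow for long files.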
This filter applies a certain amount of gain to the input audio in order to bring its peak magnitude to a target level (e.g. 0 dBFS). However, in contrast to more “simple” normalization algorithms, the Dynamic Audio Normalizer dynamically re-adjusts the gain factor to the input audio. This allows for applying extra gain to the “quiet” sections of the audio while avoiding distortions or clipping the “loud” sections. In other words: The Dynamic Audio Normalizer will “even out” the volume of quiet and loud sections, in the sense that the volume of each section is brought to the same target level. Note, however, that the Dynamic Audio Normalizer achieves this goal without applying “dynamic range compressing”. It will retain 100% of the dynamic range within each section of the audio file.
The ffmpeg program only works after a lengthy analysis of the sound file and I think that will mess up the flow of my program. Dynamic Audio Normalizing was actually my “wet dream”. I have an old Windows DJ program called OtsJuke or OtsDJ that has a live audio compression and normalization system and that is really cool for the problem I have here, but creating this in my own program will be impossible, because you have to analyse and change a live audio stream.
My idea was to do the analysis “on the fly” just before a song plays. My program runs with two Movieplayer controls that are the two decks that play music. When player 1 starts playing, the next song on the playlist can be analyzed and after that can be loaded up in player 2. Volume of player 2 will be adjusted according to the results, player 2 is ready to play/fade the song in when needed.
At the moment I have Sox working… sort of. Sox works fine in my Terminal window, does everything I need it to, even with mp3 files (no m4a files, but that’s not that important). However, it will not work in my Shell from Xojo, since the Sox command is not recognized there resulting in an error. Very strange…
I compiled and installed Sox with Brew, and the strange thing is that in Terminal it works like a charm. In Shell, however, there is an error saying that Sox is an unknown command (so Shell uses different path settings than Terminal does, and the Xojo manual confirms that). I don't know where the Sox command is installed, so I can't add the path to it when I call the program from within Shell.
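For what it's worth, typing `which sox` in Terminal prints the full path of the command there. The same lookup can be done from code by searching a few extra directories on top of whatever PATH the program was started with. Below is a small Python sketch of that idea; the directory list is an assumption (common Homebrew and MacPorts locations, not confirmed for this machine) and `find_tool` is a made-up helper name:

```python
import os
import shutil

# Assumption: typical install locations for Homebrew and MacPorts binaries.
EXTRA_DIRS = ["/usr/local/bin", "/opt/homebrew/bin", "/opt/local/bin"]


def find_tool(name, extra_dirs=EXTRA_DIRS):
    """Return the full path of an executable, or None if it is not found.

    Searches the current PATH first, then the extra directories.
    """
    current = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    search_path = os.pathsep.join(current + list(extra_dirs))
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Prints the location of sox, or None when it is not installed.
    print(find_tool("sox"))
```

Once the full path is known, it can be used in place of the bare command name when calling the tool from a shell that has a minimal PATH.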
Frustrating being so close and not being able to get it working…
Yess! Found Sox and with it the way to use it in my program. Sox is now returning the volume settings for each mp3 file and I can compensate accordingly. Sounds nice at the moment!!
Part of the reason I suggested ffmpeg was to convert the mp3 into something “legible” by software, so that you might be able to stream as you process.
Maybe ffmpeg has some real-time streaming you can take advantage of.
Anyway, sounds like you've got lots of options!
Don’t give up on your goal.
After a struggle with Sox I revisited your suggestion: ffmpeg. This package also comes with “ffprobe” and guess what? This program can read the iTunNORM tag off of an mp3 or AAC file and that was exactly what I was looking for!! This tag holds iTunes’ magical volume normalization info and that was what I needed.
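For anyone who wants to try the same thing, here is a rough Python sketch of how the tag could be pulled out of ffprobe's output. It assumes ffprobe is on the PATH and that the tag shows up under the container-level tags with a key that matches "iTunNORM" when case is ignored; depending on the file type and ffprobe version the key may be reported differently, so treat that as an assumption. The function names are made up for this example:

```python
import json
import subprocess


def ffprobe_json(path):
    """Run ffprobe on a file and return its container info as JSON text."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_itunnorm(ffprobe_output):
    """Return the iTunNORM tag value from ffprobe JSON output, or None."""
    info = json.loads(ffprobe_output)
    tags = info.get("format", {}).get("tags", {})
    for key, value in tags.items():
        # Assumption: the key is "iTunNORM" in some capitalisation.
        if key.lower() == "itunnorm":
            return value.strip()
    return None
```

Typical use would be `find_itunnorm(ffprobe_json("song.mp3"))`, which gives back the raw string of hexadecimal fields, or `None` when the file has no such tag.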
This discovery immediately dropped me into another deep hole: interpreting the iTunNORM tag, since it holds 10 fields, each an 8-character hexadecimal string. The first two fields hold the values that are connected to the normalized volume in iTunes. After some hours of Googling I found an obscure forum topic that explains how to convert these hexadecimal values into the much-needed volume info. Now I can calculate the volume in dB.
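The exact formula is not given in this thread. One interpretation that circulates on audio forums is that the first two fields are the left and right channel adjustments relative to a reference of 1000, so that the gain in dB is -10 * log10(value / 1000), taking the larger of the two values. The Python sketch below follows that interpretation; it is an assumption, not documented Apple behaviour, and `itunnorm_gain_db` is a made-up name:

```python
import math


def itunnorm_gain_db(tag):
    """Estimate the playback gain in dB from an iTunNORM tag string.

    Assumption (unofficial): fields 1 and 2 are per-channel values where
    1000 means "no change", and gain = -10 * log10(value / 1000).
    """
    fields = tag.split()
    if len(fields) < 2:
        raise ValueError("expected at least two hexadecimal fields")

    left = int(fields[0], 16)
    right = int(fields[1], 16)
    loudest = max(left, right)
    if loudest == 0:
        return 0.0  # no usable data, leave the volume alone

    return -10.0 * math.log10(loudest / 1000.0)
```

With this reading, a value of `000003E8` (1000) means no correction, larger values mean the song is loud and gets turned down, and smaller values mean it gets turned up.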
Next step is to make a connection between this dB value and the volume setting of my Movieplayer control.
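A gain in dB converts to a linear amplitude factor with 10^(dB/20). The Python sketch below applies that to a player volume, assuming the volume setting is a linear multiplier between 0 and 1; whether the Movieplayer control really behaves that way is an assumption and should be checked by ear. `db_to_volume` and its parameters are made-up names:

```python
def db_to_volume(gain_db, base_volume=1.0, max_volume=1.0):
    """Convert a dB correction into a clamped player volume.

    Assumption: the player's volume is a linear amplitude factor
    in the range 0..max_volume.
    """
    factor = 10 ** (gain_db / 20.0)
    return max(0.0, min(max_volume, base_volume * factor))
```

Because the result is clamped, a positive correction has no effect when the base volume is already at the maximum. Starting from a lower base volume, for example 0.5, leaves headroom to turn quiet songs up as well as loud songs down.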