The purpose of this kernel is not only to provide a step-by-step guide on how to convert a given audio clip to a spectrogram, which is useful for many other kinds of audio analysis, but also to explain what each step in audio loading and visualization is doing.

Some links are provided in the reference section at the end of the kernel.

** More information to be added

Step-1: Let's import all the required libraries

import os
import matplotlib.pyplot as plt

#for loading and visualizing audio files
import librosa
import librosa.display

#to play audio
import IPython.display as ipd

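#folder that contains the audio clips
audio_fpath = "../input/audio/audio/"
#os.listdir returns every entry in the folder, so the count below assumes they are all .wav files
audio_clips = os.listdir(audio_fpath)
print("No. of .wav files in audio folder = ",len(audio_clips))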
No. of .wav files in audio folder = 2002
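
The count above includes every entry in the folder, whatever its extension. If the folder might also hold other files, a minimal sketch along the following lines counts only the .wav clips. The helper name `list_wav_files` is hypothetical and not part of the original kernel.

```python
from pathlib import Path

def list_wav_files(folder):
    """Return the sorted names of the .wav files directly inside `folder`."""
    # Compare suffixes case-insensitively so that "clip.WAV" is also included.
    # Sub-folders are skipped, even if their name ends in ".wav".
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```

With this helper, `len(list_wav_files(audio_fpath))` would give the number of .wav clips only.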

Some information about audio data before we start with audio data processing

What are the x and y axes in an audio wave representation?

Sound wave image

  • The y-axis represents sound pressure (the amplitude of the wave), and the x-axis represents time.
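
To make the two axes concrete, here is a minimal sketch that builds a short synthetic tone with NumPy. The sampling rate, clip length, and tone frequency are assumed values chosen for illustration, not taken from the dataset.

```python
import numpy as np

sr = 8000          # assumed sampling rate: samples per second (Hz)
n_samples = 80     # assumed clip length: 80 samples = 10 ms at 8000 Hz
freq = 440.0       # assumed tone frequency in Hz

# x-axis: the time (in seconds) at which each sample was taken
t = np.arange(n_samples) / sr

# y-axis: the amplitude of the wave at each of those times
y = 0.5 * np.sin(2 * np.pi * freq * t)

print(t[:3])  # first three time stamps
print(y[:3])  # first three amplitude values
```

Plotting `y` against `t` with `plt.plot(t, y)` would draw the kind of waveform shown in the image above: time runs along the x-axis and amplitude along the y-axis.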

Standard waveforms

Sine waveform

Sine wave image

Square waveform

Square waveform image

Rectangular waveform

Rectangular waveform image

Triangular waveform

Triangular waveform image

Sawtooth waveform

Sawtooth waveform image
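
The standard waveforms listed above can be generated with a few lines of NumPy. This is a minimal sketch: the function name `standard_waveforms` and its default parameters are hypothetical, and the rectangular wave is modelled here as a pulse that stays high for a fraction `duty` of each cycle (a square wave is the special case `duty = 0.5`).

```python
import numpy as np

def standard_waveforms(freq=5.0, sr=1000, duration=1.0, duty=0.25):
    """Return (t, waves), where waves maps a waveform name to its samples in [-1, 1]."""
    # Time axis in seconds, one entry per sample
    t = np.arange(int(sr * duration)) / sr
    # Position inside the current cycle, from 0 (start) up to, but not including, 1
    phase = (t * freq) % 1.0
    waves = {
        "sine": np.sin(2 * np.pi * freq * t),
        # High for the first half of each cycle, low for the second half
        "square": np.where(phase < 0.5, 1.0, -1.0),
        # High for the first `duty` fraction of each cycle, low for the rest
        "rectangular": np.where(phase < duty, 1.0, -1.0),
        # Rises linearly from -1 to 1, then falls back to -1 within each cycle
        "triangular": 1.0 - 4.0 * np.abs(phase - 0.5),
        # Rises linearly from -1 towards 1, then jumps back to -1
        "sawtooth": 2.0 * phase - 1.0,
    }
    return t, waves

t, waves = standard_waveforms()
```

Each entry can be drawn with, for example, `plt.plot(t, waves["sawtooth"])` to reproduce the shapes shown in the images above.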

** More info will be added here

Step-2: Load audio file and visualize its waveform (using librosa)