Easy way to implement audio processing with Python.

Handling audio files.


Python owes much of its popularity to its approachable syntax and its large ecosystem of libraries, and Pydub, described here, is one of them. Pydub is a simple, high-level audio library that relies on ffmpeg (or libav) for reading and writing non-WAV formats, with an API influenced by jQuery.

Audio processing with Python.



Pydub is used for processing audio: adding effects, writing ID3 tags, slicing, and concatenating audio tracks. It has supported Python since version 2.6 and is widely used with Python 3.

In other words, Pydub is a simple, well-designed Python module for audio manipulation. As its creators put it: “Pydub lets you do stuff to audio in a way that isn’t stupid.”


Basic audio file manipulations:


The following script opens an audio file, reverses it and exports the result, raises the volume of the reversed copy by 15 dB (+15), slices the first two seconds of the original (original[0:2000], with indices in milliseconds), and builds a merged file from the original played twice, one second of silence, and the reversed copy.




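from pydub import AudioSegment

# Load the source file (WAV files are read without ffmpeg).
original = AudioSegment.from_wav('original.wav')
print(type(original))
print(original)

# Reverse the audio and save it. export() defaults to MP3, so name the format.
# The variable is not called `reversed` to avoid shadowing the built-in.
reversed_audio = original.reverse()
reversed_audio.export('reversed.wav', format='wav')
reversed_audio = reversed_audio + 15  # raise the volume by 15 dB

# print(dir(original))

# Slices are indexed in milliseconds: keep the first two seconds.
first_two = original[0:2000]
first_two.export('first_two.wav', format='wav')

print(len(original))  # duration in milliseconds

# Original twice, one second of silence, then the louder reversed copy.
merged = original * 2 + AudioSegment.silent(1000) + reversed_audio
merged.export('merged_audio.wav', format='wav')
print('Merged file is created !!!')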

OUT: Merged file is created !!!
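
The script above assumes that original.wav already exists. For trying it out without a recording at hand, a test tone can be generated with Python's standard wave module. This is only a sketch: the function name write_test_tone and its default values (3 seconds, 440 Hz, 44100 Hz sample rate) are choices made for this illustration, not part of Pydub.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=3.0, freq_hz=440.0, rate=44100):
    """Write a mono 16-bit sine tone to `path` and return the frame count."""
    n_frames = int(seconds * rate)
    amplitude = 0.1 * 32767  # quiet on purpose: leaves headroom for a +15 dB boost
    frames = bytearray()
    for i in range(n_frames):
        sample = int(amplitude * math.sin(2 * math.pi * freq_hz * i / rate))
        frames += struct.pack('<h', sample)  # little-endian signed 16-bit
    with wave.open(path, 'wb') as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(rate)
        wav_file.writeframes(bytes(frames))
    return n_frames


# Example: write_test_tone('original.wav')
```

Calling write_test_tone('original.wav') once produces a three-second file, long enough for the two-second slice taken in the script.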



Mixing and overlaying audio files:



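from pydub import AudioSegment

beat = AudioSegment.from_wav('beat.wav')
sax = AudioSegment.from_wav('sax.wav')

# Durations in milliseconds.
print(len(beat), len(sax))

# Repeat the beat twice. export() defaults to MP3, so name the format.
beat2 = beat * 2
beat2.export('beat2.wav', format='wav')

# Play the sax on top of the doubled beat; the result is as long as beat2.
mixed = beat2.overlay(sax)
mixed.export('mixed.wav', format='wav')

# "+" concatenates segments one after another.
final = beat2 + mixed * 2 + sax + beat2 + sax
final.export('finalmix.wav', format='wav')
print('Final mix is created !!!')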

OUT: Final mix is created !!!
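
To make the difference between "+" (one segment after another) and overlay (two segments at the same time) more concrete, here is a rough sketch of what overlaying means at the sample level. It works on plain lists of 16-bit sample values and is not Pydub's implementation; the function name overlay_samples and the clamping to the 16-bit range are assumptions made for this illustration. It mirrors one documented property of overlay: the result keeps the length of the segment being overlaid onto.

```python
def overlay_samples(base, other, position=0):
    """Mix `other` onto `base` starting at sample index `position`.

    Both inputs are lists of signed 16-bit sample values. The result has the
    same length as `base`; any part of `other` past the end is dropped.
    """
    low, high = -32768, 32767
    mixed = list(base)
    for offset, sample in enumerate(other):
        index = position + offset
        if index >= len(mixed):
            break
        # Sum the two signals and clamp the result to the 16-bit range.
        mixed[index] = max(low, min(high, mixed[index] + sample))
    return mixed


print(overlay_samples([100, 200, 300, 400], [10, 20]))
```

Because the samples are summed, two loud tracks can exceed the available range. Lowering the volume of one track before overlaying it (for example sax - 6 in Pydub) is a common way to avoid that.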


Applying a low-pass filter and panning (left/right channel) effects:



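from pydub import AudioSegment

beat = AudioSegment.from_wav('mybeat.wav')

# Attenuate frequencies above roughly 2000 Hz.
# export() defaults to MP3, so name the format.
beat_low = beat.low_pass_filter(2000)
beat_low.export('beat_low.wav', format='wav')

# pan(-1) sends the sound fully to the left channel, pan(1) fully to the right.
beat_left = beat_low.pan(-1)
beat_right = beat_low.pan(1)

# Play the left-only, right-only and centered versions one after another.
beat_final = beat_left + beat_right + beat_low
beat_final.export('beat_final.wav', format='wav')

print('Final mix is created !!!')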

OUT: Final mix is created !!!
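
For readers curious about what a low-pass filter does to the samples, the sketch below shows one simple variant, a first-order (RC) filter, written in plain Python. It is meant to illustrate the idea behind low_pass_filter(2000), not to reproduce Pydub's code exactly; the function name low_pass and the choice to start from a value of 0 are assumptions made for this example.

```python
import math


def low_pass(samples, cutoff_hz, frame_rate):
    """Apply a first-order (RC) low-pass filter to a list of samples."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / frame_rate
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    filtered = []
    previous = 0.0
    for sample in samples:
        # Move a fraction `alpha` of the way toward the new sample, so fast
        # changes (high frequencies) are smoothed out more than slow ones.
        previous = previous + alpha * (sample - previous)
        filtered.append(previous)
    return filtered
```

A signal that flips sign on every sample (the highest frequency the sample rate can carry) comes out much quieter, while a constant signal passes through almost unchanged once the filter has settled.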


Speech recognition script:



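from speech_recognition import AudioFile, Recognizer, RequestError, UnknownValueError

recognizer = Recognizer()

# Read the whole WAV file into memory.
with AudioFile('bratus_net_demo.wav') as audio_file:
    audio = recognizer.record(audio_file)

# recognize_google() sends the audio to a Google web service, so it needs network access.
try:
    text = recognizer.recognize_google(audio)
    print(text)
except UnknownValueError:
    print('Speech was not understood')
except RequestError as error:
    print(f'Recognition request failed: {error}')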

OUT: bratus net is a best jokes puns and memes site ever


Mood detection from an audio file:



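from speech_recognition import Recognizer, AudioFile
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

recognizer = Recognizer()

# Transcribe the recording first; the sentiment is scored on the text.
with AudioFile('chile.wav') as audio_file:
    audio = recognizer.record(audio_file)

text = recognizer.recognize_google(audio)

# The VADER analyzer needs its lexicon data.
nltk.download('vader_lexicon')

analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores(text)
# print(scores)

# 'compound' ranges from -1 (most negative) to +1 (most positive).
# Note: a neutral score of exactly 0 falls into the Negative branch here.
if scores['compound'] > 0:
    print('Mood detected as Positive')
else:
    print('Mood detected as Negative')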


OUT: Mood detected as Positive
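
The script above splits the result into only two moods, so a perfectly neutral sentence is reported as Negative. A small helper with a neutral band in the middle gives a more forgiving result. The sketch below is one possible version: the function name label_mood is made up for this example, and the threshold of 0.05 follows the convention commonly recommended for VADER compound scores, so treat it as a tunable starting point.

```python
def label_mood(compound, threshold=0.05):
    """Map a VADER compound score (-1 to +1) to a three-way mood label."""
    if compound >= threshold:
        return 'Positive'
    if compound <= -threshold:
        return 'Negative'
    return 'Neutral'


for score in (0.6, 0.0, -0.3):
    print(score, label_mood(score))
```

In the mood detection script, the if/else block could then be replaced with print('Mood detected as', label_mood(scores['compound'])).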




