#1 Solution to SOUSEN Blitz5

The trick to this competition was to identify negative reviews with high confidence. My approach had 2 easy steps:

  1. Convert speech to text.
  2. Classify text based on sentiment.

Neither step involved any training, only inference using pretrained models from the internet.

  1. Convert speech to text.
    For this I used a free model found on Torch Hub (Silero STT). It wasn’t 100% accurate, but it did OK.

    import pandas as pd
    import torchaudio
    import torch
    from glob import glob

    device = torch.device('cpu')  # gpu also works
    model, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                           model='silero_stt',
                                           language='en', # also available 'de', 'es'
                                           device=device)
    (read_batch, split_into_batches,
     read_audio, prepare_model_input) = utils  # see function signature for details

    stage = 'test'
    wav_dir = 'INSERT_YOUR_PATH_TO_WAV_FOLDER'  # replace with your own path
    files = sorted(glob(f'{wav_dir}/*.wav'))
    bsize = 10
    batches = split_into_batches(files, batch_size=bsize)
    # wav id = file name without directory or extension
    ids = [f.split('/')[-1].split('.')[0] for f in files]

    res = []
    for i, batch in enumerate(batches):
        minput = prepare_model_input(read_batch(batch), device=device)
        output = model(minput)
        for j, example in enumerate(output):
            # i * bsize + j is the position of this example in `files`
            res.append([ids[i * bsize + j], decoder(example.cpu())])

    df = pd.DataFrame(res)
    df.columns = ['wav_id', 'text']
    df.to_csv(f'text_{stage}.csv', index=False)
    
  2. Compute sentiment.
    For this task I used the transformers library, whose sentiment-analysis pipeline returns a “POSITIVE”/“NEGATIVE” label for a phrase together with a confidence score for its prediction.

    import math
    from transformers import pipeline

    nlp = pipeline('sentiment-analysis')  # default pretrained sentiment model
    bulk = 50  # number of texts per inference call

    res = []
    for i in range(math.ceil(len(df) / bulk)):
        r = nlp(list(df.iloc[bulk * i: bulk * (i + 1)]['text'].values))
        res.extend(r)

    # one row per text, in the same order as df
    rdf = pd.DataFrame(res).rename(columns={'label': 'sentiment'})
    d = pd.concat([df, rdf], axis=1)
    d = d.sort_values('wav_id').reset_index(drop=True)
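
Both steps process items in consecutive, order-preserving chunks, so results can be matched back to their inputs by position (`ids[i * bsize + j]` in step 1, the column-wise `pd.concat` in step 2). Here is a minimal sketch of that assumption in plain Python. `split_into_batches_sketch` is a hypothetical stand-in, not the real Silero utility, and the file names are made up:

```python
def split_into_batches_sketch(items, batch_size):
    """Hypothetical stand-in: consecutive, order-preserving slices."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def wav_id(path):
    """File name without directory or extension, as in the post."""
    return path.split('/')[-1].split('.')[0]


# Made-up file names, for illustration only.
files = sorted(f'/data/wav/{n:03d}.wav' for n in range(23))
ids = [wav_id(f) for f in files]
bsize = 10

pairs = []
for i, batch in enumerate(split_into_batches_sketch(files, bsize)):
    for j, path in enumerate(batch):
        # The flat index i * bsize + j recovers the position in `files`.
        pairs.append((ids[i * bsize + j], path))

# Every id is paired with its own file, including the short last batch.
assert all(wav_id(path) == wid for wid, path in pairs)
assert len(pairs) == len(files)
```

If a batching helper reordered or dropped items, this positional matching would silently mislabel results, so it may be worth running a check like the one above on your own file list.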

Finally, submit 0 (= negative) only when the model is very confident:

    d['label'] = 2  # default label for all other clips
    d.loc[(d.sentiment == 'NEGATIVE') & (d.score > .995), 'label'] = 0
    d.to_csv('submission.csv', index=False)
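
To make the thresholding rule concrete, here is a minimal pure-Python sketch. It assumes predictions shaped like the pipeline output (a `label` and a `score` per text). The function name `to_submission_label` and the example scores are hypothetical:

```python
def to_submission_label(pred, threshold=0.995, default=2):
    """Return 0 only for a NEGATIVE prediction with score above the threshold."""
    if pred['label'] == 'NEGATIVE' and pred['score'] > threshold:
        return 0
    return default


# Made-up predictions, for illustration only.
preds = [
    {'label': 'NEGATIVE', 'score': 0.999},  # confident negative -> 0
    {'label': 'NEGATIVE', 'score': 0.90},   # negative, but not confident enough -> 2
    {'label': 'POSITIVE', 'score': 0.999},  # positive -> 2
]
labels = [to_submission_label(p) for p in preds]
print(labels)  # [0, 2, 2]
```

Raising the threshold marks fewer clips as negative but makes each such call more reliable, which is the trade-off the post relies on.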

This was an out-of-the-box step. Thanks for sharing!

That’s clever, good job!