Rapid prototyping with Cloud services on mobile

Author
Meredith Vigelandzoon

Nowadays anyone can look up a translation through Google or some other fascinating translation app or service in a matter of seconds. But how about making such an app yourself? Sounds challenging? Not when you use standard cloud platform building blocks. For example, with the Apple Speech framework and the Google Translate cloud service, anyone with some coding skills can make this kind of app in under half an hour. In this blog article I’ll explain how I made an app that is capable of:

  1. Recognizing your speech
  2. Recognizing the language
  3. Translating your spoken word to another language
  4. Synthesizing the translated text into audio

Speech recognition

In 2016, Apple introduced the Speech framework: a handy API for speech recognition, based on the technology behind Apple’s Siri, that can recognize live audio.
The sample app I made is written in Swift, for which I needed the Speech framework and the AVFoundation framework from the Apple SDK. The app has to ask the user for permission to use the microphone and speech recognition. This is not just a formality: by default the recorded audio is sent to Apple’s servers, where it is analyzed and also used to improve the recognition technology.

Google Translate cloud service

The Google Translation API is easy to integrate into applications. It supports more than 100 languages and offers not only translation, but also language detection for cases where the source language is unknown. In combination with the speech synthesis in AVFoundation, you can even have the result spoken with a Spanish, German, Swedish or other accent, as long as a voice for that language is available on the device.
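To make the language detection a bit more concrete: when a request leaves out the source language, the v2 REST API returns the language it detected next to the translation. The sketch below decodes such a response with Foundation's JSONDecoder. The type names (TranslateResponse, Translation, Payload), the helper firstTranslation and the sample payload are assumptions for illustration and are not taken from the sample app; only the JSON field names follow the public v2 REST API.

```swift
import Foundation

// Hypothetical types mirroring the JSON body returned by the Translation API (v2).
struct TranslateResponse: Decodable {
    struct Translation: Decodable {
        let translatedText: String
        // Only present when the request did not specify a source language.
        let detectedSourceLanguage: String?
    }
    struct Payload: Decodable {
        let translations: [Translation]
    }
    let data: Payload
}

// Decodes the raw response body and returns the first translation, if any.
func firstTranslation(in body: Data) throws -> TranslateResponse.Translation? {
    let response = try JSONDecoder().decode(TranslateResponse.self, from: body)
    return response.data.translations.first
}

// Made-up sample payload, for illustration only.
let sampleBody = Data("""
{"data": {"translations": [
  {"translatedText": "Hallo wereld", "detectedSourceLanguage": "en"}
]}}
""".utf8)

do {
    if let translation = try firstTranslation(in: sampleBody) {
        print(translation.translatedText, translation.detectedSourceLanguage ?? "unknown")
    }
} catch {
    print("Could not decode response: \(error)")
}
```

Keeping detectedSourceLanguage optional means the same types also work when the app passes the source language explicitly.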

Implementing Speech recognition and Translation into one app

To get access to the Google Translation API, you’ll need to set up an account. This can be done for free, and you will also get about $300 of free credit to play with. With your account you also receive an API key, which you need to add to your project.
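As a rough sketch of where the API key ends up: the v2 REST endpoint accepts the text, the target language and the key as query parameters. The helper name makeTranslateURL and its parameters are hypothetical (the sample project wraps this in its own translator object), and the key shown is a placeholder.

```swift
import Foundation

// Builds the URL for a Translation API (v2) call.
// `makeTranslateURL` is a hypothetical helper, not part of any SDK.
func makeTranslateURL(text: String, target: String, source: String? = nil, apiKey: String) -> URL? {
    var components = URLComponents(string: "https://translation.googleapis.com/language/translate/v2")
    var items = [
        URLQueryItem(name: "q", value: text),          // text to translate
        URLQueryItem(name: "target", value: target),   // output language, e.g. "nl"
        URLQueryItem(name: "format", value: "text"),   // plain text instead of HTML
        URLQueryItem(name: "key", value: apiKey)       // your private API key
    ]
    // Leaving out `source` asks the service to detect the language itself.
    if let source = source {
        items.append(URLQueryItem(name: "source", value: source))
    }
    components?.queryItems = items
    return components?.url
}

if let url = makeTranslateURL(text: "Good morning", target: "nl", apiKey: "Y0ur-OwN-Ap1-k3Y") {
    // The URL can be handed to URLSession to perform the actual call.
    print(url.absoluteString)
}
```

For anything beyond a prototype, it is worth keeping the key out of the source code, for example by loading it from a configuration file that is not checked in.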

So this is what you need to do in code:

  1. Import the SDKs (Speech and AVFoundation)
  2. Ask the user for microphone permission
  3. Make a speech recognition request
  4. Start recording

Speech recognition turns your voice into text. This text is sent to the Google API, which translates it into another language and sends the translation back to your app. All that is left to do is to let your app say the translated sentence out loud. Again, with standard cloud building blocks, this kind of functionality has become very easy to build.

Mobile Development - screenshot

Snippet of code I used in my sample project:

// Needs `import AVFoundation` at the top of the file.
public func translateAndSpeak(text: String) {
    do {
        // Play the spoken translation alongside audio from other apps (Swift 4.2+ API names)
        try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default, options: .mixWithOthers)

        translator.apiKey = "Y0ur-OwN-Ap1-k3Y"              // 1
        self.translate(text: text, callback: { (result) in  // 2
            let utterance = AVSpeechUtterance(string: result)
            utterance.rate = AVSpeechUtteranceDefaultSpeechRate
            self.synth.continueSpeaking()                    // resume the synthesizer if it was paused
            let accentLanguage = self.outputParam            // 3
            utterance.voice = AVSpeechSynthesisVoice(language: accentLanguage) // 4
            self.myUtterance = utterance
            self.synth.speak(utterance)                      // 5
        })
    } catch {
        print(error)
    }
}

This is what the code does:

  1. Set your private API key on the translator
  2. Translate the recognized text into the output language
  3. Get your output language
  4. Optionally choose a voice with the pronunciation/accent of your output language
  5. Let your phone say the text out loud.

Now you can have endless multilingual conversations with the use of your own app!

Advantages for businesses

From a business perspective, you can gain many advantages from these quick and simple cloud integrations:

  • Save costs on development effort and certificates by using ready-to-use services
  • Thanks to artificial intelligence, the underlying cloud engines keep improving their translations automatically, and thus at no extra cost to you
  • Easy to apply according to your wishes (like extra security or encryption if needed)
  • Fast implementation

And most important: building apps with cloud components is like playing with Lego. It is fun and very enjoyable!

Tags

Development Cognitive Services Mobile