
Machine Learning for {30% laymen/90% noobs/ 85% amateurs}

Carlos Rodriguez Garcia, Software Engineer & B.I. Analyst, Vistatec

We have all heard this new buzzword in the world of technology. It is the new trend, everybody wants to jump on the { 80% bandwagon, 50% train, 5% lake } and we are all making claims about how amazingly good and new it is: Machine Learning. But is it really that new?

Machine Learning is just another fancy name for an area within a broader field called Predictive Analytics or Predictive Modelling. This { 70% area as it was used before, 90% branch as not in text yet, better style } of statistics was born in the 1940s, when governments invested in it for military purposes. Soon, in the 1950s, businesses started seeing { “seeing potential” in many matches around businesses, 90% potential } in it for risk management and route planning (solving the shortest-path problem). The birth of your { 50% awful, 50% great } credit score is right here, and it is closely tied to Machine Learning.

We could go into the details of what Machine Learning is, how the different engines out there work, how the wonderful Python libraries help you with it, and so on. For the purpose of {100% this} article, though, we will try to explain Machine Learning in the simplest way possible and show how it is applied in the localization and content creation { 80% world, 70% businesses }.

Machine Learning is trying to do one thing we humans do constantly: predict the future. We have something right under our hair that has been making predictions throughout our lives. Some people use it more and some less, but we all use it, even if we do not want to. Yes, you have just predicted what I am talking about: the brain.

Our { high chance of “brain” as referenced in the paragraph before } is constantly predicting what comes next in a sentence, what the car in front is going to do next when we are { 99% driving, automotive reference in sentence } down the road, or what will happen next when we are playing sports.

Let’s look at a very quick example of how our brain uses the models and engines it has created and retrained over the course of our lives. When I was learning English as a non-native { 100% speaker }, I had to go through the painful process of learning all those wonderful phrasal verbs (and learn that “speak up” does not actually refer to a person looking at the ceiling as they speak …). I then knew that when “take”, “make”, “speak” and many other verbs are followed by a specific word, their meaning could change. A model was created. My brain was now ready to use this { 80% model, used in previous sentence and referenced by “this” } in real life. Before I knew this, my brain only considered direct or indirect complements after a verb. After learning about phrasal verbs, my model had been recreated, and I could now predict, from the context and other parameters, that another item such as a preposition could appear after a verb. I had LEARNED!
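For the curious, the phrasal verb story can be sketched in a few lines of Python. This is only a toy illustration and not how any real translation engine works: the class name NextWordModel, its methods and the example sentences are all made up for this article. The "model" simply counts which word tends to follow which, and its prediction changes once it is retrained with sentences containing "speak up".

```python
from collections import Counter, defaultdict


class NextWordModel:
    """Toy model (hypothetical): counts which word follows which."""

    def __init__(self):
        # For every word, a tally of the words seen right after it.
        self.counts = defaultdict(Counter)

    def train(self, sentences):
        # Training adds to the existing tallies, so calling it again
        # with new sentences acts as a simple form of retraining.
        for sentence in sentences:
            words = sentence.lower().split()
            for current, following in zip(words, words[1:]):
                self.counts[current][following] += 1

    def predict(self, word):
        # Return the most frequent follower, or None for unseen words.
        followers = self.counts.get(word.lower())
        if not followers:
            return None
        return followers.most_common(1)[0][0]


model = NextWordModel()

# Before phrasal verbs: "speak" is only ever followed by a complement.
model.train(["please speak slowly", "they speak english", "we speak english"])
before = model.predict("speak")

# After the painful lessons: the model is retrained with new examples.
model.train(["please speak up", "speak up now", "you must speak up"])
after = model.predict("speak")

print(before, "->", after)
```

With these made-up sentences, the prediction for the word after "speak" moves from a complement to the particle "up" once the new examples outnumber the old ones, which is the same shift the paragraph above describes.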

This is what the new (or not so new) Machine Learning is all about: trying to create models to predict the future. But why the fuss now and not in the 1940s, when it was born? Well, as you may remember, in the 1960s a computer would only take punch cards and it would fill a whole room. In the acclaimed movie “Hidden Figures”, we saw how in the 1960s the U.S.A. sent astronauts into space with computers that had only a tiny fraction of the computing { 70% power, 60% knowledge, 30% degree } of the laptop you are reading this article on.

Nowadays, we have increased this computing power exponentially, as well as the amount of data we gather. For every click, scroll, hover, stop and move you make on the internet, consumer and personal information is gathered. Cars are gathering information as they drive. Satellites are tracking mobile phones as we walk on the streets and into stores, and bots are collecting multilingual content from all the { 90% millions, 20% hundreds, 2% dozens } of websites on the internet. All this data can be processed within seconds or minutes to create a model, which is then tested against reality and retrained if it is not correct, all with barely a few people looking at it.
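The create, test and retrain cycle described above can also be sketched in Python. Again, this is a minimal illustration under stated assumptions: the numbers are invented, the "model" is just a straight line fitted by least squares, and the helper names fit_line and mean_abs_error and the TOLERANCE threshold are hypothetical choices made for this article.

```python
def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, points):
    """Average distance between what the model predicts and reality."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


# 1. Create a model from the data gathered so far (invented numbers).
history = [(1, 2.0), (2, 4.0), (3, 6.0)]
model = fit_line(history)

# 2. Test it against reality: new observations that follow a different trend.
new_observations = [(4, 11.0), (5, 14.0), (6, 17.0)]
error_before = mean_abs_error(model, new_observations)

# 3. Retrain if the model is not good enough (the threshold is an assumption).
TOLERANCE = 0.5
retrained = False
if error_before > TOLERANCE:
    history = history + new_observations
    model = fit_line(history)
    retrained = True

error_after = mean_abs_error(model, new_observations)
print(f"error before: {error_before:.2f}, after retraining: {error_after:.2f}")
```

In this sketch the retrained model is checked against the same observations it has just learned from, which keeps the example short. A real project would normally hold back separate data for testing.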

In the 1940s we had the idea in our heads, but we were ahead of what technology could do. Nowadays, we have all we need and more, and this is why the train has just departed from the { probably train station, as train is in the sentence and this is where trains depart from } and everybody (all businesses, all governments, E-V-E-R-Y-O-N-E) wants to get on it.

In the localization industry, this is the dawn of machine translation (MT) on steroids. Machine Learning will help us offer more fluent and accurate translations to post-editors and translators, so that all they have to do is fix just a few { 90% issues, fixing issues is something translators and reviewers do } where the model was not correct. But guess what: eventually that issue will disappear, as the model will be retrained. There will be no more “Oops, I forgot we agreed on that” discussions.

But does this mean that Machine Learning will solve all issues and computers will eventually be able to do anything, given the time? I am sorry to disappoint you, but the T-800 Terminators are not coming, and humans will still be working for at least a few centuries to { 100% come, lots of matches on this construction}, including yourself. Machine Learning works great when logic and reason apply, when trends are not affected by the { 90% absurd, 10% genius } of human nature or just pure chance.

To give an example, the new self-driving Tesla is a great piece of engineering that applies Machine Learning to know when to overtake, stop and turn without having an accident. It probably has a much lower accident rate than a car driven by even the most careful driver. The model works fine (and it will be almost perfect with time) because it has so much data to base its predictions on that the chance of a situation falling outside the model is very low.

Unless the oil drums in the trailer in front of your car tilt and spill loads of oil on the road. This is something that is not in the model, as the chances of it happening are very low and the machine has not been able to “learn” what to do in those cases, nor to predict that it would happen. As soon as it occurs once or twice, the machine will include it in the bag of things to be “aware of”, but there will always be a first time. Don’t be disappointed: your brain has the exact same issue. It cannot predict the absurd or the highly improbable. But guess what: it will also learn and expect this to happen next time on the road.

We are at the { 50% beginning, 50% doorstep } of a new world of opportunities in all areas and businesses. We just need to wait and let the models get created, tested, retrained, and redeployed. Humanity has a head start of a few millennia on machines when it comes to learning. I think it is only fair we give them a few years to catch up (and they will).

{Testing Model} The answer to life, the universe and everything is { 100% chance 42 }

{Model test successful}

{Prediction complete}