Big Data and Analytics dominate column inches, blogs, industry
events, office conversations, business strategies, and the like, but Tech
Giants IBM, Microsoft and Google are already busy uncovering the next big gem
in IT. Step forward, Linguistics.
Linguistics is a relatively young scientific discipline that
seeks to understand the true building blocks of human language – something that
has eluded even today’s most respected evolutionary biologists. Some
consider it almost impossible to explain the origins of human language, but that
has not stopped the budding Linguistics community from asking a series of searching
questions. How and where did language originate? How did we come to understand
language? What is the role of the brain? What are the intricate rules of
language?
The Computer Science discipline attempts to consider all of
this in the context of computer consumption – no mean feat for such an elusive
specialism. More specifically, the aim is to reverse-engineer human
language so that computers can understand, act upon and respond to human
intent. Genius. Long the realm of science fiction, two-way conversation between
humans and intelligent machines is fast moving towards science fact. Artificial
Intelligence is on its way, and boy, is it going to be worth the wait.
Natural Language Processing Edges Closer
The Linguistics field of study, often referred to in computer
science as natural language processing (NLP) or natural language understanding
(NLU), is rapidly evolving after decades spent in R&D labs at the cost
of a small fortune. Whilst it is true NLP exists in part today, it is exactly
that: partial. “Approximations” best describes current capabilities – a combination
of statistical analysis, rule-based methods and heuristics (a type of shortcut
technique) used to arrive at estimated results. Mathematics, not Science, if
you will.
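That statistics-plus-rules-plus-heuristics combination can be caricatured in a few lines of Python. This is a minimal sketch for illustration only – the tiny corpus, the tag names and the suffix heuristic are all invented here, not taken from any real NLP system – but it shows how such methods only ever estimate an answer rather than understand one:

```python
from collections import Counter

# A tiny invented "training corpus" of (word, part-of-speech) pairs.
corpus = [
    ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
    ("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"),
    ("a", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
]

# Statistical component: count which tag each known word takes most often.
counts = {}
for word, pos in corpus:
    counts.setdefault(word, Counter())[pos] += 1

def tag(word):
    if word in counts:
        # Statistics: pick the most frequent tag seen in the corpus.
        return counts[word].most_common(1)[0][0]
    if word.endswith("s"):
        # Heuristic shortcut: unseen words ending in "s" are guessed verbs.
        return "VERB"
    # Rule-based fallback: everything else is guessed to be a noun.
    return "NOUN"

print([(w, tag(w)) for w in "the dog jumps".split()])
# [('the', 'DET'), ('dog', 'NOUN'), ('jumps', 'VERB')]
```

The guesses happen to look right here, but nothing in the code “knows” what a dog is – exactly the gap the article goes on to describe.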
The problem is that it fundamentally lacks “intelligence” and
is therefore not fit for purpose – at least not for mainstream consumption. Current
methods fail to grasp the true “meaning” of language, a prerequisite for Artificial
Intelligence. To do so, you must understand grammar (structure), morphology
(the formation of words), syntax (the formation of sentences from words) and,
most importantly of all, semantics (the relationships between words and sentences
that form “meaning”).
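To make those layers concrete, here is a toy sketch of morphology (splitting a word into stem and suffix), syntax (checking word order against one hard-coded pattern) and a crude stand-in for semantics (a hand-written lexicon of “meanings”). Every rule, name and lexicon entry below is invented for illustration; real linguistic analysis is vastly more sophisticated:

```python
# Invented mini-lexicon standing in for "semantics": word -> concept.
LEXICON = {"dog": "animal", "cat": "animal", "bark": "make-sound"}

def morphology(word):
    """Split off a plural/3rd-person '-s' suffix if the stem is known."""
    if word.endswith("s") and word[:-1] in LEXICON:
        return word[:-1], "-s"
    return word, ""

def syntax_ok(tags):
    """Accept only the single sentence pattern DET NOUN VERB."""
    return tags == ["DET", "NOUN", "VERB"]

stem, suffix = morphology("dogs")
print(stem, suffix)                        # dog -s
print(LEXICON[stem])                       # animal
print(syntax_ok(["DET", "NOUN", "VERB"]))  # True
```

Even this toy shows why the layers must cooperate: morphology is needed before the lexicon lookup can succeed, and syntax alone says nothing about what the sentence means.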
And if you didn’t have to read that twice, you are doing
well. Very well. Add in the complexities of language ambiguity, idiosyncrasy
and sheer global variety, and you’d be forgiven if your head starts spinning. Consider
further that exact details are not explicitly encoded in the language we use –
in fact, much of our understanding comes from knowledge of the real world,
learned over time – and you quickly come to realise that language itself is only
part of the problem. The point being, this is an immensely complex subject
domain, which offers some explanation as to why the field remains very much in
R&D mode.
Nonetheless, NLP is an area moving forward with speed, as key
players look beyond the traditional statistical modelling approach and towards
natural science for answers – including biology, anthropology,
psychology and neuroscience, amongst others. The complex understanding of
linguistics, gained through decades of interdisciplinary scientific study, is
about to ensure Natural Language Processing gets a serious facelift.
Race For Glory
When the Tech Giants ramp up their efforts, you can be sure the
space is hotting up, each trying to reach the sea of gold that awaits. And
that’s exactly what the likes of IBM, Microsoft, and now Google, have been busy
doing. IBM and Microsoft have long been doing battle in the labs, whilst
Google, a relative newcomer to the party, has now joined the race.
So, what of the challengers?
Well, IBM leads the way with its creation of “Watson”, the quiz-winning
supercomputer that beats humans at their own game. Using a unique
combination of natural language interpretation, cognitive-style learning and
hypothesis-based decisioning, Watson has the potential to define a revolutionary
new service category – “experts-as-a-service” – in highly specialist domains
where decision-making is vital.
For example, Watson is currently “in
training” as a medical practitioner, with the end goal of providing physicians
with decision support during the patient diagnosis phase. Watson will help
interpret symptoms through the mass analysis of medical research data and the
application of patient history.
Whilst IBM could be considered the most advanced in terms of an
end-to-end decision-support platform, other players are focusing on individual
components of the decision chain, such as NLP itself. That can be said
of Microsoft who, not to be outdone, recently wowed
a Chinese audience with a demo of real-time speech translation based on its
own NLP developments. The Language Translation market alone is projected
to reach $100 billion by 2020, signalling the immense potential of NLP
technology, especially when you consider this is merely one application of
many.
And if IBM and Microsoft thought they were the only two players
in town, then Google will no doubt have something to say. The recent
high-profile hire of Artificial Intelligence thought-leader Ray Kurzweil
to head up Google’s Natural Language Processing Group is a sure-fire
statement of intent, as is the follow-up acquisition of summarisation vendor Wavii,
perhaps a sign of further things to come.
Yet that only tells half of the story, at least if you
hail from Russia. In what appears to be a classic tale of Russia vs America
in yet another race for glory, the Russians boast a historic association with
linguistics research dating back to the 1950s.
And it doesn’t end there. Niche Russian software vendor
ABBYY, better known for its market-leading OCR and data capture capabilities,
is understood to have made significant progress on what it calls a “universal
linguistic platform” – claimed to be the first of its kind – having been in
stealth mode since the company’s founding days, some 18 years prior. That’s a
rather large research project. Information remains at a premium, with the
company rumoured to be gearing up for launch.
In my next article I will focus more specifically on the
coming impact of Linguistics, including use cases of emerging semantic
technologies and how these applications will completely redefine the computing
landscape. Stay tuned.