Big data refers to data sets that are so large, come in so many varieties (for example, videos and images) and arrive at such speed that it is virtually impossible to handle them with traditional data processing applications. Every day, data is generated in various ways and forms through devices, sensors and new technology platforms. Because of the way it is collected, much of it is unstructured; however, big data encompasses all data, whether structured, semi-structured or unstructured. The threshold for what is considered big data keeps rising as more data is collected. It is estimated that people send, on average, 500 million tweets a day, and analysts believe that by 2020 there will be 5,200 gigabytes of data on every person. That should give an idea of how much data will be available for use in the next few years.
Why the hype about big data? Data reveals meaningful information when it is properly gathered and analysed. Now that we have higher volumes of data, good analysis means better prediction of trends, and this cuts across all areas, be it government, business, medicine, crime and so on. In business, for example, we can now predict which services or products customers want and offer them on a hyper-individualized basis. In crime prevention, police patrols can be concentrated in areas of higher crime based on accurate trends and predictions. In medicine, it means better disease mapping and allocation of resources to the areas where they are needed.
Before machine learning, business intelligence played a major role in ensuring that business executives could visualize the data they obtained from traditional databases, but businesses now often confuse business intelligence with machine learning. The picture below shows the differences between the two and how they complement each other.
For emphasis, machine learning is the field of computer science that uses statistics to build systems that can learn from data, identify patterns and make decisions. As pictured above, business intelligence looks at historic events and answers the question of what has happened, which makes it a form of descriptive analysis, whilst machine learning does three kinds of analysis: descriptive, prescriptive and predictive. Descriptive analysis breaks data down to a level that aids in understanding the factors that influence choices. It uses data aggregation and data mining to provide insight into the past and answer: “What has happened?”. With prescriptive analysis, patterns can be drawn from collected data. It uses optimization and simulation algorithms to advise on possible outcomes and answer: “What should we do?”. Lastly, with predictive analysis, once the factors that influence choices and the patterns that exist in seemingly random data are known, the system can predict the occurrence of events based on the information or data it receives.
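To make the difference between descriptive and predictive analysis concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the monthly sales figures are invented, the function names `describe`, `fit_trend` and `predict` are hypothetical, and a straight-line trend stands in for whatever model a real system would use.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data, not from any real business).
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def describe(values):
    """Descriptive analysis: aggregate past data to answer 'What has happened?'."""
    return {
        "total": sum(values),
        "average": mean(values),
        "best_month": values.index(max(values)) + 1,  # months are numbered from 1
    }


def fit_trend(values):
    """Fit a least-squares line y = slope * month + intercept to the past data."""
    months = range(1, len(values) + 1)
    x_bar, y_bar = mean(months), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, values)) / sum(
        (x - x_bar) ** 2 for x in months
    )
    return slope, y_bar - slope * x_bar


def predict(values, month):
    """Predictive analysis: extend the fitted trend to a month not yet observed."""
    slope, intercept = fit_trend(values)
    return slope * month + intercept


summary = describe(monthly_sales)    # looks backwards
forecast = predict(monthly_sales, 7)  # looks forwards
print(summary, forecast)
```

The first function only summarizes what is already in the data, which is what a business intelligence dashboard does. The second learns a pattern from the same data and uses it to estimate a value that has not been observed yet.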
Machine learning is not such a new concept in the IT world; however, the ability to automatically apply complex mathematical calculations to big data, repeatedly and ever faster, is a recent and exciting development. Everyday uses of machine learning can be found on social media. Facebook, for example, uses machine learning for face recognition by matching unique features: when you upload a picture of yourself with a friend, it automatically suggests that you tag that friend. In a matter of seconds, data has been analyzed and used. That being said, it must be emphasized that machine learning is actually a subset of artificial intelligence (AI), and particular applications of AI include expert systems, speech recognition and machine vision.
Virtual personal assistants and online customer support services are possible because of machine learning. Examples are Apple’s Siri, Amazon’s Alexa and, closer to home, the recently launched Maame (created by Npontu Technologies, Ghana). Chatbots come with varying degrees of human-like intelligence processes, which include learning, reasoning and self-correction. Maame is a chatbot that can handle several conversations at the same time, minimizing the waiting time in most customer service centres. “She” is also unemotional and will not complain of tiredness or of having had a long day after serving so many people. In a country where the majority of citizens are not highly literate or technologically sophisticated, Maame is being trained not just to engage in English but also to handle communication in various local dialects of the country, offering much-needed assistance to the customer service wing of any company. In this case, the focus is not just text to text but also speech to speech in the local language, to help organizations best serve their diverse clientele.
For all the brouhaha and hype that comes with it, the good news is that applying machine learning and AI to huge volumes of data is no longer the domain of just the experts. There are now platforms where companies can simply upload data and have it analysed at the click of a button, a phenomenon that would have been impossible just a few years ago. This trend is not peculiar to the US or Europe, which are known for quick technological advances; it is also possible right here in Ghana, Africa, with the application Snwolley (snwolley.com), which offers machine learning as a service. Snwolley allows predictions in three steps: identifying the category of the intended prediction (e.g. classification or regression), creating and storing your model, and then running your prediction using the created model. It relies on complex multithreading technologies to run all possible algorithms for the intended prediction and returns the highest-scoring prediction to the user. It can also source data from multiple platforms such as Hadoop Distributed File System (HDFS), Open Database Connectivity (ODBC), APIs and local files. The good news is that all this can be done without building a highly specialized team, as the processes of data science are automated on the platform.
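The three-step flow described above (choose a category, create and store a model, run a prediction) can be sketched in a few lines of Python. This is not Snwolley's actual implementation or API, which the article does not show. It is a toy illustration under loud assumptions: the data is invented, the three classifiers are deliberately simple stand-ins for "all possible algorithms", an in-memory dictionary stands in for model storage, and "highest" is read as highest accuracy on held-out data.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Invented toy data: (features, label). Purely illustrative.
TRAIN = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
         ((4.0, 4.2), "high"), ((4.1, 3.9), "high"), ((3.8, 4.0), "high")]
VALID = [((1.1, 0.9), "low"), ((4.2, 4.1), "high"),
         ((0.8, 1.2), "low"), ((3.9, 4.3), "high")]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class MajorityClass:
    """Baseline: always predicts the most common training label."""
    def fit(self, rows):
        self.label = Counter(label for _, label in rows).most_common(1)[0][0]
        return self

    def predict(self, features):
        return self.label


class NearestNeighbour:
    """Predicts the label of the single closest training point."""
    def fit(self, rows):
        self.rows = list(rows)
        return self

    def predict(self, features):
        return min(self.rows, key=lambda row: sq_dist(row[0], features))[1]


class NearestCentroid:
    """Predicts the label whose average training point is closest."""
    def fit(self, rows):
        grouped = {}
        for features, label in rows:
            grouped.setdefault(label, []).append(features)
        self.centroids = {label: tuple(sum(col) / len(col) for col in zip(*points))
                          for label, points in grouped.items()}
        return self

    def predict(self, features):
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab], features))


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(1 for features, label in rows if model.predict(features) == label) / len(rows)


# Step 1: the prediction category decides which candidate algorithms are tried.
CANDIDATES = {"classification": [MajorityClass, NearestNeighbour, NearestCentroid]}
MODEL_STORE = {}  # assumption: an in-memory stand-in for persistent model storage


def create_model(name, category, train, valid):
    """Step 2: fit every candidate in a thread pool, then store the best scorer."""
    def fit_and_score(cls):
        model = cls().fit(train)
        return accuracy(model, valid), model

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(fit_and_score, CANDIDATES[category]))
    best_score, best_model = max(results, key=lambda pair: pair[0])
    MODEL_STORE[name] = best_model
    return type(best_model).__name__, best_score


def run_prediction(name, features):
    """Step 3: run a prediction with the stored model."""
    return MODEL_STORE[name].predict(features)


chosen, score = create_model("demo", "classification", TRAIN, VALID)
prediction = run_prediction("demo", (4.0, 4.0))
print(chosen, score, prediction)
```

The point of the sketch is the shape of the workflow rather than the algorithms themselves: the user names a category and supplies data, and the comparison of candidate models happens automatically behind a single call.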
In Africa, just as in the rest of the world, organizations must begin to leverage data and analytics to create value through innovation. To do this there must be a paradigm shift: an explicit data governance structure that creates a culture of valuing data. Data is the oil fueling the next level of change in our world, and in subsequent articles we will discuss the challenges of implementing machine learning in Africa.