What is machine learning?
Machine learning is the science of developing algorithms and statistical models that computer systems use to perform tasks without explicit instructions, relying on patterns and inference instead. Computer systems use machine learning algorithms to process large quantities of historical data and identify patterns in it. This allows them to predict outcomes more accurately from a given input data set. For example, data scientists could train a medical application to diagnose cancer from x-ray images by feeding it millions of scanned images and their corresponding diagnoses.
Why is machine learning important?
Machine learning helps businesses by driving growth, unlocking new revenue streams, and solving challenging problems. Data is the critical driving force behind business decision-making. Traditionally, companies have collected and analyzed data from various sources, such as customer feedback, employees, and finance. Machine learning automates and optimizes this process. By using software that analyzes very large volumes of data at high speed, businesses can achieve results faster.
Where is machine learning used?
Let’s take a look at machine learning applications in some key industries:
Manufacturing
Machine learning can support predictive maintenance, quality control, and innovative research in the manufacturing sector. Machine learning technology also helps companies improve logistical solutions, including asset, supply chain, and inventory management. For example, manufacturing giant 3M uses AWS Machine Learning to innovate sandpaper. Machine learning algorithms enable 3M researchers to analyze how slight changes in shape, size, and orientation improve abrasiveness and durability. The findings inform the manufacturing process.
Healthcare and life sciences
The proliferation of wearable sensors and devices has generated a significant volume of health data. Machine learning programs can analyze this information and support doctors in real-time diagnosis and treatment. Machine learning researchers are developing solutions that detect cancerous tumors and diagnose eye diseases, significantly impacting human health outcomes. For example, Cambia Health Solutions used AWS Machine Learning to support healthcare start-ups that automate and customize treatment for pregnant women.
Financial services
Financial machine learning projects improve risk analytics and regulation. Machine learning technology can allow investors to identify new opportunities by analyzing stock market movements, evaluating hedge funds, or calibrating financial portfolios. In addition, it can help identify high-risk loan clients and detect signs of fraud. For example, financial software leader Intuit uses an AWS machine learning service, Amazon Textract, to create more personalized financial management and help end users improve their financial health.
Retail
Retailers can use machine learning to improve customer service, stock management, upselling, and cross-channel marketing. For example, Amazon Fulfillment (AFT) cut infrastructure costs by 40 percent using a machine learning model to identify misplaced inventory. This helps them deliver on Amazon’s promise that an item will be readily available to customers and arrive on time, despite processing millions of global shipments annually.
Media and entertainment
Entertainment companies turn to machine learning to better understand their target audiences and deliver immersive, personalized, and on-demand content. Machine learning algorithms are deployed to help design trailers and other advertisements, provide consumers with personalized content recommendations, and even streamline production.
For example, Disney is using AWS Deep Learning to archive their media library. AWS machine learning tools automatically tag, describe, and sort media content, enabling Disney writers and animators to search for and familiarize themselves with Disney characters quickly.
How does machine learning work?
The central idea behind machine learning is that a mathematical relationship exists between any combination of input and output data. The machine learning model does not know this relationship in advance, but it can estimate it if given sufficient data. This means every machine learning algorithm is built around a modifiable mathematical function. The underlying principle can be understood like this:
- We ‘train’ the algorithm by giving it the following input/output (i,o) combinations – (2,10), (5,19), and (9,31)
- The algorithm computes the relationship between input and output to be: o=3*i+4
- We then give it input 7 and ask it to predict the output. It can automatically determine the output as 25.
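The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming the relationship is a straight line and using ordinary least squares to estimate it; the function name fit_line is hypothetical, and real machine learning algorithms use far more flexible functions.

```python
def fit_line(pairs):
    """Fit o = slope * i + intercept by ordinary least squares."""
    n = len(pairs)
    mean_i = sum(i for i, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    # Slope is the covariance of (i, o) divided by the variance of i.
    cov = sum((i - mean_i) * (o - mean_o) for i, o in pairs)
    var = sum((i - mean_i) ** 2 for i, _ in pairs)
    slope = cov / var
    intercept = mean_o - slope * mean_i
    return slope, intercept


# "Training": the input/output combinations from the example above.
training_data = [(2, 10), (5, 19), (9, 31)]
slope, intercept = fit_line(training_data)

# "Prediction": apply the learned relationship to a new input.
prediction = slope * 7 + intercept
print(round(slope, 6), round(intercept, 6), round(prediction, 6))
```

Up to floating-point rounding, the fitted slope and intercept are 3 and 4, so the prediction for input 7 is 25, matching the relationship o=3*i+4.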
While this is a simplified picture, machine learning rests on the principle that even complex data points can be mathematically linked by computer systems, as long as they have sufficient data and the computing power to process it. Therefore, the accuracy of the output generally improves with the amount of input data provided.
What are the types of machine learning algorithms?
Algorithms can be categorized by four distinct learning styles depending on the expected output and the input type.
- Supervised machine learning
- Unsupervised machine learning
- Semi-supervised learning
- Reinforcement learning
1. Supervised machine learning
Data scientists supply algorithms with labeled and defined training data so that the algorithms can learn correlations. The sample data specifies both the input and the output of the algorithm. For example, images of handwritten digits are annotated to indicate which number they correspond to. A supervised-learning system could recognize the clusters of pixels and shapes associated with each number, given sufficient examples. It would eventually recognize handwritten numbers, reliably distinguishing between the numbers 9 and 4 or 6 and 8.
The strengths of supervised learning are simplicity and ease of design. It is useful when predicting a limited set of possible outcomes, dividing data into categories, or combining results from two other machine learning algorithms. However, labeling millions of data points is challenging. Let’s take a closer look at this:
What is data labeling?
Data labeling is the process of categorizing input data with its corresponding defined output values. Labeled training data is required for supervised learning. For example, millions of apple and banana images would need to be tagged with the words “apple” or “banana.” Then machine learning applications could use this training data to guess the name of the fruit when given a fruit image. However, labeling millions of new data points can be a time-consuming and challenging task. Crowd-working services such as Amazon Mechanical Turk can overcome this limitation of supervised learning algorithms to some extent. These services provide access to a large pool of affordable labor spread across the globe, making labeled data easier to acquire.
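To make the role of labeled data concrete, here is a minimal sketch of supervised learning in Python. It assumes each fruit has already been reduced to two hypothetical measurements (length and width in centimeters) rather than a full image, and it uses a 1-nearest-neighbor rule as a stand-in for a real classifier; the data values and the name predict_label are invented for illustration.

```python
import math

# Hypothetical labeled training data: (length_cm, width_cm) -> label.
labeled_fruit = [
    ((7.0, 7.5), "apple"),
    ((8.0, 8.0), "apple"),
    ((7.5, 7.0), "apple"),
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
    ((17.0, 3.0), "banana"),
]


def predict_label(features, training_data):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


# Guess the name of two new, unlabeled fruits.
print(predict_label((7.2, 7.1), labeled_fruit))   # apple
print(predict_label((19.0, 3.8), labeled_fruit))  # banana
```

The model can only answer with labels it has seen, which is why the quality and quantity of labeled examples matter so much in supervised learning.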
2. Unsupervised machine learning
Unsupervised learning algorithms train on unlabeled data. They scan through new data, trying to establish meaningful connections among the inputs without any predetermined outputs. They can spot patterns and categorize data. For example, unsupervised algorithms could group news articles from different news sites into common categories like sports, crime, etc. They can use natural language processing to comprehend meaning and emotion in the article. In retail, unsupervised learning could find patterns in customer purchases and provide data analysis results such as: the customer is most likely to purchase bread if also buying butter.
Unsupervised learning is useful for pattern recognition, anomaly detection, and automatically grouping data into categories. As the training data does not require labeling, setup is easy. These algorithms can also be used to automatically clean and process data for further modeling. The limitation of this method is that it cannot give precise predictions. In addition, it cannot single out specific types of data outcomes independently.
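Automatic grouping of unlabeled data can be sketched with k-means clustering, one common unsupervised technique (the text above does not name a specific algorithm, so this choice is an assumption). The customer data and the function name k_means are hypothetical.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, assignments)."""
    # Assumption: use the first k points as starting centroids to keep the
    # sketch deterministic; real implementations usually randomize this.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments


# Hypothetical unlabeled data: (visits per month, average basket size).
customers = [(1, 10), (2, 12), (1, 11), (9, 60), (10, 62), (8, 58)]
centroids, assignments = k_means(customers, k=2)
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

No labels were supplied: the algorithm separates occasional shoppers from frequent ones purely from the structure of the data, and it is up to a person to interpret what each group means.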
3. Semi-supervised learning
As the name suggests, this method combines supervised and unsupervised learning. The technique relies on using a small amount of labeled data and a large amount of unlabeled data to train systems. First, the labeled data is used to train the machine-learning algorithm partially. After that, the partially trained algorithm itself labels the unlabeled data. This process is called pseudo-labeling. The model is then re-trained on the resulting data mix without being explicitly programmed.
The advantage of this method is that you do not require large amounts of labeled data. It is handy when working with data like long documents that would be too time-consuming for humans to read and label.
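The pseudo-labeling loop described above can be sketched as follows. This is a simplified illustration, assuming a nearest-centroid classifier on a single hypothetical feature (document length in pages); the data and the names train_centroids and predict are invented for the example.

```python
def train_centroids(examples):
    """Train a nearest-centroid classifier: one mean value per label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Return the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Hypothetical data: document length in pages -> "short" or "long".
labeled = [(2.0, "short"), (3.0, "short"), (40.0, "long"), (50.0, "long")]
unlabeled = [1.0, 4.0, 5.0, 35.0, 45.0, 60.0]

# Step 1: partially train on the small labeled set.
model = train_centroids(labeled)
# Step 2: pseudo-label the unlabeled data with the partially trained model.
pseudo_labeled = [(value, predict(model, value)) for value in unlabeled]
# Step 3: retrain on the mix of labeled and pseudo-labeled data.
model = train_centroids(labeled + pseudo_labeled)

print(pseudo_labeled)
print(model)
```

In practice, pseudo-labels are often kept only when the model is confident in them, since wrong pseudo-labels can reinforce the model's early mistakes; that filtering step is omitted here for brevity.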
4. Reinforcement learning
Reinforcement learning is a method in which reward values are attached to the different steps that the algorithm must go through. The model’s goal is to accumulate as many reward points as possible and eventually reach an end goal. Most of the practical applications of reinforcement learning in the past decade have been in the realm of video games. Cutting-edge reinforcement learning algorithms have achieved impressive results in classic and modern games, often significantly outperforming human players.
While this method works best in uncertain and complex data environments, it is rarely implemented in business contexts. It is not efficient for well-defined tasks, and developer bias can affect the outcomes. As the data scientist designs the rewards, they can influence the results.
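The idea of learning from rewards can be sketched with Q-learning, one widely used reinforcement learning algorithm (chosen here as an assumption; the text above does not name one). The environment is hypothetical: five positions in a row, a reward of 10 for reaching the last position, and a penalty of 1 for every other move.

```python
N_STATES = 5          # positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Hypothetical environment: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 10.0 if next_state == N_STATES - 1 else -1.0
    return next_state, reward


# Q-table: estimated long-term reward for each (state, action) pair.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):
    # Assumption: try every state/action pair in each sweep instead of
    # exploring randomly, to keep the sketch deterministic.
    for s in range(N_STATES - 1):
        for a in ACTIONS:
            s2, r = step(s, a)
            if s2 == N_STATES - 1:
                best_next = 0.0  # no future reward after the goal
            else:
                best_next = max(q[(s2, b)] for b in ACTIONS)
            # Q-learning update: nudge the estimate toward the observed
            # reward plus the discounted best future estimate.
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

# Greedy policy: the best-scoring action in each non-goal state.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

The learned policy moves right from every position, because that path collects the most reward. Changing the reward values in step changes what the model learns, which illustrates how the designer's choices influence the results.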
Mentor - Anjali Mourya
Information and research are sourced from the internet.
 
