Data Annotation Boosts AI Performance and Decision-Making

Data is the fuel that lets new-age technology companies significantly improve their operations and technology-based decision making. Accurate data labelling is essential for organizations to reap these benefits and protect their investments.

With the increased popularity of artificial intelligence (AI) among leading corporations, decision making is fast passing into the hands of machines and AI/ML technologies rather than humans. Across industries, the delegation of responsibilities in the 21st century has expanded beyond fellow humans.

Interestingly, these technologies rely on back-end data to make better decisions and gradually improve their artificial thinking power. Depending on the industry, purpose and use case, this data can take the form of images, text, video, audio, simulations, real-time IMS, sensor data and image curation for ML, perception modelling, logical simulations, event testing and triaging, etc. (essentially, the eye of the camera).

This data must be guarded and kept up to date, and it must be accurate and precise enough for the underlying technology to work efficiently. Experienced GIS experts curate and maintain such data regularly within client systems without disrupting daily operations. Data annotation is the primary bridge between raw data and the human-like actions of AI/ML.

Benefits and importance of AI data annotation

With the fast adoption of AI across industries, data annotation has taken a front seat in driving the shift towards deep learning solutions and machine-led decision making. But machines only act according to the underlying parameters set by humans. Hence, the accuracy and comprehensiveness of these data sets are of paramount importance.

The global data annotation tools market was valued at USD 494 million in 2020 and is expected to expand at a compound annual growth rate (CAGR) of 27.1% from 2021 to 2028, according to Grand View Research. There are numerous examples across industries where machine learning coupled with data annotation capabilities has drastically improved outcomes, with a much shorter turnaround time than human intervention.
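As a rough illustration of what that growth rate implies, the sketch below compounds the 2020 valuation forward. It assumes that the 27.1% rate applies to each of the eight years from the 2020 base through 2028; the function name `project_market_size` is hypothetical, and the result is plain arithmetic, not a figure quoted from Grand View Research.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound base_value forward by cagr (a fraction) for the given number of years."""
    return base_value * (1.0 + cagr) ** years


base_2020_usd_million = 494.0  # 2020 valuation cited in the text
cagr = 0.271                   # 27.1% compound annual growth rate
years = 2028 - 2020            # assumption: eight compounding periods

projected = project_market_size(base_2020_usd_million, cagr, years)
print(f"Implied 2028 market size: about USD {projected:,.0f} million")
```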

Use case 1: Improved medical diagnosis

Research by Johns Hopkins has indicated that payments related to diagnostic errors amounted to $38.8 billion between 1986 and 2010, while misdiagnosis leads to significant permanent injury or death for an estimated 80,000 to 160,000 patients in the US every year.

Doctors and medical institutions use data annotation services to draw bounding boxes around areas of interest in medical imaging, such as ultrasound and MRI scans, and label them accordingly. Millions of such annotated images then serve as reference points for the AI, enabling significantly faster and more accurate medical diagnosis and saving lives in time.
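The bounding-box labelling described above can be sketched as a small data structure. The example below is a minimal illustration, not a description of any particular annotation tool: the `BoundingBox` class, its field names and the sample coordinates are all hypothetical. It also computes intersection over union (IoU), a common way to measure how closely two boxes drawn over the same region agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates, with a class label."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str

    def area(self) -> float:
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: BoundingBox, b: BoundingBox) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    intersection = inter_w * inter_h
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical example: two annotators mark the same region on one scan.
first = BoundingBox(40, 60, 120, 140, label="lesion")
second = BoundingBox(50, 70, 130, 150, label="lesion")
print(f"IoU between the two boxes: {iou(first, second):.2f}")
```

A high IoU suggests the two annotators marked essentially the same region, while a low value flags a box that may need review.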

Use case 2: Accurate search results

Data annotation makes high-quality search results possible: content is accurately tagged with likely search keywords, and the matching results are then ordered by their relevance to the query. This includes metadata tagging of articles, text, video, audio, images, etc.
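The tag-then-rank idea above can be sketched with a simple inverted index. This is a minimal illustration under stated assumptions: the catalogue, the item ids and the scoring rule (count of matching tags) are hypothetical, and production search engines use far richer relevance signals.

```python
from collections import defaultdict

# Hypothetical catalogue: item id -> keyword tags assigned during annotation.
tagged_items = {
    "article-1": {"mri", "brain", "scan"},
    "video-7": {"mri", "tutorial"},
    "image-3": {"brain", "anatomy", "diagram"},
}


def build_index(items):
    """Map each tag to the set of item ids carrying it (an inverted index)."""
    index = defaultdict(set)
    for item_id, tags in items.items():
        for tag in tags:
            index[tag].add(item_id)
    return index


def search(index, query):
    """Rank items by how many query words match their tags, most matches first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for item_id in index.get(word, ()):
            scores[item_id] += 1
    # Sort by descending score, then by id so the order is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


index = build_index(tagged_items)
results = search(index, "brain MRI")
print(results)
```

Items that carry more of the query's keywords rank higher, which is why the accuracy of the underlying tags directly affects result quality.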

Today, chatbots and voice assistants have become popular and are trained to have more human-like conversations with the user. Speech recognition technology has made content more accessible to people with visual and hearing impairments. Similarly, recognition of uniquely identifiable facial and bodily features has taken biometric security to the next level.

Use case 3: Replacing human actions in self-driving cars

Self-driving cars are fast taking over the automotive market. However, to react and function properly in a real-world environment, such a vehicle requires a back-end database that helps it immediately identify surrounding objects, project their trajectories, and make corrective decisions to prevent collisions and follow traffic rules. The more comprehensive and extensive the dataset, the better the technology performs, and the closer it gets to a human reaction to any change in the vehicle’s environment.

Use case 4: Natural language interpretation, speech & facial recognition

Data labelling services have now expanded beyond text and images. Wipro’s customers have adopted audio recognition and dialect identification to understand verbal commands and act on them as instructed. The system runs on deep learning technology that distinguishes between languages and understands them based on the metadata sets fed into it. The back-end metadata includes pronunciations of words in different languages, which are matched against the voice command to interpret it. Similarly, programs are designed to identify facial attributes such as the retina, nose, etc., which help in identifying a face.

Use case 5: Object identification and related searches

Technology has made it possible for entities to move one step ahead in object identification by offering translation services, smart text search, similar shopping options, etc. This is all made possible through machine learning, which identifies the objects shown by matching them against the database and then searching online for similar objects. The technology can then provide useful information such as shopping options, prices, colors available in the market, opening and closing hours for key areas of interest, and so on.

Challenges in data annotation

While the use cases for AI/ML technologies sound fascinating, the effort and cost involved in creating and maintaining such technologies are enormous. This is one of the reasons why many companies still struggle to adopt these technologies and reap their full benefits. Some of the key challenges associated with data annotation services include:

Requires a large workforce

Creating the massive amounts of labelled data fed into ML and AI models is a manual process that requires a large, trained workforce. Accurate and proper labelling of the data is the backbone of such AI/ML models, as their accuracy, responsiveness and precision are derived directly from these data sets.

Access to correct data labelling technologies

In addition to experience, the workforce needs access to the right technology to perform data labelling efficiently. Lack of the knowledge needed to select the correct technology, and the high price of obtaining it, are some of the factors that make it hard for entities to adopt AI/ML models.

Heavy maintenance costs

It would be much simpler for companies to shift to such technologies if they required only a one-time, hefty investment. However, the maintenance costs are also a burden: the data sets need to be constantly fed with new and updated data and with newly labelled metadata that covers additional possibilities and use cases, while companies must also prevent security breaches, data privacy violations and data leakages, and eliminate any wrongly tagged data from the repository.

Opportunities for IT services companies

According to research by Cognilytica, around 80% of AI project time is spent gathering, organizing and labelling the raw data held by organizations, which makes it hard for companies to afford spending that much time and money internally on the collected data.

Adopting deep learning technologies is equally challenging to handle in-house. Hence, it is prudent to outsource the work to leading IT companies that specialize in handling such requirements. This eliminates the need for a large workforce, technology training, investments in tools, and the ongoing work of improving the database.

While companies focus on their core business activities, IT service companies such as Wipro work for clients in the background to create, maintain and improve their deep learning technologies.

Further, the adoption of such technologies is on the rise globally. Research conducted by McKinsey suggests that 50% of companies have adopted AI in at least one of their business functions.

How Wipro helps companies leverage this opportunity

With Wipro’s experienced employees and dedicated delivery centers, clients can outsource data labelling work while keeping their focus on core business activities.

Wipro has experience operating on various AI/ML platforms, depending on the input data type. In addition, Wipro has explored new AI/ML arenas such as OVM infotainment systems for major car manufacturers, auto speed limits, street view field operations, and autonomous cars (2D, 3D and 4D labelling, fused labelling, scenario labelling, PBR labelling, etc.).

Wipro’s pool of over 4,000 labelling experts, its experience of creating and delivering more than 1.6 billion labels across the globe, and a 58.06% improvement in labels per hour speak volumes about its efficiency and capabilities. The clients served by Wipro belong to different sectors, including social media and search engines, autonomous car companies, ride-hailing companies, online marketplaces, and so on.

Wipro provides end-to-end services to clients, including data creation, curation, and analytics services, to help develop, test, and train deep learning technologies such as speech recognition and object detection and identification. Figure 1 illustrates the typical image and video processing workflow Wipro follows for its clients:

Figure 1

Wipro realizes that the success of deep learning technologies depends on the quality and accuracy of the data fed into them, so it goes the extra mile on quality assurance of the deliverables for its clients. The dual quality check (QC) ensures that the client’s data repository is kept up to date with well-labelled input metadata, improving the technology’s performance and decision-making power in the long run.
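One possible reading of a dual check is that a label is accepted only when two independent review passes agree, with disagreements routed for another look. The sketch below illustrates that idea only; it is an assumption for illustration rather than Wipro's actual QC process, and the function name `dual_check` and the sample labels are hypothetical.

```python
def dual_check(first_pass, second_pass):
    """Split items into labels both passes agree on and items needing another look."""
    accepted, disputed = {}, []
    for item_id, label in first_pass.items():
        if second_pass.get(item_id) == label:
            accepted[item_id] = label
        else:
            disputed.append(item_id)
    return accepted, disputed


# Hypothetical labels from two independent review passes.
reviewer_a = {"frame-1": "car", "frame-2": "pedestrian", "frame-3": "cyclist"}
reviewer_b = {"frame-1": "car", "frame-2": "cyclist", "frame-3": "cyclist"}

accepted, disputed = dual_check(reviewer_a, reviewer_b)
agreement_rate = len(accepted) / len(reviewer_a)
print(f"Accepted: {accepted}; disputed: {disputed}; agreement: {agreement_rate:.0%}")
```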

Wipro has experience working with some of the leading companies in this area. For a leading multinational technology company working on self-driving cars, Wipro helped build algorithms to classify objects, annotate databases to support machine learning, perform software testing, etc. It created a dedicated pool of GIS engineers from Wipro’s GIS Academy to provide continuous and uninterrupted support to the client.

Future traction in data annotation

As the world moves towards automation and depends on machines for critical decisions, tasks, and action points, it becomes critical that the back-end services running the show are up to date and well organized. Data annotation is the building block that holds this fort together. Companies currently struggle to handle the process, as it requires immense investment, manpower and skills; however, IT service companies such as Wipro have come to their rescue. In future, more and more entities will be able to adopt AI/ML technologies and reap their benefits as the process of maintaining these technologies becomes simpler and more streamlined.

About the Author(s)

Ankur Saxena

Practice Manager, Geospatial Information Systems, Wipro

Ankur is the Practice Lead for the Geo-Spatial Information Systems and AI/ML data annotation service offerings for the digital content practice of Knowledge Services. He leads one of the premier technology accounts for the GSIS practice as the solution head. He has over 17 years of industry experience across varied business domains, including the GIS, Technology, Finance, and Insurance verticals. He holds an MBA degree from the Institute of Management Technology, Ghaziabad, with a specialization in Finance & Operations.

Kunal Jain

Presales Consultant, Wipro

Kunal is a Presales Consultant with the digital content practice of Knowledge Services, where he supports business growth, GTM planning, and solution pitching. As part of Knowledge Services, he is involved in the Geo-Spatial Information Systems (GSIS) practice. He has an MBA degree from the Indian Institute of Technology (IIT), Delhi.
