Generative AI (GenAI) is unique in its ability to address an incredibly wide variety of use cases across business functions, from day-to-day communication and general productivity to creative expression, programming, and data analysis. This is because, at its core, GenAI relies on advanced machine learning models that can understand, generate, and modify content across numerous formats and structures.
Large amounts of data are typically used to train these models to perform a range of tasks, from natural language processing to complex problem-solving. This capability allows organizations to automate mundane and repetitive tasks, significantly reducing the time and human effort they require. GenAI has also been transformative for use cases that demand deeper expertise, such as code generation. For example, developers using Amazon CodeWhisperer have seen a 40% reduction in prototyping times and up to a 50% decrease in security vulnerabilities.
GenAI also simplifies intermediary steps in data analysis. Its advanced algorithms can automatically handle tasks such as data cleaning, normalization, and preliminary analysis. One of our customers, IdenX, has achieved at least a 2X increase in their data qualification and processing capabilities by incorporating GenAI into their workflow (use case discussed below).
In this blog, we will explore three compelling case studies where we have helped customers across diverse industries—IdenX in data analysis, Everyrealm in social media, and Transit Technologies in public utilities—each highlighting how GenAI has been instrumental in enhancing their operational efficiency and data management capabilities.
IdenX - Analytics-as-a-Service
Autonomous data processing to improve efficiency and accuracy in analytics-as-a-service
IdenX, an analytics-as-a-service company, leverages proprietary machine learning and artificial intelligence technology to deliver customized, data-driven insights across various organizational challenges in talent acquisition, logistics, mergers and acquisitions, and more. Its analytics provide clients with comprehensive and transparent insights quickly for strategic decision-making.
IdenX employs a meticulous approach to research and data analysis that can be time and resource-intensive. This includes querying thousands of files and datasets, converting text to SQL, assessing data quality, and analyzing data, all of which involve repetitive tasks that significantly slow operational efficiency. IdenX sought a solution to streamline these processes, lower the technical barrier to entry for their operational and analytics teams, and democratize access to actionable data across the organization.
In collaboration with Caylent, IdenX built a Generative AI-powered application leveraging Claude 2 to address these challenges. The solution involves:
Data Processing Pipeline: IdenX’s solution uses a Large Language Model (LLM) to identify metadata such as the delimiter, encoding, and headers across files of varying sizes, formats, and languages. It then calculates a density quotient to assess whether each file is worth parsing, improving target identification and accelerating the qualification process so that IdenX’s experts focus only on files that offer value (see the illustrative sketch after this list).
Amazon Bedrock Implementation: After evaluating Amazon SageMaker AI, Amazon Textract, and Amazon Comprehend, the teams chose Amazon Bedrock for its GenAI capabilities. Caylent developed scripts to extract metadata from files and measure density, streamlining the data processing workflow.
Cost-Effective Strategies: Prompting with small samples of each file rather than full contents, combined with Amazon Bedrock’s on-demand pricing model, kept initial development within free tiers and optimized costs.
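To make this pattern concrete, here is a minimal sketch, assuming Python with boto3 and Claude 2’s text-completion format on Amazon Bedrock. The prompt wording, the density_quotient heuristic, and the example file name are illustrative assumptions for this post, not IdenX’s or Caylent’s production code.

```python
import json

import boto3

# Illustrative sketch only: prompt wording, the density heuristic, and file names
# are assumptions for this post, not IdenX's production implementation.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def profile_file_sample(sample_text: str) -> dict:
    """Ask Claude 2 on Amazon Bedrock to infer basic metadata from a file sample."""
    prompt = (
        "\n\nHuman: Here are the first lines of a data file. Identify the delimiter, "
        "the likely character encoding, and whether the first row is a header. "
        'Reply with JSON only, using the keys "delimiter", "encoding", "has_header".\n\n'
        f"{sample_text}\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 300}),
    )
    completion = json.loads(response["body"].read())["completion"]
    return json.loads(completion)  # assumes the model returned well-formed JSON


def density_quotient(sample_text: str, delimiter: str) -> float:
    """Toy heuristic: the fraction of non-empty cells in the sample.

    A low score suggests a sparse file that may not be worth fully parsing.
    """
    rows = [line.split(delimiter) for line in sample_text.splitlines() if line.strip()]
    cells = [cell for row in rows for cell in row]
    return sum(1 for cell in cells if cell.strip()) / len(cells) if cells else 0.0


if __name__ == "__main__":
    # Prompt with a small sample of the file, not its full contents, to keep costs low.
    with open("example.csv", "r", errors="replace") as f:
        sample = "".join(f.readlines()[:50])
    meta = profile_file_sample(sample)
    print(meta, f"density quotient: {density_quotient(sample, meta['delimiter']):.2f}")
```

Sampling only the first lines of each file keeps prompts small, which is what makes the sampling-plus-on-demand-pricing approach described above economical at scale.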
The deployment of the Amazon Bedrock-powered solution enabled IdenX to:
Enhance Efficiency: Query thousands of files almost instantly, a significant improvement in time and resource efficiency over manual processing of approximately 500 files per day per resource.
Accelerate Processing: Achieve processing times of around 25 seconds for small files and 40 seconds for wider files, with the capability to process about 20-30 files within a 15-minute window, depending on file complexity.
Integrate Seamlessly: Easily connect the solution with other services, further enhancing operational efficiency and data analysis capabilities.