January 8, 2024 | Posted in News
Incoming generative AI costs and security concerns make strategic adoption a must
Interest in artificial intelligence (AI) is at an all-time high due to the influx of generative AI products on the market. Business leaders are seriously considering integrating the technology into their stacks like never before, but they must bear specific considerations in mind before doing so.
AI is often thought of and sold as a single, monolithic technology, whereas it is really a vast family of processes, models, and workflows that must be carefully implemented to add real value. To benefit from incoming AI innovations, businesses need clear plans and a strong idea of how the technology interfaces with their existing systems and sectors.
Executives expect generative AI to boost financial performance in the coming months and years, and more than half of firms have already adopted the technology, whether in pilot stages or in full production.
However, there are a number of hurdles between the current state of investment and the hoped-for gains of generative AI implementation. In the short term, many firms are investing in application security alongside generative AI to mitigate security risks such as proprietary data leaks and legal issues such as AI bias.
Overprovisioning this technology could also prove very costly in 2024. CCS Insight tells ITPro that smaller firms will find the cost of running AI models locally “prohibitively expensive” due to the up-front investment required in powerful graphics processing units (GPUs).
When it comes to running AI models, it will be important to identify the size and power a business actually needs. For firms looking to implement AI in a limited manner, such as generating simple internal documents, lightweight models could be adequate.
Business leaders must always consider the needs of their organization, rather than just the scope of their budget or project timelines.
If an organization sets out to implement the latest large language models (LLMs) on the most powerful hardware, it must be aware that this means continual spending on ever more powerful hardware to keep pace. Most major AI developers are racing to build larger LLMs with many hundreds of billions of parameters, which will require incredibly costly and supply-constrained chips to run.
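To give a sense of that scale, here is a rough back-of-envelope sketch of the accelerator memory a frontier-scale model demands just to hold its weights. The parameter count is hypothetical, and real deployments need additional memory for activations, key-value caches, and serving overhead.

```python
# Back-of-envelope: accelerator memory needed just to hold the weights
# of a frontier-scale LLM at 16-bit precision. Figures are illustrative.
params = 175e9           # hypothetical 175-billion-parameter model
bytes_per_param = 2      # fp16 / bf16
weight_gb = params * bytes_per_param / 1e9
print(f"~{weight_gb:.0f} GB of GPU memory for weights alone")
# => ~350 GB, i.e. several top-end accelerators before serving a single request
```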
Instead, businesses should engage in rightsizing, working with a consulting partner if necessary, to determine the precise workloads they need, if any. Lighter-weight LLMs may run more slowly but can be far cheaper to operate, and may therefore align more closely with a firm's strategy.
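As a minimal sketch of what the lightweight route can look like, the following uses the Hugging Face transformers library with distilgpt2, a small open model chosen purely as an example; any compact model a firm has vetted could take its place.

```python
# A minimal sketch: running a small open model locally with the
# Hugging Face transformers library. distilgpt2 (~82M parameters) is
# used purely as an example of a lightweight checkpoint; it runs
# comfortably on CPU, unlike frontier-scale LLMs.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
draft = generator(
    "Internal memo: our Q1 priorities are",
    max_new_tokens=40,
    num_return_sequences=1,
)
print(draft[0]["generated_text"])
```

The trade-off is quality: output from a model this small needs human review, but the hardware bill is negligible.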
Open source models will also be a low-cost option for businesses in the immediate future, but will need to be carefully integrated into one's ecosystem to satisfy the security department and to run as smoothly as possible on the hardware one owns or rents. For example, the PyTorch framework is supported on hardware like AMD's Radeon GPU family, allowing for cheap on-prem AI inference, but such support is not universal. Knowing the products one has already invested in will be key to forming a coherent future AI investment strategy.
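As a rough sketch of what this looks like in practice, PyTorch's ROCm builds expose AMD GPUs through the same torch.cuda interface used for NVIDIA hardware, so a single device check covers both, with a CPU fallback. The tiny model below is a hypothetical stand-in for a real workload.

```python
# A minimal sketch of hardware-agnostic inference in PyTorch. On ROCm
# builds, torch.cuda.is_available() reports AMD GPUs (e.g. Radeon),
# so the same code path serves NVIDIA, AMD, and CPU-only machines.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running inference on: {device}")

# Hypothetical placeholder model standing in for a real checkpoint.
model = torch.nn.Linear(768, 768).to(device).eval()

with torch.no_grad():
    output = model(torch.randn(1, 768, device=device))
print(output.shape)
```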
Chips designed with AI workloads in mind, such as AMD's EPYC CPU family, come in a range of models that balance performance and energy efficiency, helping them serve the specific AI needs of a business as data center components.
Rightsizing in these data centers, and the efficiency gains it unlocks, will be essential in the coming years as enterprises seek to embrace the gains of generative AI and frontier models. While businesses will want to stay relevant and match the technological level of competitors, they will also need to meet sustainability commitments and keep a lid on energy prices.
It’s already becoming incredibly expensive to build generative AI platforms, and a recent report from Pure Storage indicated that businesses still lack the right infrastructure to cope with AI energy demands.
This will be especially true of organizations with a poor grasp of structured versus unstructured data management, perhaps with a sprawling, siloed system that prevents IT managers from achieving proper oversight of the firm's data. Fixing data silos and unifying data across an organization can improve the performance of any generative AI model while reducing the burden on data engineers and the chief data officer (CDO).
The channel can play a key role in the adoption of AI hardware and software. As generative AI is adopted on a wider scale, it will be rare to find businesses with the resources and specific needs that allow them to go it alone on AI integration. Rather, channel partners with massive computing resources or pre-made generative AI models will supply businesses with what they need.
Many generative AI services are already available through public cloud providers, such as AWS' Amazon Bedrock, Microsoft's Azure AI Platform, or Google's Vertex AI. Equally, businesses may seek out private cloud options for AI or rent AI hardware directly from a vendor. Leaders must decide which route to take based on their individual needs; those with particularly sensitive data, such as financial services companies, may choose to rely more on private cloud models, or even pay to run models on-prem, rather than rely on the goodwill of public AI firms in handling their data.
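For a sense of how low the barrier to entry is on the public cloud route, here is a minimal sketch of calling a hosted model through Amazon Bedrock with the boto3 SDK. The model ID and request body shown assume access to Anthropic's Claude v2 has been granted; each model family on Bedrock expects its own body format.

```python
# A minimal sketch: invoking a hosted model via Amazon Bedrock with
# boto3. Assumes AWS credentials are configured and Claude v2 access
# is enabled; other Bedrock model families use different shapes.
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

request_body = json.dumps({
    "prompt": "\n\nHuman: Draft a two-sentence summary of our AI adoption plan.\n\nAssistant:",
    "max_tokens_to_sample": 200,
})

response = client.invoke_model(modelId="anthropic.claude-v2", body=request_body)
print(json.loads(response["body"].read())["completion"])
```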
Ultimately, it will come down to the individual needs of each business. Considering these in advance, and knowing the precise outcomes one aims to achieve through an LLM or other AI model, will help partners bring those outcomes to fruition.
When adopting AI in the coming year, businesses should consider what their needs truly are and apply these to AI integration, rather than seeking solutions without a problem in mind. There are a multitude of paths to adopting AI systems, with generative AI in particular being offered through a number of trusted partners as well as through the open source community.