January 29, 2025

Our View on DeepSeek and the Impact on CX Applications

Tod Famous
Discover how DeepSeek challenges GPT-4o in CX applications and what it means for LLM costs, AI performance, and the future of multi-LLM strategies. Explore Crescendo’s insights on speed, cost, resilience, and the evolving AI landscape.

Let’s take a breath.

The AI industry was shaken this past week by the introduction of DeepSeek, a top-tier AI model trained using just a fraction of the usual resources, then released as open source. At Crescendo, we focus on validating new models for our customers, so we immediately tested DeepSeek with our Crescendo CX Assistant.

Quick take: Very fast. Very impressive. Nearly as good as GPT-4o for our CX Assistant.  Note: we’ve been optimizing our conversational AI use case on OpenAI models for more than a year, so it’s really shocking to us that DeepSeek is almost as good out of the box.

Many of the publicized comparisons focus on style and tone differences between ChatGPT and DeepSeek. For example, you see comparisons where both are given the same prompt. While interesting, this doesn’t compare performance at the highest level, where the prompt is adjusted through iterations to maximize the quality of the result.

There are strengths in DeepSeek’s default consumer-experience behavior that are less relevant to our CX Assistant use case. For example, DeepSeek is more likely to explain its ‘chain of thought’ reasoning, which is super helpful if you want to understand how it solved a complex problem so you can check its work or learn from the AI. It’s less helpful if you are explaining an ecommerce company’s return policy; indeed, there might even be some brand damage in explaining too much of the logic behind the decision to offer a credit.

Are we moving to DeepSeek?

While it appears to be a huge step forward in price/performance, several considerations suggest a cautious approach. The tuning of the model by a Chinese company will give most western companies pause. Without passing judgment, there are safety-guardrail choices in DeepSeek that don’t align with western corporate values, so we know our customers would object to the use of this model as it is tuned today. In our guard-railed use case these differences probably wouldn’t surface, but the cost/benefit isn’t there to take that risk.

If the principal advantage of DeepSeek for an application developer is speed and cost, that advantage will likely be mitigated by the reaction of the other LLM service providers in the next few months. We may well get the cost benefit of DeepSeek in a few months without deploying it. We’ll see.

Will the cost of LLM services continue to decline?

Yes, it seems so. This isn’t a new trajectory; in fact, it’s been our bet since we started Crescendo. As an AI application developer, the cost of LLM services is a limiting factor in which use cases we pursue, so this is a trend we’ve watched very carefully as we expand into more use cases. With the release of DeepSeek, it seems very likely this pace of declining costs will continue.

What is the impact of declining costs on CX?

Our initial focus was on the most difficult use cases with “the best” models, and on use cases where we displace labor. For example, when weighing a feature to improve customer satisfaction (CSAT) against a feature to reduce labor, we would always pursue the labor savings. Generative AI software has a much higher runtime cost than traditional SaaS or even legacy AI, so we’ve stayed very focused on use cases with clear and simple ROI.

Note: We actually have seen deployments where we can improve CSAT but this is a nice side effect of efficient and accurate customer service.

With declining LLM costs we can now apply our Crescendo technology platform to a broader range of CX use cases. In short, we’re exploring ways to use AI wherever it can help—with less concern about the AI cost for the feature since it’s going to drop another 90% next year.

Will there be a single LLM winner?

This market is looking less and less like a winner-take-all situation. We build our platform to use multiple LLM services. In early 2024, after some embarrassing fumbles by Google, some analysts felt OpenAI was going to run away with the market. Despite that possibility, we stayed on a multi-LLM path for our Crescendo technology platform. In fact, we expanded the number of models per use case for some features. Our CX Assistant now integrates with four different LLMs:

1. Model 1 - Conversational AI

2. Model 2 - Intent Classification

3. Model 3 - CSAT Scoring

4. Model 4 - Automated Quality Assurance

To restate: a single customer service inquiry with our technology will touch four different LLM services.

We select each model based on the specific use case since every LLM shows slightly different strengths depending on the task.
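This per-task routing can be sketched as a simple lookup table; the task keys and model names below are illustrative placeholders, not Crescendo’s actual configuration:

```python
# Minimal sketch of per-task model routing. Each CX task is served by
# whichever LLM performs best for it; names here are hypothetical.
TASK_MODEL_MAP = {
    "conversation": "model-a",  # conversational AI
    "intent": "model-b",        # intent classification
    "csat": "model-c",          # CSAT scoring
    "qa": "model-d",            # automated quality assurance
}

def select_model(task: str) -> str:
    """Return the LLM configured for a given CX task."""
    if task not in TASK_MODEL_MAP:
        raise ValueError(f"no model configured for task: {task}")
    return TASK_MODEL_MAP[task]
```

Keeping the mapping in configuration rather than code is what makes it cheap to swap a model behind one task without touching the others.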

What are some other benefits of multiple LLMs?

Speed

We recently started using two separate LLM queries for a single request to improve speed. In our CX Voice Assistant we use a very fast (less sophisticated) model to predict when a customer has completed their question. This prediction allows us to send the question to the primary model with less delay. Latency is still a limiting factor in voice assistant customer satisfaction, so every 100 milliseconds we can save has an important impact on perceived quality.
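The idea above can be sketched as follows. Both `fast_endpoint_check` (here a naive punctuation heuristic standing in for the fast model) and `primary_answer` (an echo function standing in for the slower, more capable model) are hypothetical stand-ins:

```python
def fast_endpoint_check(partial_transcript: str) -> bool:
    # Stand-in for a small, fast model that predicts whether the
    # caller has finished speaking.
    return partial_transcript.rstrip().endswith(("?", ".", "!"))

def primary_answer(question: str) -> str:
    # Stand-in for the primary conversational model.
    return f"answer to: {question}"

def handle_stream(chunks):
    """Dispatch to the primary model as soon as the fast model predicts
    the question is complete, instead of waiting for a silence timeout."""
    transcript = ""
    for chunk in chunks:
        transcript += chunk
        if fast_endpoint_check(transcript):
            return primary_answer(transcript)
    # Fallback: the stream ended without a predicted endpoint.
    return primary_answer(transcript)
```

The saving comes from dispatching the primary query during what would otherwise be dead air at the end of the caller’s utterance.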

Diversity

In early 2024 we moved our CSAT feature from OpenAI to Google because the Google model was 1/20th the cost and we found it performed as well as OpenAI. This was a very smooth transition across our customer base, and to our surprise we noticed situations where the Google model made meaningfully useful observations about the OpenAI model’s performance. This aligns with other AI research published in 2024 showing that several models working collaboratively can produce a better result than a single model.

Resilience

Multiple LLMs give us failover resilience. In December, OpenAI experienced a seven-hour outage, affecting every application vendor dependent on GPT-4. Our Crescendo technology platform lets us set or change the LLM service behind every feature. Thanks to that design choice, we simply switched our conversational AI to Llama and our CX Assistant deployments continued running. Meanwhile, our competitors (and their customers), who relied solely on GPT-4, were offline for the entire evening.
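A minimal sketch of this kind of failover, assuming each provider is wrapped in a callable; the provider names and callables are illustrative, not Crescendo’s actual integration:

```python
def call_with_failover(prompt, providers):
    """Try each configured LLM provider in order; on failure, fall back
    to the next (e.g. a primary provider during an outage, then Llama).

    `providers` is a list of (name, callable) pairs, tried in order.
    """
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # In production you would log the failure; here we just
            # record it and try the next provider.
            last_error = exc
    raise RuntimeError("all LLM providers failed") from last_error
```

Because every feature goes through one dispatch point like this, switching providers during an outage is a configuration change, not a code change.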

Quality

We’ve also been doing research on two-stage model usage to improve accuracy in our most complex agentic AI use cases. As context windows expanded and token prices dropped in 2024, we started building larger and larger LLM prompts. It turns out that if you send too much context to the current LLM services, they perform worse: the noise of unnecessary context can confuse the AI. With a two-stage approach, we use one LLM query to write the query for the primary LLM. The first AI query isn’t determining the answer to the customer’s question; rather, it determines which tools and which context the primary AI needs to answer the question. This architecture is unlocked by having fast and cheap LLM services.
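The two-stage shape can be sketched like this. Here `planner` (a keyword heuristic standing in for the cheap, fast model) and `primary` (an echo function standing in for the main model) are hypothetical stand-ins:

```python
def planner(question: str, available_context: dict) -> list:
    # Stand-in for the cheap first-stage model: pick only the context
    # sections relevant to the question, instead of sending everything.
    return [
        key for key in available_context
        if any(word in question.lower() for word in key.split("_"))
    ]

def primary(question: str, context: dict) -> str:
    # Stand-in for the primary model, given a trimmed context.
    return f"answer using {sorted(context)}"

def two_stage_answer(question: str, available_context: dict) -> str:
    """Stage 1 selects relevant context; stage 2 answers with only that
    context, avoiding the noise of an oversized prompt."""
    keys = planner(question, available_context)
    trimmed = {k: available_context[k] for k in keys}
    return primary(question, trimmed)
```

The first stage pays a small extra query to shrink the prompt; the second stage answers with less noise, which is the accuracy win the passage describes.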

Conclusion

The release of DeepSeek reinforces our viewpoint that the marketplace for LLM services is going to remain extremely competitive.  Application developers, like Crescendo, with the scale, expertise, and platform architecture to take advantage of multiple LLM services will reap the benefits of this competition.

We're building the world's best customer experience platform.

Crescendo has joined forces with PartnerHero to launch an advanced suite of customer experience services, powered by Augmented AI.