XOps For Enterprise AI: The Convergence Of MLOps, LLMOps And AgentOps

Forbes, 11-04-2025

Chiranjiv Roy leads global products, solutions and consulting across industries at C5i.ai.
Today's enterprise AI landscape faces exponential growth in model complexity and data volumes, posing significant challenges. As organizations rapidly scale their AI ambitions, they inevitably encounter bottlenecks related to operational efficiency, compliance and scalability.
To address these challenges comprehensively, businesses require an integrated approach—XOps, which combines MLOps, LLMOps and AgentOps. This unified framework isn't merely about execution; it's about strategically leveraging AI operations to deliver sustainable business value.
AI operations today require more than discrete practices. MLOps helps streamline traditional machine learning workflows, LLMOps enables the efficient deployment of sophisticated language models and AgentOps coordinates complex autonomous agent systems. However, implementing these components in isolation misses significant opportunities for holistic efficiency and strategic value.
XOps solves this by bringing these distinct yet complementary operational disciplines under one strategic umbrella, ensuring smoother, more scalable adoption of AI capabilities.
For example, at a global consumer electronics company we worked with, supply chain processes were manual and resource-heavy, which slowed insights and innovation. A no-code, intuitive ML platform with automated data pipelines and AutoML capabilities let business and data analysts design and deploy models independently, without extensive IT involvement.
The results were transformative:
• Dramatically faster project cycles
• Significantly reduced dependence on engineering teams
• Enhanced strategic agility, empowering quicker, informed supply chain decisions
As another example, in healthcare, inconsistent and siloed workflows complicate and delay AI adoption due to compliance risks. Establishing a standardized, end-to-end MLOps pipeline ensures consistent, compliant model deployment across diverse teams.
In our experience, automating data preprocessing, model validation and real-time monitoring can significantly shorten deployment timelines, improve regulatory compliance and strengthen collaboration between technical and business stakeholders.
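One way to picture the automated model validation step is a promotion gate that compares a candidate model's metrics against agreed minimums and records when the check ran. The sketch below is illustrative only: the names `GateResult` and `validation_gate`, the metric names and the thresholds are assumptions, not part of any pipeline described in this article.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class GateResult:
    """Outcome of one validation check, kept as an audit record."""
    approved: bool
    failures: list[str] = field(default_factory=list)
    checked_at: str = ""


def validation_gate(metrics: dict[str, float],
                    thresholds: dict[str, float]) -> GateResult:
    """Approve a model only if every required metric meets its minimum.

    `thresholds` maps a metric name to the lowest acceptable value
    (hypothetical values, chosen by the team that owns the model).
    """
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            # A missing metric is treated as a failure, never as a pass.
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    return GateResult(
        approved=not failures,
        failures=failures,
        checked_at=datetime.now(timezone.utc).isoformat(),
    )
```

Calling `validation_gate({"auc": 0.91, "recall": 0.70}, {"auc": 0.85, "recall": 0.80})` would reject the model and list the recall shortfall, giving technical and business stakeholders the same record to review.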
A digital analytics agency we worked with faced slow insights generation and scalability issues from fragmented NLP processes. Integrating CI/CD pipelines for NLP models on cloud infrastructure accelerated insights and improved model accuracy. Automated data preprocessing and robust governance mechanisms ensured reliable and trustworthy analytics.
Business outcomes included:
• Analysis time reduced from weeks to near real time
• Increased accuracy and reliability of marketing insights
• Improved scalability and responsiveness to changing market demands
Implementing XOps successfully at an enterprise level requires more than technology and talent; it demands a structured, strategic approach that aligns clearly with business objectives and operational realities. Keep in mind that engineering and machine learning/data science teams need to align so that each learns the other's ways of working.
Begin by combining deep data science expertise, domain-specific experience and MLOps proficiency. Assemble cross-functional teams, including data scientists, domain experts and solution architects who champion end-to-end ML lifecycle management and are familiar with industry-specific use cases, particularly in areas like consumer packaged goods (CPG).
Move from basic DevOps to full-scale automated MLOps by setting up structured automation stages:
• Automated data gathering and version control
• Automated training with robust monitoring and model evaluation
• CI/CD-driven automated deployment with infrastructure as code (IaC)
• Automated retraining to sustain model performance and interpretability
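The automation stages above can be sketched as a chain of small functions that pass a shared context along. This is a minimal sketch under stated assumptions: the stage names, the `run_pipeline` helper, the trivial mean predictor and the `max_mae` threshold are all hypothetical stand-ins for real data gathering, training, evaluation and deployment steps.

```python
import hashlib
import json
from typing import Callable

Stage = Callable[[dict], dict]


def data_version(rows: list) -> str:
    """Content hash used as a reproducible dataset version id."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def gather(ctx: dict) -> dict:
    # Stage 1: record which exact data the run used.
    ctx["data_version"] = data_version(ctx["rows"])
    return ctx


def train(ctx: dict) -> dict:
    # Stage 2: a stand-in "model" that predicts the mean target.
    ys = [row["y"] for row in ctx["rows"]]
    ctx["model"] = {"mean": sum(ys) / len(ys),
                    "data_version": ctx["data_version"]}
    return ctx


def evaluate(ctx: dict) -> dict:
    # Stage 2 (continued): mean absolute error of the stand-in model.
    mean = ctx["model"]["mean"]
    errors = [abs(row["y"] - mean) for row in ctx["rows"]]
    ctx["mae"] = sum(errors) / len(errors)
    return ctx


def deploy(ctx: dict) -> dict:
    # Stage 3: deploy only when evaluation meets the agreed threshold.
    ctx["deployed"] = ctx["mae"] <= ctx["max_mae"]
    return ctx


STAGES = [("gather", gather), ("train", train),
          ("evaluate", evaluate), ("deploy", deploy)]


def run_pipeline(stages: list, context: dict) -> dict:
    """Run stages in order and log each completed stage name."""
    for name, stage in stages:
        context = stage(context)
        context.setdefault("history", []).append(name)
    return context
```

Automated retraining then amounts to calling `run_pipeline(STAGES, {...})` again whenever new data arrives or monitoring flags a problem; because the dataset version is a content hash, each retrained model stays traceable to the data that produced it.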
Address the critical gaps between model development and operational deployment by ensuring continuous governance, standardized metrics and integrated training processes. Focus not just on sophisticated models but on building reliable, repeatable processes that enable smooth transitions from development to production.
Establish comprehensive ML operational excellence by implementing the following:
• Version control for traceability
• CI/CD pipelines for streamlined deployment
• Infrastructure-as-code for reproducible infrastructure
• Model monitoring to proactively address degradation
• Automated model deployment to minimize manual intervention
• Data operations to ensure data traceability and integrity
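As one illustration of model monitoring, a common way to detect input drift is the Population Stability Index (PSI), which compares the distribution of a feature in production against a training-time baseline. The sketch below is an assumption about how such a check might look, not the monitoring method used in the projects described here; the function name `psi`, the bin count and any alert threshold are illustrative.

```python
import math


def psi(expected: list, actual: list, bins: int = 10) -> float:
    """Population Stability Index between a baseline and a new sample.

    Bin edges are equal-width over the baseline's range; values outside
    that range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def shares(values: list) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        eps = 1e-6  # floor so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A monitoring job could compute `psi(training_values, recent_values)` on a schedule and raise an alert or trigger retraining when the score crosses a threshold the team has agreed on; a frequently quoted rule of thumb treats values above roughly 0.2 as a significant shift, but the right cutoff depends on the use case.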
To execute a planned production approach effectively, it is essential to begin with a thorough understanding of the data and models involved. Next, refactor the code to ensure scalability in a production environment. Develop automated pipelines to standardize workflows and maintain consistency. Finally, implement deployment strategies that incorporate seamless monitoring, allowing models to adapt dynamically to real-world conditions.
The transition from concept to engineering should be smooth, because this process involves significant change management.
To effectively scale an AI application from concept to production, we follow a structured, iterative process encompassing multiple clearly defined stages:
• Define: Begin by collaborating closely with business consultants and SMEs to articulate the business questions, objectives and requirements.
• Design: Proceed with comprehensive data acquisition and preparation. This stage ensures the data quality is robust and suitable for further modeling.
• Describe: Implement feature engineering, model training and experimentation. Evaluate and compare models meticulously to select the optimal approach for deployment.
• Deploy: Integrate models into user-centric applications via intuitive UI/UX designs, dashboards or web and mobile applications. The deployment also includes ensuring data schema alignment and leveraging granular-level model optimization using approaches like GraphRAG.
• Drive: Continuously monitor and track model performance in production. Incorporate consumer feedback for ongoing model refinement and improvement, fostering a responsive and adaptive model lifecycle that aligns with real-world performance and consumer expectations.
Looking ahead, success in 2025 and beyond hinges on effectively integrating predictive, generative and autonomous agent capabilities. The XOps approach, rooted in structured operational excellence and proactive governance, positions businesses for sustained leadership.
Organizations must move beyond isolated AI initiatives toward scalable, governed ecosystems that continuously evolve, shaping their industries and setting new standards for operational excellence and innovation.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.


