June 25, 2025
How Modern Archiving Helps Enterprises Contain The Data Explosion
Carl D'Halluin, Chief Technology Officer at Datadobi.
As data creation and retention rates continue to increase, the time frame for active data use has shrunk, with most files accessed less frequently and for shorter periods before becoming dormant. This growing imbalance means managing ever-larger volumes of inactive data on high-cost primary storage, driving up infrastructure spend without delivering equivalent business value.
It's a persistent challenge, with data expanding to fill available storage almost as quickly as capacity is added. It drives near-relentless system expansion in a largely futile effort to keep up. In response, archiving has become a more critical part of enterprise strategies, particularly in the struggle to retain vast volumes of unstructured data. Think of it as a technology coping strategy where leaders prioritize short-term relief without addressing the underlying problem.
For many, archiving has also become synonymous with tiering solutions. This means keeping frequently accessed data on high-performance systems and relocating colder data to lower-cost tiers. While this can appear similar to archiving, in that data is moved off primary storage to another system, a key difference is that tiering leaves behind an artifact, such as a stub or link, which is later used to access the relocated content.
Tiering has its challenges. For instance, tiering solutions insert themselves into the data path and control how archived files are accessed. As a result, cold data can only be recalled by going through the tiering system itself. If that system goes offline or is removed, access to the data is lost, regardless of whether the files still exist. This can lock organizations into fragile tiering ecosystems that don't scale well, resulting in a messy combination of hidden costs, operational drag and long-term technical debt.
The Move To Modern Archiving
Modern archiving technologies can offer a more strategic route. The emphasis on "modern" marks an important distinction, given the widespread confusion between the act of archiving and the archive platform itself. In reality, archiving is the process of identifying and relocating specific data, whereas the storage platform is simply the destination. Yet many vendors blur the lines, positioning their storage systems as complete archiving solutions when they are only where the data ends up.
True archiving starts with identifying what should be moved. That could mean targeting files that haven't been accessed or modified for a certain period or isolating data that an inactive user ID owns, perhaps tied to a departed employee. Finding that data, buried among billions of files across multiple systems, is typically far from straightforward. Without the right tools, it's a time-consuming and impractical task.
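As a rough sketch of that identification step, the scan below walks a file tree and flags anything neither read nor modified for a configurable period. This is a minimal illustration of the concept, not how any particular product works, and the function name and threshold are this example's own:

```python
import time
from pathlib import Path

def find_cold_files(root: str, days_inactive: int = 1095):
    """Yield (path, size) for files neither accessed nor modified in the window.

    Caveat: atime is unreliable on filesystems mounted with noatime/relatime,
    so real tools typically lean on mtime plus system-level access auditing.
    """
    cutoff = time.time() - days_inactive * 86400
    for path in Path(root).rglob("*"):
        if path.is_file():
            st = path.stat()
            # Cold only if it has been neither read nor written recently.
            if st.st_atime < cutoff and st.st_mtime < cutoff:
                yield path, st.st_size
```

A production-grade scan would also filter by owner (for example, the user IDs of departed employees) and must cope with billions of entries across heterogeneous systems, which is precisely where purpose-built metadata engines earn their keep.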
Modern archiving relocates inactive files to lower-cost storage platforms, reducing primary storage capacity requirements and delaying expensive system expansions. This approach reduces ongoing expenses and frees up high-performance infrastructure for critical workloads.
Our customer research shows that in a typical real-world 10-petabyte enterprise storage environment, primary storage costs around $1,100 per terabyte annually, including hardware, software, power and staffing. If 6.5 petabytes of that data were inactive and low-touch (a proportion that is not uncommon in large enterprises), the organization could be spending roughly $7.15 million per year to retain it.
Archiving that same 6.5 petabytes to a combination of active and deep archive tiers, such as those available through AWS, lowers the ongoing cost to just under $15,000 per month, an overall reduction of 97%. That level of saving should be within the grasp of IT teams everywhere.
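The arithmetic behind those figures is easy to check. The prices below are the article's own estimates, not quoted cloud list rates:

```python
# Primary storage cost for the inactive data (article's estimate: $1,100/TB/year).
inactive_tb = 6_500            # 6.5 petabytes expressed in terabytes
primary_cost_per_tb = 1_100    # USD per terabyte per year
primary_annual = inactive_tb * primary_cost_per_tb   # $7,150,000 per year

# Archive-tier cost after relocation (article's estimate: ~$15,000/month).
archive_annual = 15_000 * 12                         # $180,000 per year

reduction = 1 - archive_annual / primary_annual
print(f"Primary: ${primary_annual:,}/yr, archive: ${archive_annual:,}/yr, "
      f"saving: {reduction:.0%}")
```

Running it confirms the headline numbers: $7.15 million per year on primary storage versus $180,000 per year archived, a saving of about 97%.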
Barriers To Adoption
Despite the clear advantages, the widespread adoption of modern archiving solutions has been slow.
One key reason is the complexity of today's IT environments. Organizations deploy solutions from multiple vendors across hybrid cloud models, making interoperability a challenge. Historically, companies have lacked the technology to efficiently archive data at a massive scale across heterogeneous environments. As a result, many have relied on backup and recovery systems as a makeshift archive even though these solutions aren't designed for long-term data retention.
Another significant barrier is vendor lock-in. Many traditional archiving solutions store data in proprietary formats either to optimize storage efficiency or for other purposes, creating long-term dependencies on specific vendors. This makes it difficult for organizations to retrieve archived data outside of the original system. A truly modern archiving solution must maintain data in its original format, ensuring accessibility regardless of changes to platform or vendor.
Organizational buy-in also plays a crucial role in adoption. Until recently, many companies lacked formal retention policies, leaving IT teams without clear directives for an archiving strategy. Governance, risk and compliance teams are beginning to define organizationwide policies, but in their absence, data accumulates unchecked. This leads to increased costs and security risks associated with aging data.
The tech industry must support these efforts by offering solutions that integrate seamlessly with existing infrastructures, ensuring compliance while simplifying data management. By addressing these concerns, modern archiving solutions can transition from niche tools to essential components of digital infrastructure.
Orchestrate To Accumulate
Operationally, archiving also helps avoid the performance degradation that comes as storage systems inevitably fill up, yielding faster backups and lighter replication loads. From a sustainability perspective, moving cold data off energy-intensive storage platforms can significantly reduce carbon emissions and shrink the overall technology footprint.
Success also depends on metadata insight. Files untouched for three-plus years often account for over 60% of stored data, representing a major opportunity at scale. Conceptually, this may seem trivial, but access to this kind of information is essential, particularly when an organization is using multiple storage systems and cloud services (probably from different vendors), storing billions or even tens of billions of files.
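Capacity-weighted age analysis is straightforward once that metadata is in hand. The sketch below, using hypothetical input from a metadata scan, computes the share of total capacity held by files past an archiving threshold:

```python
def capacity_share_untouched(files, min_age_days=1095):
    """Given (age_in_days, size_in_bytes) pairs from a metadata scan,
    return the fraction of total capacity untouched for min_age_days or more.
    1095 days is roughly the three-year mark mentioned above."""
    total = sum(size for _, size in files)
    cold = sum(size for age, size in files if age >= min_age_days)
    return cold / total if total else 0.0
```

The hard part in practice is not the division but producing those pairs at all, consistently, across billions of files spread over multiple vendors' systems and cloud services.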
Get this approach right, and the result is a far more manageable environment with the ability to continue to identify and relocate data as it ages and crosses archiving policy thresholds. When delivered as part of a broader data orchestration strategy, where data is automatically managed and coordinated across diverse storage systems, modern archiving enables ongoing accessibility throughout the data life cycle without the cost and complexity inherent in legacy approaches.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.