Monday, May 19, 2014

Data purging and log rotation are now "dead"

Early in my career as a developer on AT&T SVR4 subsystems, I worked on several new features and improvements to the core OS and to the ecosystem solutions around it. One of the most widely used ecosystem solutions at that point covered Backup, Archival and Recovery (a.k.a. BAR) use cases. Fast forward 15 years: the value of OS uptime has depreciated and the cost of storage has fallen from $1 per KB to 1c per GB, changing the use cases completely.

In the past decade of business application development, one of the core operational elements has been handling data purging and log rotation. Just think about how many times you have seen a web application where only the past 3 months of data is available to view. What most developers don’t realize, and devops don’t pay attention to, is that programs written by younger developers don’t consider this operational requirement at all, because to them data storage is infinite. For the business teams, too, it is all about the data: the more the better, the bigger the merrier (a.k.a. big data!). There is a generation gap here that you will see if you think about just this one aspect.

In the past decade of IT operations, the core monitoring element was storage, and in particular whether the volumes were becoming full. Typical managed storage services cost 28c per GB even a year ago, and one must believe that this was not a one-off case: every enterprise was paying this, whether its IT was done in-house or outsourced. What most CIOs don’t realize is that cloud infrastructures and cloud applications are built to operate on a never-ending pool of storage volumes. In fact, it will be stupid in the future to plan storage capacity monitoring without considering the ecosystem involved. Unless either the application or the business mindset is legacy, the purging of data will be dead, and so will log rotation as a concern for application designers.

What makes sense is to build applications so that data is always fetched in pages, using good data architectures, and to rely on operational tools that take care of auto-sizing and tiering. Whether storage costs 1c per GB or 1c per PB, tiering will always play a role in keeping business applications competitive. In summary: forget data purging and plan for data tiering.
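
To make the two ideas above concrete, here is a minimal Python sketch of fetching data in pages and assigning records to storage tiers by age instead of purging them. Everything in it is an assumption made for illustration: the tier names, the 90-day and 365-day boundaries, and the helper names `choose_tier` and `fetch_page` are hypothetical and not taken from any particular product or tool.

```python
from datetime import date, timedelta

# Hypothetical tier boundaries (assumptions for illustration only).
HOT_DAYS = 90      # recent data, kept on fast storage
WARM_DAYS = 365    # older data, kept on cheaper storage


def choose_tier(record_date, today):
    """Return the storage tier for a record based on its age in days."""
    age_days = (today - record_date).days
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    return "cold"


def fetch_page(records, page, page_size):
    """Return one page of records (pages are numbered from 1)."""
    start = (page - 1) * page_size
    return records[start:start + page_size]


today = date(2014, 5, 19)
# Sample records spanning roughly two years, newest first.
records = [today - timedelta(days=30 * i) for i in range(24)]

# The application only ever asks for one page at a time.
first_page = fetch_page(records, page=1, page_size=5)

# Nothing is deleted; old records simply move to a cheaper tier.
tiers = [choose_tier(d, today) for d in records]
```

In a real system the page would come from a database query and the tier move would be done by the storage platform, but the shape is the same: the application reads a bounded page, and old data is relocated rather than purged.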