KBA-01471: Moving Cataloged Files to Archive

Question:

My Spitfire Catalog has grown quite large. How do I avoid backing it all up all the time?

Answer:

Both Spitfire and SQL have tools to deal with terabytes of data and the reality that most of it becomes effectively read-only.  For example, some editions of SQL Server support “table partitioning”—but this feature actually complicates backup and recovery scenarios. Spitfire‘s solution is “Archive Databases.”

Archive Databases

Spitfire supports a series of Archive Databases: you add each archive database as required as time goes on, perhaps one per year, but this is actually arbitrary and totally dependent upon your needs.  The “newest” archive database is the “current” archive database, and Spitfire can be configured to move versions of files in the catalog that have aged beyond a certain threshold into the current archive.

The reason you‘ll want more than one archive database is that all archives older than the current archive are marked “‘read-only” (a SQL feature), and these read-only databases need not be backed up daily and could even be stored on cheaper, slower storage devices.

Spitfire‘s main catalog keeps the indexing information, so accessing something that has been archived may take a second or two longer but is otherwise totally transparent to the end user.  The only loss of functionality is that archived versions cannot be deleted.

Version Pruning

In addition to archiving, Spitfire can prune versions of files generated from templates.  The pruning process finds all versions created within a certain number of hours (perhaps 3).  If any of these multiple versions are not routed to anyone and there are other versions before and after, then the file version is considered redundant and prunable.  Pruned versions are deleted and cannot be recovered.  A gap will appear in the file version sequence.  A file can only be pruned automatically if it is a template-generated file and other versions exist within the time window.

Archive Processing

For the job to run, the SQL Server Agent must be running.  The routine to move files into the current archive runs once a day and

  • selects the oldest versions based on configurable months of age and SQL Transaction log overhead limit, and moves them.
  • selects the oldest audit trail data and moves it. (See KBA-01587: Retention of Audit Trail Data.)
  • checks the health of statistics on the xsfFileVersion table and updates statics if necessary.  Statistics on this table are also set to not automatically recalculate to avoid long random delays.

Enabling Archiving

  1. Create an archive database using the SQL Archive tab in ICTool.
    1. Specify the database name, something like sfArchive2010.
    2. Click the CREATE button. If a prior archive databases exist, it is automatically made read-only.
  2. Review the parameters
    1. Prior Version Months—the number of months ago a file version that has been replaced by a new file version must have been cataloged before being archived.
    2. Final Version Months—the number of months ago the final (or only) version of a file must have been cataloged before being archived.
    3. Max Per Day—the number of MB of data that can be moved into archive daily.  Expect LDF file grown of 3x this value.  If the oldest eligible file exceeds this setting, only 1 file will be archived that day.
  3. On the SQL Job Tab in ICTool, create the job to move eligible file to archive daily.
  4. On the Finish Tab in ICTool,
    1. Verify that Apply sfPMS Package is checked.
    2. Notify users (or else they might be annoyed).
    3. Click the Publish Configuration button.

Enabling Pruning

In ICTool, on the SQL Archiving tab, review the parameters:

  1. Prune Version Months—the number of months before file versions can be pruned.  1-5 months can be specified.  0 disables the pruning feature.   This number must be less than the Prior Version Months value specified above.
  2. …Within hours—the number of hours within which multiple versions are considered for pruning.  A number between 1 and 23 can be specified.  Larger number will potentially allow more pruning.

For example, if you specify 4 months and 8 hours, then 4 months after a series of versions are created, the interim versions within the 8-hour time span will be discarded automatically.  Unlike strict archiving, pruning does destroy data,  but only carefully selected versions of files that are likely redundant in the big picture.

Long Term Backups and SQL Upgrades

In ICTool, on the SQL Archiving tab, all but the current archive target should be marked as is read only. This means you can make a final SQL backup and move the backup to long term storage for disaster recovery.

If you upgrade the SQL server, you will need to

  1. Make the Archive DB writable, e.g., ALTER DATABASE sfArchive2015 SET READ_WRITE.
  2. Change its compatibility level, e.g., ALTER DATABASE database_name SET COMPATIBILITY_LEVEL = xxx
  3. Rebuild its indexes (exec sp_updatestats may suffice).
  4. Make a new long-term backup and store it appropriately.
  5. Mark the database as read only, e.g., ALTER DATABASE sfArchive2015 SET READ_ONLY

Additional Comments:

Keep in mind that archiving file versions does not reduce accessibility of the data.If your test site uses a copy of your live databases, you can share the archive databases.  Just be sure *not* to enable move to archive jobs in the test site!

You can use these commands to set the archive range when creating prior year archives to ‘catch up‘:

exec dbo.p_WriteConfig @company=‘[ALL]‘, @key=‘ArchiveFinalAfterMonths‘,@NewValue=‘13‘
exec dbo.p_WriteConfig @company=‘[ALL]‘, @key=‘ArchiveVersionAfterMonths‘,@NewValue=‘12‘


KBA-01471; Last updated: September 17, 2018 at 13:31 pm;
Keywords:  backup files; file backup; file archive