Icon Fame Journal.

How to archive your data

By Matthew Martinez

Last Updated: June 4, 2020

wikiHow is a “wiki,” similar to Wikipedia, which means that many of our articles are co-written by multiple authors. To create this article, 13 people, some anonymous, worked to edit and improve it over time.

This article has been viewed 91,303 times.

Archiving is the storage or preservation of information. Organisations are converting paper documents to digital every day, to increase the life span of documents. That, coupled with the immense amount of data being generated by computers today, means that archiving documents is only going to grow in importance. If you know how to archive documents, you can free up space so that your computer handles current documents more efficiently.

Many businesses and individuals hold onto a lot of data, especially after years of business. In some industries, it’s vital that this information is organized and stored for later retrieval. The process of storing and organizing this data is called data archiving.

Data Archiving Defined

A data archive is simply a collection of data or information that is stored in an organized manner. This information can be quickly accessed for later retrieval or compliance requests. In the tech world, an archive is different from a backup. Backups are more focused on speed: they hold information that can be retrieved quickly to be edited or updated. Archives, on the other hand, are filled with data that is no longer being updated or revised. These are static documents kept on hand for documentation and record keeping.

Benefits – Why Archive Data?

Data archiving is primarily used to eliminate the need for large backups. While it’s wise to back up data you use regularly, data that hasn’t been used in a long time can be moved into the archive and stored. Additionally, archive storage is typically less expensive than primary storage because it’s focused on large capacity, not speed.
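One way to find candidates for the archive is to look for files that have not been modified in a long time. The sketch below is only an illustration: the function name `find_stale_files` and the use of last-modified time as the test for "unused" are assumptions, and you may prefer a different signal such as last-access time.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return files under root whose last-modified time is older than max_age_days."""
    # `now` can be passed in explicitly so the check is repeatable.
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)
```

Calling `find_stale_files("Documents", 365)` would list files under a hypothetical `Documents` folder that were last changed more than a year ago. Nothing is moved or deleted, so you can review the list before archiving.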

The main benefit of archiving data is the ability to remain organized. Whether you’re a self-employed professional, a major executive for a large corporation, or simply trying to keep track of your family history, archiving your data will ensure that you keep photos, documents and other media safe for the long haul while still being able to easily find and access them for later use.

Online vs. Offline Data Archiving

When starting your own data archiving system, you’ll need to decide what kind of storage you’d like to use. There are several different options for both online and offline data storage. Let’s review a few of them.

Mobile Media

Many people choose to store their data on mobile media devices such as USB drives, external hard drives, tapes, CDs, DVDs and the like. There are benefits and drawbacks to each, but the main thing you need to realize is that data archiving via mobile media gives you the ability to take those documents wherever you go. For instance, if you’re a photographer and you’re meeting with a client for the first time, you may want to take your external hard drive with you. If you’ve consistently archived your photos and stored them on that drive, you’ll be able to quickly access and show off your work to potential clients. They’ll also be happy to see that you are so organized, giving you more credibility and authority in your field.

Cloud Storage

Another popular choice for data archiving is cloud storage. Cloud storage is the act of storing your data within an interface that is hosted and managed by another company. One simple example of cloud storage that many people use on a daily basis is Google Drive. This tool is synced with your Gmail account, so if you have Gmail, you have access to Google Drive as well. It enables users to efficiently store, add, share and edit documents, spreadsheets, photos, PDFs and more, and for the everyday person, this is enough storage. Google Drive also has apps for Android and iOS, so files can also be managed from mobile devices. Professionals may want to look into heftier cloud storage providers such as Carbonite. Carbonite is a cloud storage provider that offers unlimited cloud storage and a variety of different plans for both businesses and individuals. Services like this often have a monthly or yearly fee, but if you need secure cloud storage, the protection for your data is worth the fee.

Getting Organized

Once you’ve determined what kind of data archiving storage system you prefer, it’s time to get organized and come up with a file naming system. It’s one thing to dump all your data into a folder on Dropbox or your external hard drive, and it’s another thing entirely to clearly depict and separate that data into logical chunks.

For instance, let’s say you have 2,000 photos on your computer that you want to archive with cloud storage. Before you dump them all onto the cloud, create folders with a clearly defined format. Create a top-level folder labeled “2016 PHOTOS” and, inside that folder, break your data down even further into separate folders such as “FAMILY REUNION” or “GRADUATION.”
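If you have many photos to sort, a small script can build the folder layout described above for you. This is a minimal sketch, not a prescribed method: the function names `archive_destination` and `file_photo` are made up for illustration, and it assumes you already know the year and event for each photo.

```python
import shutil
from pathlib import Path


def archive_destination(archive_root, year, event, filename):
    """Build a path like <root>/2016 PHOTOS/FAMILY REUNION/<filename>."""
    return Path(archive_root) / f"{year} PHOTOS" / event.upper() / filename


def file_photo(source, archive_root, year, event):
    """Copy one photo into its year/event folder, creating folders as needed."""
    source = Path(source)
    dest = archive_destination(archive_root, year, event, source.name)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, dest)  # copy2 also preserves file timestamps
    return dest
```

For example, `archive_destination("Archive", 2016, "family reunion", "img_001.jpg")` points at `Archive/2016 PHOTOS/FAMILY REUNION/img_001.jpg`. The script copies rather than moves, so the originals stay in place until you have checked the result.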

There’s no wrong way to use a file naming system – just use one that works best for you! If you need suggestions or help getting started with your data archiving, browse our blog for more information.

CEO at DefendX, overseeing Secure Data Management- File Discovery, Compliance and Mobility for our customers globally.

Over the past decade, the phrase “cloud-first” has become common language among those involved in business technology. The term has developed a definition all on its own and essentially refers to cloud-based technology as a leader in the industry. It has established itself in conversations regarding IT projects in industries across the board and has most recently become a trend to look out for in 2021.

It’s not surprising that the cloud-first mentality has become a common theme in the computer world because it allows businesses to get their hands on top-rated technology with little effort. Since 2013, several enterprises have established a cloud-first policy, and it has quickly become the choice of employers because it requires no in-house technical management, freeing up time and staffing for other projects. Adding to the additional time gained, the enhanced functionality of being a cloud-first business is a sought-after piece of the efficiency puzzle, especially because in-house options have proven to be more challenging.

By putting a cloud-first strategy in place, you’re immediately reducing your overhead costs. With in-house servers, you are typically required to pay any fees upfront — a pretty penny, compared to the pay-as-you-consume setup of the cloud-first option. When a business adopts the cloud-first strategy, it subscribes to a service provider for software, platforms or infrastructure and has the potential to obtain the best services at low and secured rates.

In addition to the obvious cost-effective perks of going cloud-first, there are no requirements to have on-site hardware or capital expenses. For smaller companies with growth goals, cloud-first offers the option of additional storage and can be initiated on demand, allowing the business owner to only pay for what they need.

More convenient than the in-house option, backup and restore can be instituted from anywhere simply by using a computer, tablet or smartphone. To prevent data losses that can occur during disaster incidents, data in a cloud-first setup can be backed up as often as every 15 minutes, and recovery time for small amounts of data is greatly improved. As an added bonus, you won’t have to stress about taking on the role of managing the complex technical aspects that are often necessary with an in-house server. Your subscription provider will handle the management and support of the system, as well as upkeep and security.

The combination of added value, better uniformity, cost-effectiveness and less waste all point toward cloud-first as the better investment for most businesses; however, it’s crucial to have a solid strategy in place when implementing a cloud-first way of working.

Like all things in business, you’ll need to take into consideration a few key factors that will help you map out your plan. Choosing the right cloud provider for your needs should be a top priority, and you’ll likely need to brainstorm with trusted employees to narrow it down. First, take the time to establish the security risks of your business. From there, you can assess the policies of the available cloud providers. The provider of your choice should educate you on how the cloud migration will affect security and what they can do to protect the business in the process. Whether you are migrating some or all your assets to the cloud, a thorough security approach will save you complications in the long run.

A cloud provider should adapt to your business’s needs and work like a well-oiled machine, contributing to the overall effort and following a meticulously defined plan that includes next-generation firewall platforms, endpoint protection, enterprise segmentation and identity and access management.

By taking the following steps into consideration, you can have a successful migration to the cloud:

1. Analyze your system

Before you hire a cloud provider, it will be helpful to understand what role your team will play in the process and how much time will be involved. Embrace an all-encompassing mentality, and make it your mission to adopt whatever your company will need to align with the cloud process.

2. Set a strong foundation

By having a solid foundation in place, you can save time and money in the long run. This can be achieved by dedicating time to plan out enterprise structures and build your workforce. And if your company grows, simply add on to the foundation.

3. Establish your team

You’ll need to recruit the help of qualified employees who will be dedicated to managing the cloud project. When determining who should be on the team, it’s critical to recruit the help of those who excel in communication because keeping tabs on the various stages of the project will help everything fall into place effortlessly.

4. Prepare data

A productive implementation is heavily reliant on data that is prepared well in advance. Prior to the data conversion process, team members need to have proper tools that are designed to extract, cleanse, transform and upload data to the cloud. This should all happen while your organization maintains solid data integrity.

5. Provide training

Whenever an organization adopts a new technology system, there will undoubtedly be some challenges to overcome. One way to prepare for these roadblocks is to provide training for those who will be affected by the new system. Because employees will adapt to change at different speeds, it’s important to open up communication for feedback and offer regular training to iron out any wrinkles in the system.
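The data preparation described in step 4 can be pictured as a small pipeline. The sketch below is a simplified illustration under stated assumptions: the input is well-formed CSV text, the column names `id` and `email` are hypothetical, and the upload step is left out because it depends entirely on the provider you choose. The `fingerprint` function is one possible way to check data integrity, by comparing a digest computed before and after transfer.

```python
import csv
import hashlib
import io
import json


def extract(csv_text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def cleanse(rows, required="id"):
    """Trim whitespace and drop rows that lack the required field."""
    cleaned = []
    for row in rows:
        trimmed = {key: (value or "").strip() for key, value in row.items()}
        if trimmed.get(required):
            cleaned.append(trimmed)
    return cleaned


def transform(rows):
    """Normalize email addresses to lowercase."""
    return [{**row, "email": row.get("email", "").lower()} for row in rows]


def fingerprint(rows):
    """Return a SHA-256 digest of the rows for a before/after integrity check."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

A typical run chains the stages, for example `rows = transform(cleanse(extract(text)))`, records `fingerprint(rows)` before the upload, and compares it with the same digest computed from the data read back from the cloud.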

Increasingly complex regulatory environments have resulted in countless trials for IT management teams, which is why it’s incredibly important to choose software that streamlines your business’s needs.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

To download an archive of your Microsoft account data:

  1. Log in to account.microsoft.com.
  2. Click “Privacy”.
  3. Click “Download your data”.
  4. Click “Create new archive”.

Microsoft lets you download an archive of all the data you’ve created across its services, such as search, browsing and location history. This allows you to back up and store your Microsoft activities, or use the data to extract information about how you use Microsoft services. It could also assist you as you migrate to another technology provider.

To begin, head to your Microsoft Account page at account.microsoft.com. You may be prompted to log in to your account; enter your password or acknowledge a Microsoft Authenticator confirmation on your phone.

You’ll land on your account homepage, which gives you an overview of everything associated with your Microsoft account. Click “Privacy” in the top navigation menu. You’ll be prompted to enter your password – or use Microsoft Authenticator – again, due to the sensitivity of these settings.

The Microsoft Privacy dashboard will display, which lets you control how Microsoft uses your data. The relevant link here is the “Download your data” tab underneath the main banner.

On the “Download your data” screen, click the “Create new archive” button. You’ll see a popup which lets you choose which data types to include in the archive. Available data sources include your browsing history, search history, location history and all spoken voice commands, as well as usage information for apps, services, films and music delivered through the Microsoft Store.

Tick the checkbox for each data type you’d like to archive and then press the “Create archive” button. The process may take a few minutes to complete, while Microsoft collates all the relevant information. Your download will then begin in your browser.

If you leave the page while your archive is still being created, you’ll be able to return to the “Download your data” screen to access it later. It will display under the “Archives” heading once it’s ready to download. Archives are automatically removed after “a few days” to help protect your privacy.

You should remember that the data archive isn’t intended for direct consumption. The data is delivered as a set of JSON files; JSON is a structured text format built on key/value pairs. Although the files are essentially plain text and can be opened in any text editor, some of the values may appear meaningless or be difficult to interpret without some understanding of what they represent and how they’re stored.
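Because the files are JSON, a short script can give you a first overview before you dig into individual values. The sketch below makes no claims about how Microsoft names or structures its files: it assumes only that the archive has been unpacked into a folder of `.json` files, and the function name `summarize_json_archive` is made up for illustration.

```python
import json
from pathlib import Path


def summarize_json_archive(folder):
    """Map each .json file name to a short description of its top-level structure."""
    summary = {}
    for path in sorted(Path(folder).glob("*.json")):
        # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
        with path.open(encoding="utf-8-sig") as handle:
            data = json.load(handle)
        if isinstance(data, list):
            summary[path.name] = f"list of {len(data)} records"
        elif isinstance(data, dict):
            summary[path.name] = "object with keys: " + ", ".join(sorted(data))
        else:
            summary[path.name] = type(data).__name__
    return summary
```

Printing the result for your unpacked folder shows, per file, whether it holds a list of records or a single object and which keys to look at next.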

The data archive does not include any data you create within Microsoft apps and services. Think of it as an archive of everything directly associated with your Microsoft account, not the files you’ve created with the account. You can usually export data from apps using the apps themselves – for example, to get an archive of your Outlook emails, you can visit and click the blue “Export mailbox” button.

The ability to create an account data archive ensures Microsoft’s services remain GDPR compliant. It allows you to move away from Microsoft’s ecosystem, or scrape your own Microsoft data for any insights you’re looking for. The data could be used to create custom spreadsheets, databases or applications which help visualise your Microsoft activities, giving you a record of how you used Microsoft services which exists long after the apps themselves are gone.

When you sign up for a Google account, you get access to many services, such as Gmail, Contacts, and Google Drive. Upon activation, each of these services will rapidly start accumulating large amounts of data on Google’s servers, especially if you use the services for work. However, you won’t be able to access any of your information if you need to work offline.

Creating an offline copy, or archive, of your Google data is a good way to get around this problem. An archive is also useful if you want to export your bookmarks to a different web browser, or if you accidentally delete something in Gmail or Google Drive. Plus, it can be a good idea to create an archive of your data if you decide to delete your Google account. Once deleted, you won’t be able to retrieve your data from the company’s servers.

Google’s archiving tool is full-featured. Within the tool, you can select the data you want to archive from more than 15 Google services, including Gmail, Google Drive, Calendar, Contacts, Bookmarks, Google Photos, and Google searches.

The archiving tool is also easy to use. Just follow these steps:

  1. Go to Google’s Download your data web page.
  2. Sign in to your account.
  3. Review the list of services from which you can archive data. Initially, all of them will be selected, but you can clear the check boxes of the ones you do not want to include in your archive.
  4. Check to see if the items you selected have a small arrow next to them. If so, click it to reveal options available to you. For instance, clicking the “Contacts” arrow reveals that you can choose to save your contact data in vCard, HTML, or comma-separated values (CSV) format.
  5. Click “Next”.
  6. Choose the file format (ZIP, TGZ, or TBZ) for your archive using the “File type” option. Generally, a ZIP file is the preferred format as it can be opened on almost any computer with no additional software.
  7. Select the maximum size for the archive in the “Archive size (max)” drop-down list. There are a variety of options ranging from 1 to 50 gigabytes. If your archive is larger than the size you specify, Google will split it into multiple files.
  8. Using the “Delivery method” option, indicate how you want the archive to be delivered. You can choose to receive an email that includes a download link or have the archive sent to Google Drive, Dropbox, or Microsoft OneDrive.
  9. Click “Create archive”.

Archiving large amounts of data can take hours or even days to complete, so Google will send you an email when your archive is ready.
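Once the archive has been downloaded, you can unpack it by hand or with a script. The following sketch uses Python's standard `zipfile` and `tarfile` modules and detects the format from the file contents rather than the extension, so it covers ZIP as well as gzip- and bzip2-compressed tar files. The function name `extract_archive` and the example paths are assumptions for illustration only.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a ZIP, TGZ, or TBZ archive into dest_dir and return the extracted files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(dest)
    elif tarfile.is_tarfile(archive_path):
        # tarfile.open detects gzip or bzip2 compression automatically.
        with tarfile.open(archive_path) as archive:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions offer a safer extraction filter.
                archive.extractall(dest, filter="data")
            else:
                archive.extractall(dest)
    else:
        raise ValueError(f"Unrecognized archive format: {archive_path}")
    return sorted(p for p in dest.rglob("*") if p.is_file())
```

If Google split a large archive into several files, you would call the function once per file, for example `extract_archive("takeout-001.zip", "google-archive")` with a hypothetical file name.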

There are many reasons to archive data: to meet compliance regulations, to retain historical data, or simply to back up resources. Archiving preserves data long term so that it can be retrieved when necessary. This article covers how archiving can benefit your business and explains some important factors to consider when creating an archival strategy.

What Is a Data Archive?

A data archive is a place to store data that is important but that doesn’t need to be accessed or modified frequently (if at all). Most businesses use data archives for legacy data or data that they are required to keep in order to meet regulatory standards like HIPAA, PCI-DSS or GDPR.

Archive vs Backup

Archives and backups are not the same, even though they are both used to store data outside of production, and you should use them for different purposes.

Data backups are a safeguard for data that is currently in use, which allows you to restore lost or corrupted data from a single point in time. They store data as it existed in the original file, server, or database, including location information, and are not indexed. To restore data, you need to know which backup has the version you need and where the data is stored in that backup.

Data archives store data that is not currently being used and allow you to retrieve data across a period of time based on search parameters. They store data in an indexed fashion, through the use of metadata, independent of how it may have been originally stored during active use. To retrieve data, you need to know the search parameters, such as origin, author or file contents.

While some businesses try to use backups as archives, it is not advisable. Since backups are usually images of the full system, it can be very difficult to single out specific files for long-term retention. This essentially requires keeping the entire backup as an archive, increasing the resources needed for storage and making it difficult to retrieve specific records when they are required in the future.
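To make the difference concrete, here is a minimal sketch of what "indexed through metadata" can look like: each archived item carries descriptive fields, and retrieval is a search over those fields rather than a lookup in a particular backup image. The class names and the chosen fields (`author`, `origin`, `archived_on`, `keywords`) are illustrative assumptions, not the design of any specific archiving product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ArchiveRecord:
    record_id: str
    author: str
    origin: str
    archived_on: date
    keywords: frozenset = field(default_factory=frozenset)


class ArchiveIndex:
    """In-memory metadata index; a real archive would persist this."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def search(self, author=None, origin=None, keyword=None, start=None, end=None):
        """Return records matching every criterion that was supplied."""
        results = []
        for record in self._records:
            if author is not None and record.author != author:
                continue
            if origin is not None and record.origin != origin:
                continue
            if keyword is not None and keyword not in record.keywords:
                continue
            if start is not None and record.archived_on < start:
                continue
            if end is not None and record.archived_on > end:
                continue
            results.append(record)
        return results
```

A query such as `index.search(origin="legal", start=date(2020, 1, 1))` returns matching records across the whole period, which is the kind of retrieval a full-system backup image does not offer.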

Benefits of Data Archiving

The primary benefits of archiving data are:

  • Reduced cost: data is typically stored on low performance, high capacity media with lower associated maintenance and operation costs
  • Better backup and restore performance: archiving removes data from backups, reducing their size and eliminating restoration of unnecessary files
  • Prevention of data loss: archiving reduces the ability to modify data, preventing data loss
  • Increased security: archiving removes documents from circulation, limiting the chance of cyberattack or malware infection
  • Regulatory compliance: built-in policies ensure records are kept for an appropriate amount of time, and indexing makes data more retrievable

Top Considerations Before Archiving Data

There are a few things to consider when creating a successful archive strategy.

Storage Requirements

The type of storage you choose plays a big role in how accessible your data is, how much your archive costs to create and store, and how safe your data is once it’s archived. An archive is only useful if you are able to retrieve data when you need it, so it’s important to periodically verify that the storage you select continues to be functional.

If tapes are demagnetized or current technologies no longer support archived file types, your efforts will have been wasted. When choosing a storage type, keep in mind how long you need to store data for, how much data you need to store, and what your priorities are in terms of storage or transfer. This includes deciding whether you want to store data online or offline:

  • Online storage: storing your archive online allows you to easily access it from multiple locations and ensures that you can retrieve the data quickly. It also makes it easier to manage efficiently and add more data to it. The downside of online storage is that it increases opportunities for theft or tampering and is only accessible when you have a network connection. Private clouds can reduce your security risks but have high upfront and operating costs, whereas public clouds are cheaper upfront and include built-in support and encryption but require ongoing fees for use.
  • Offline storage: storing archives offline, such as with disk or tape drives, reduces the risk of theft or modification as well as maintenance and storage costs. Offline storage often has a better capacity-to-cost ratio but means longer retrieval times and greater barriers to managing or transferring data.
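Whichever storage type you pick, the periodic verification mentioned above can be partly automated with checksums: record a digest for every file when it is archived, then recompute and compare later. The sketch below is one possible approach using SHA-256 from Python's standard library; the function names and the idea of keeping the manifest as a dictionary are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose contents changed."""
    root = Path(root)
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return sorted(problems)
```

An empty list from `verify_manifest` means every recorded file is still present and unchanged. The check says nothing about whether the file format is still readable, which has to be tested separately.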

Selective Archiving

Efficient archives retain the minimum amount of data necessary in order to reduce resource use and liability as well as the amount of effort or time required to find data. It is counterproductive to archive all of your data so you must determine what data you need and for how long you need to keep it.
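Deciding what to keep and for how long can be written down as a simple retention table. The sketch below is purely illustrative: the categories and retention periods are invented placeholders, not legal or regulatory guidance, and the function names are hypothetical. Real periods must come from the regulations and policies that apply to your business.

```python
from datetime import date

# Hypothetical retention periods in years; replace with your own policy.
RETENTION_YEARS = {"invoice": 7, "contract": 10, "marketing": 2}


def add_years(day, years):
    """Return the same calendar day `years` later, mapping Feb 29 to Feb 28."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def retention_decision(category, created, today):
    """Return 'keep', 'dispose', or 'review' for a record of the given category."""
    years = RETENTION_YEARS.get(category)
    if years is None:
        return "review"  # unknown categories need a human decision
    expires = add_years(created, years)
    return "keep" if today < expires else "dispose"
```

Records in a category the table does not know about are flagged for review instead of being kept or discarded by default, which keeps the archive from silently growing or losing data.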

When deciding which data to keep, you should consider what format it’s in and whether to archive installation files for viewing applications. If you’re archiving file types that are proprietary, there’s a risk that they won’t be supported in the future when you retrieve your data but archiving their associated programs will ensure future readability.

Retrieval Requirements

Consider the impacts that retrieval times and methods will have on your business. Some archives can take days to return data, such as those that are offsite or that require extensive searches to find the relevant data. Others may only be able to return whole collections of data instead of individual parts of databases or files.

The transparency of the solution should also be considered. Requiring data users to request access through IT staff or from third-party providers will have an impact on productivity. If the data you are archiving is not truly cold but instead just infrequently accessed, transparent solutions in which data appears to be stored in its original location can reduce the impact on employees.

Archiving with Cloudian

Archiving data is a good solution for ensuring that valuable but intermittently used data is kept safe without taking up expensive resources. It might be tempting to use your backups as archives, but this is likely to end up costing you more time and money in the end. To save yourself the trouble, create complementary backup and archive strategies.

Cloudian HyperStore can help you simplify the process of archiving. It is an on-premises object storage solution that is highly scalable and geo-distributed and can be tiered to the cloud, making it flexible to your needs.

HyperStore is an object storage solution that uses a fully distributed architecture to avoid a single point of failure. It is fully compatible with the S3 API.

Cloudian stores your data securely and guarantees data availability so you can maintain your business operations and improve your response to market conditions.

When evaluating archiving services, you must determine what you need to archive, requirements around that data and how to make the most cost-effective selection.

Long-term data archiving can and should be an important piece of an organization’s data storage strategy and policy. If it’s too costly to manage a brick-and-mortar data archive, consider using one or more of the many data archiving service companies. Data archiving services can be standalone products or part of a data backup and archiving vendor’s product offering.

Until recently, tape storage had been the archival storage medium of choice due to its low cost and survivability. Advancements — such as powerful compression algorithms, deduplication and rapid data retrieval algorithms — have increased the storage capacity of tape and kept it a relevant archiving medium. And despite advances in electronic media, tape storage, especially for long-term data archiving, is still on the radar of most CIOs.

Depending on the content to be archived, data archiving service companies can scan hard copy images and convert them to an approved data format. They can also store electronic data in an approved format — consider data deduplication to reduce storage requirements — on a variety of secure storage platforms.

Cloud technology is a frequent component of a long-term data archiving service due to its cost, availability, capacity and convenience.

Some vendors use technology and resources to archive both electronic and non-electronic assets. The keys are to know what you need archived and your storage, access and retrieval requirements in advance of any vendor discussions.

This article provides advice on how to evaluate vendors and some examples of data archiving services and products.

Most organizations have two types of data: daily operational data, databases and other content; and infrequently used — yet still important — data sets that may be needed at a future date. As data sets continue to grow in both size and complexity, archiving becomes an essential activity for space and cost reasons.

The first step is to determine if there is a real business need for long-term data archiving. If a formal data backup policy and program exists, examine it to ensure that you meet any requirements to add a data archiving service.

A business impact analysis may help identify data to be archived. Ask business unit leaders for their input on what resources they may need stored. Determine the kinds of assets to be archived, their data formats, storage requirements, accessibility requirements, duration of storage and frequency of access.

A data storage and archiving policy will outline all audit and compliance requirements and ensure there is a standardized process for archiving and retrieving data. It should also specify circumstances when archived data is to be destroyed and the method of destruction.
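As an illustration of how a destruction rule in such a policy might be checked, here is a minimal sketch. The function name and the idea of a fixed retention period in whole years are assumptions made for the example; a real policy would also need to account for legal holds and compliance exceptions, which this sketch ignores.

```python
from datetime import date

def destruction_due(archived_on: date, retention_years: int, today: date) -> bool:
    """Return True once the retention period has fully elapsed.

    Hypothetical helper: assumes a fixed retention period in whole years
    and ignores legal holds or other policy exceptions.
    """
    target_year = archived_on.year + retention_years
    try:
        expires = archived_on.replace(year=target_year)
    except ValueError:
        # Archived on Feb 29 and the target year is not a leap year.
        expires = archived_on.replace(year=target_year, day=28)
    return today >= expires

# Example: a record archived on 1 June 2010 with a 10-year retention period.
print(destruction_due(date(2010, 6, 1), 10, date(2020, 6, 1)))
```

A check like this only flags candidates; the policy's stated method of destruction still has to be carried out and documented separately.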

Research data archiving service vendors and compile a list of candidates. Well-established vendors like Amazon, IBM, Iron Mountain and Microsoft have a broad range of long-term archiving offerings, and the cost of their services may be commensurate with that capability. If you have a limited budget, consider small, boutique vendors that can provide appropriate products, possibly for less money.

Prepare a request for proposal (RFP) to obtain information about selected vendors and their proposed offerings. Be sure to include any security requirements that may necessitate encryption of archival data while in transit to the storage site, and note if encryption at rest is needed.

You should also identify any additional requirements you may have regarding your ability to access archived data in an emergency.

When the RFP responses come in, take the time to contact all provided references and compare pricing options for the best fit. You should then determine how you want to move forward, such as with a pilot program, a full-scale program or a service covering only selected data sets.

Once you have executed a contract, use the systems development lifecycle to plan the installation, programming, documentation, training, acceptance testing and entry into production. Be sure to schedule one to two tests during the year, especially if the vendor permits more than one test without an additional charge.

Long-term data archiving costs are typically based on the following factors: amount of data to be stored, type of repository (cloud, tape or disk), accessibility requirements, estimated frequency of access, file format, archival software and redundancy options. Pricing is ultimately based on the amount of storage needed over a specific period. For example, the estimated cost of a cloud archiving service over a five-year contract period can range from less than $100 for 50 GB to 100 GB of data, to $5,000 to $10,000 for 5 TB.
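To compare quotes on a like-for-like basis, it can help to reduce a contract price to a cost per gigabyte per month. The sketch below does this for the 5 TB example above. The function name is hypothetical, and it assumes 1 TB = 1,000 GB and a flat price across the whole term, which real vendor pricing may not follow.

```python
def cost_per_gb_month(total_cost: float, size_gb: float, years: float) -> float:
    """Flat cost per GB per month over a contract term (illustrative only)."""
    months = years * 12
    return total_cost / (size_gb * months)

# The 5 TB, five-year example from the text, assuming 1 TB = 1,000 GB.
low = cost_per_gb_month(5_000, 5_000, 5)
high = cost_per_gb_month(10_000, 5_000, 5)
print(f"${low:.4f} to ${high:.4f} per GB per month")
```

Retrieval fees, redundancy options and access frequency are not captured here, so treat the result as a rough basis for comparison rather than a full cost model.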

For on-site tape storage, storage drives can be purchased for $3,000 to $3,500 on the low end up to $20,000 to $30,000 at the high end. Archiving software licenses can range from $1,500 to $2,500, not counting the cost of software maintenance, which adds monthly fees of between $100 and $300. Magnetic tape can be as low as $0.02 per GB of storage, while reel-to-reel tape prices range from approximately $35 for 1,200 feet of tape to $130 for 3,600 feet. Reel-to-reel tape storage racks can go for $125 to $1,000, depending on the number of tapes to be stored. Tape vaulting and rotation service fees are based on the number of tapes to be stored, frequency of rotation, estimated storage duration, and tape pickup and delivery costs. Monthly costs range from an estimated $2,200 for a basic vaulting arrangement to $24,000 to $30,000 for a large-scale operation.
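As a rough worked example, the low-end figures above can be combined into a first-year estimate for on-site tape. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the defaults are the low-end prices quoted in the text, and storage racks and vaulting or rotation fees are deliberately left out.

```python
def tape_first_year_cost(size_gb: float,
                         drive: float = 3_000,
                         license_fee: float = 1_500,
                         monthly_maintenance: float = 100,
                         media_per_gb: float = 0.02) -> float:
    """Rough first-year cost of on-site tape, using low-end figures.

    Excludes storage racks and vaulting/rotation service fees.
    """
    media = size_gb * media_per_gb
    return drive + license_fee + 12 * monthly_maintenance + media

# Example: 5 TB of archived data, assuming 1 TB = 1,000 GB.
total = tape_first_year_cost(5_000)
print(f"Estimated first-year cost for 5 TB: ${total:,.2f}")
```

Swapping in the high-end drive, license and maintenance figures gives an upper bound for the same volume of data.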

Here are some examples of long-term data archiving services vendors and products.

  • AWS offers Amazon S3, Amazon S3 Glacier and Amazon S3 Glacier Deep Archive. Data to be archived is transmitted via a network connection to the S3 system and then routed to its primary and alternate storage locations.
  • DRS Imaging Services scans and archives documents, and offers cloud and on-site archiving services.
  • Informatica performs archiving services in the cloud.
  • Iron Mountain Offsite Media Vaulting picks up archival data from customers and transports it to an off-site vaulting facility where it is logged in, checked for any issues and then moved to a secure and environmentally safe storage area.
  • Microform Imaging scans documents and offers archiving services, cloud archiving and document management.
  • Microsoft Azure Archive Storage offers secure data storage in the cloud as well as blob-level tiering that enables users to change an object’s tier as needed.
  • StoneFly Inc. provides cloud storage services and storage appliances.
  • Veritas Enterprise Vault organizes archives into vault stores that can be divided into partitions as storage requirements grow.

Research data and related files require reliable and trustworthy storage at all phases of the research process. Best practices include documenting the information below either in a Data Management Plan or as part of project protocols. For a more detailed guide to storage and archiving best practices, see the attached PDFs.

At a minimum ensure you can document:

  • Ownership and responsibility for the data at all phases (active, archived) of its life cycle.
    • For collaborative projects, Memoranda of Understanding (MOUs) should be developed that detail roles and responsibilities towards the data.
  • Who has access to the data and how it is restricted.
    • If data includes PII (personally identifiable information) or other sensitive data, access must be limited and data must be stored in a secure location.
  • Process and procedures for creating and verifying backup copies during the project.
    • Follow the 3-2-1 rule: There should ideally be 3 copies of the data, stored on 2 different media, with at least 1 stored off-site or in the “cloud”.
  • Length of time data must be maintained and why, e.g., raw sensor data must be kept indefinitely, analyzed final data should be kept for 10 years or until raw data can be re-analyzed.
    • Data that can be re-created, such as OCR text files, may only need to be kept until the software that created it is superseded by better technology.
  • Where data will be archived permanently.
    • See the Guide to Choosing a Data Repository and the Smithsonian Institution Archive’s guidance on Appraising Research Records for more information about deposit options and decision making for Smithsonian researchers.
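The 3-2-1 rule in the list above lends itself to a simple automated check. The following is a minimal sketch, assuming you keep an inventory of where each copy lives; the `Copy` record and the `satisfies_3_2_1` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    location: str   # e.g. "lab workstation"
    medium: str     # e.g. "disk", "tape", "cloud"
    offsite: bool   # True if stored away from the primary site

def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Check the 3-2-1 rule: 3+ copies, 2+ media types, 1+ off-site."""
    media = {c.medium for c in copies}
    return (len(copies) >= 3
            and len(media) >= 2
            and any(c.offsite for c in copies))

# Illustrative inventory; the locations are made up for the example.
copies = [
    Copy("lab workstation", "disk", offsite=False),
    Copy("department tape library", "tape", offsite=False),
    Copy("cloud bucket", "cloud", offsite=True),
]
print(satisfies_3_2_1(copies))
```

A check like this confirms only that the copies are recorded; verifying that each copy is readable and intact is a separate step.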

In order to preserve data so it is accessible and useable in the future it must:

  • Have adequate descriptive metadata for correct interpretation and use by future researchers (see Describing Your Data: Data Dictionaries).
  • Be available in an open, non-proprietary, commonly used format (see attached PDF Best Practices for Choosing File Formats below).