AWS Storage Gateway-AWS Blog Info

AWS Storage Gateway integrates an on-premises IT environment with the AWS storage infrastructure. It lets you store data in the AWS cloud for scalable, secure, and cost-efficient storage.

  • Storage Gateway is a service in AWS that connects an on-premises software appliance with the cloud-based storage to provide secure integration between an organization's on-premises IT environment and AWS storage infrastructure.

Note: Here, on-premises means that an organization keeps its IT environment on site, while the cloud is kept offsite with someone else responsible for its maintenance.


There are three types of Storage Gateways:

AWS Storage Gateway
  • File Gateway (NFS)
  • Volume Gateway (iSCSI)
  • Tape Gateway (VTL)

As listed above, the storage gateway is categorized into three types: File Gateway, Volume Gateway, and Tape Gateway. Volume Gateway is further classified into two types: Stored Volumes and Cached Volumes.

File Gateway

  • It uses the NFS (Network File System) protocol.
  • It is used to store flat files such as Word files, PDF files, pictures, and videos directly in S3.
  • Files are stored as objects in S3 buckets, and they are accessed through an NFS mount point.
  • Ownership, permissions, and timestamps are durably stored in S3 in the user metadata of the object associated with the file.
  • Once the objects are transferred to S3, they can be used as native S3 objects, and bucket features such as versioning, lifecycle management, and cross-region replication apply directly to the objects stored in your bucket.
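
The file-to-object mapping described above can be sketched in a few lines of Python. This is only an illustration: `ObjectSketch`, `file_to_object`, the bucket name, and the metadata key names are hypothetical and are not the names the real gateway uses.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ObjectSketch:
    """Stand-in for an S3 object; not a real SDK type."""
    bucket: str
    key: str
    user_metadata: Dict[str, str] = field(default_factory=dict)


def file_to_object(bucket: str, relative_path: str, uid: int, gid: int,
                   mode: int, mtime: int) -> ObjectSketch:
    """Map one file on the share to an object plus user metadata."""
    # The path inside the share becomes the object key.
    key = relative_path.replace("\\", "/").lstrip("/")
    # Ownership, permissions, and timestamp travel as user metadata.
    # These key names are assumptions made for the example.
    metadata = {
        "owner-uid": str(uid),
        "group-gid": str(gid),
        "permissions": format(mode & 0o7777, "04o"),
        "mtime": str(mtime),
    }
    return ObjectSketch(bucket, key, metadata)


obj = file_to_object("example-bucket", "/reports/q1.pdf",
                     uid=1000, gid=1000, mode=0o100644, mtime=1700000000)
print(obj.key, obj.user_metadata)
```

Because the result is an ordinary object with a key, anything that works on S3 objects (versioning, lifecycle rules, replication) can act on it without knowing it came from a file share.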

Architecture of File Gateway

  • Storage Gateway is a virtual machine running on-premises.
  • Storage Gateway is mainly connected to AWS through the internet.
  • It can also use Direct Connect, which is a dedicated connection line between the data center and AWS.
  • The gateway can also run inside an Amazon VPC (Virtual Private Cloud), which is a virtual data center in AWS. This means the application server and the storage gateway do not need to be on-premises: the gateway sits inside the VPC and sends the data to S3 from there.

Volume Gateway

  • Volume Gateway is an interface that presents your applications with disk volumes using the iSCSI block protocol. Because iSCSI provides block-based storage, a volume can hold an operating system, applications, or a database such as SQL Server.
  • Data written to these volumes can be asynchronously backed up as point-in-time snapshots and stored in the cloud as EBS snapshots. EBS (Elastic Block Store) is a virtual hard disk that is attached to an EC2 instance. In short, the volume gateway takes your virtual hard disks and backs them up to AWS.
  • Snapshots are incremental backups, so only the changes made since the last snapshot are backed up. All snapshot storage is also compressed to minimize your storage charges.
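
To make "incremental and compressed" concrete, here is a minimal sketch that compares two versions of a volume block by block and keeps only the changed blocks, compressed. The tiny block size, the function names, and the use of `zlib` are assumptions for illustration; the source does not say how the gateway tracks or compresses changes.

```python
import zlib
from typing import Dict, List

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut a volume image into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def incremental_snapshot(previous: bytes, current: bytes) -> Dict[int, bytes]:
    """Return {block_index: compressed_block} for blocks that changed.

    Assumes the volume does not shrink between snapshots.
    """
    old_blocks = split_blocks(previous)
    changed = {}
    for index, block in enumerate(split_blocks(current)):
        if index >= len(old_blocks) or old_blocks[index] != block:
            changed[index] = zlib.compress(block)
    return changed


before = b"aaaabbbbcccc"
after = b"aaaaXXXXcccc"
delta = incremental_snapshot(before, after)
print(sorted(delta))  # only the middle block differs
```

A full backup would have stored all three blocks; the incremental snapshot stores one.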

Volume Gateway is of two types:

Stored Volumes

  • It is a way of storing the entire copy of the data locally and asynchronously backing up the data to AWS.
  • Stored volumes provide your on-premises applications with low-latency access to their entire datasets, along with offsite backups.
  • You can create a stored volume as a virtual storage volume that is mounted as an iSCSI device on your on-premises application servers, such as database servers or web servers.
  • Data written to your stored volume is stored on your local storage hardware, and this data is asynchronously backed up to Amazon Simple Storage Service (S3) in the form of Amazon Elastic Block Store (EBS) snapshots.
  • The size of a stored volume is 1 GB to 16 TB.

Architecture of Volume Gateway

  • A client talks to a server, which could be an application server or a web server.
  • The application server has an iSCSI connection to the Volume Gateway.
  • The Volume Gateway is installed on a hypervisor.
  • The volume storage is a virtual hard disk that is stored on physical infrastructure; in this example, the size of the virtual hard disk is 1 TB.
  • The volume storage takes snapshots and sends them to the upload buffer.
  • The upload buffer performs multiple uploads to S3, and all these uploads are stored as EBS snapshots.

Cached Volumes

  • It is a way of storing the most recently accessed data on site, while the rest of the data is stored in AWS.
  • Cached volumes let you use Amazon Simple Storage Service (S3) as your primary data storage while keeping a copy of the recently accessed data locally in your storage gateway.
  • Cached volumes minimize the need to scale your on-premises storage infrastructure while still giving your applications low-latency access to their frequently accessed data.
  • The gateway stores the data that you write to the volume in S3 and retains only recently read data in the on-premises storage gateway.
  • The size of a cached volume is 1 GB to 32 TB.

Architecture of Cached Volumes

  • A client is connected to the application server, and the application server has an iSCSI connection to the gateway.
  • The data sent by the client is stored in the cache storage and then placed in the upload buffer.
  • The data from the upload buffer is transferred to the virtual disks, i.e., the volume storage that sits inside Amazon S3.
  • Volume storage is block-based, whereas S3 is object-based, so block data cannot be stored in S3 as-is. Instead, snapshots are taken as flat files, and these files are then stored in S3.
  • The most recently read data is kept in the cache storage.
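
The read and write path above can be sketched as a small in-memory model. Everything here is hypothetical (`CachedVolumeSketch`, the plain dict standing in for S3, the least-recently-used eviction rule); it shows the idea of "check the cache first, fall back to S3", not how the gateway is actually implemented.

```python
from collections import OrderedDict


class CachedVolumeSketch:
    """Toy model: local cache + upload buffer + a dict standing in for S3."""

    def __init__(self, cache_blocks: int):
        self.cache_blocks = cache_blocks
        self.cache = OrderedDict()   # block_id -> data, oldest first
        self.upload_buffer = {}      # blocks written but not yet uploaded
        self.s3 = {}                 # stands in for volume storage in S3
        self.s3_reads = 0

    def _evict(self):
        # Drop the oldest blocks, but never one that is still waiting to upload.
        for block_id in list(self.cache):
            if len(self.cache) <= self.cache_blocks:
                break
            if block_id not in self.upload_buffer:
                del self.cache[block_id]

    def write(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        self.upload_buffer[block_id] = data
        self._evict()

    def flush(self):
        """Upload everything in the buffer (asynchronous in a real gateway)."""
        self.s3.update(self.upload_buffer)
        self.upload_buffer.clear()
        self._evict()

    def read(self, block_id):
        if block_id in self.cache:           # cache hit: low-latency path
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        self.s3_reads += 1                    # cache miss: fetch from S3
        data = self.s3[block_id]
        self.cache[block_id] = data
        self._evict()
        return data


volume = CachedVolumeSketch(cache_blocks=2)
for block_id in (1, 2, 3):
    volume.write(block_id, b"block-%d" % block_id)
volume.flush()
print(volume.read(1), volume.read(1), volume.s3_reads)
```

The first read of block 1 misses the cache and goes to the stand-in S3 store; the second read is served locally, which is the behaviour the cache storage exists to provide.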

Tape Gateway

  • Tape Gateway is mainly used for taking backups.
  • It uses a Virtual Tape Library (VTL) interface.
  • Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS cloud.
  • The VTL interface lets your existing tape-based backup application infrastructure store data on virtual tape cartridges that you create on your Tape Gateway.
  • It is supported by NetBackup, Backup Exec, Veeam, etc. Instead of using physical tapes, these applications use virtual tapes, which are stored in Amazon S3.

Architecture of Tape Gateway

  • Servers are connected to the backup application, which can be NetBackup, Backup Exec, Veeam, etc.
  • The backup application is connected to the Storage Gateway over an iSCSI connection.
  • The gateway is presented as a virtual appliance connected over iSCSI to the backup application.
  • Virtual tapes are uploaded to Amazon S3.
  • From there, a lifecycle management policy can archive them to the virtual tape shelf in Amazon Glacier.

Important points to remember:

  • File Gateway is used for object-based storage in which flat files such as Word files, PDF files, etc., are stored directly in S3.
  • Volume Gateway is used for block-based storage, and it uses the iSCSI protocol.
  • Stored Volumes is the volume gateway type in which the entire dataset is stored on site and backed up to S3.
  • Cached Volumes is the volume gateway type in which the entire dataset is stored in the cloud (Amazon S3) and only the most frequently accessed data is kept on site.
  • Tape Gateway is used for backup and works with popular backup applications such as NetBackup, Backup Exec, Veeam, etc.

The volume-based and tape-based storage types of AWS Storage Gateway are described in more detail below.


Volume Gateways

This storage type provides cloud-backed storage volumes which can be mounted as Internet Small Computer System Interface (iSCSI) devices from on-premises application servers.

GATEWAY-CACHED VOLUMES

AWS Storage Gateway stores all the on-premises application data in a storage volume in Amazon S3. Each volume ranges from 1 GB to 32 TB, and a gateway supports up to 20 volumes with a total storage of 150 TB. We can attach these volumes as iSCSI devices from on-premises application servers. A gateway-cached setup uses two kinds of local disks −

CACHE STORAGE DISK

Every application requires storage volumes to store its data. The cache storage disk is where data is first stored when it is written to the storage volumes in AWS. Data on the cache storage disk waits there until it is uploaded to Amazon S3 from the upload buffer. The cache storage disk also keeps the most recently accessed data for low-latency access. When the application needs data, the cache storage disk is checked first, before Amazon S3.

There are a few guidelines for determining the amount of disk space to allocate for cache storage. We should allocate at least 20% of the existing file store size as cache storage, and the cache storage should be larger than the upload buffer.
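
As a quick worked example of these two guidelines, the check below uses whole gigabytes and integer arithmetic. The function name and the sample numbers are made up for illustration.

```python
def is_cache_size_ok(cache_gb: int, file_store_gb: int, upload_buffer_gb: int) -> bool:
    """Apply the two sizing guidelines from the text.

    1. Cache storage is at least 20% of the existing file store.
    2. Cache storage is larger than the upload buffer.
    """
    at_least_20_percent = cache_gb * 100 >= file_store_gb * 20
    larger_than_buffer = cache_gb > upload_buffer_gb
    return at_least_20_percent and larger_than_buffer


# A 1000 GB file store needs at least 200 GB of cache storage.
print(is_cache_size_ok(cache_gb=200, file_store_gb=1000, upload_buffer_gb=150))
print(is_cache_size_ok(cache_gb=199, file_store_gb=1000, upload_buffer_gb=150))
print(is_cache_size_ok(cache_gb=200, file_store_gb=1000, upload_buffer_gb=200))
```

The first call satisfies both rules; the second fails the 20% rule and the third fails because the cache is not larger than the upload buffer.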

Upload buffer disk − This type of storage disk holds the data before it is uploaded to Amazon S3. The storage gateway uploads the data from the upload buffer to AWS over an SSL connection.

Snapshots − Sometimes we need to back up storage volumes in Amazon S3. These backups are incremental and are known as snapshots. The snapshots are stored in Amazon S3 as Amazon EBS snapshots. Incremental backup means that a new snapshot backs up only the data that has changed since the last snapshot. We can take snapshots either at a scheduled interval or on demand.

GATEWAY-STORED VOLUMES

When the Virtual Machine (VM) is activated, gateway volumes are created and mapped to the on-premises direct-attached storage disks. Hence, when the applications read or write data on the gateway storage volumes, the gateway reads and writes that data on the mapped on-premises disk.

A gateway-stored volume lets you store primary data locally and provides on-premises applications with low-latency access to entire datasets. We can mount these volumes as iSCSI devices on the on-premises application servers. Each volume ranges from 1 GB to 16 TB in size, and a gateway supports up to 12 volumes with a maximum storage of 192 TB.
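
The limits quoted in this article for cached and stored volumes can be turned into a small planning check. This is a sketch: the names are hypothetical, 1 TB is treated as 1024 GB, and the numbers are copied from the text above, so they should be checked against the current service quotas before relying on them.

```python
from dataclasses import dataclass
from typing import List

TB = 1024  # gigabytes per terabyte, as assumed in this sketch


@dataclass(frozen=True)
class VolumeLimits:
    min_size_gb: int
    max_size_gb: int
    max_volumes: int
    max_total_gb: int


# Figures as quoted in this article, not fetched from AWS.
LIMITS = {
    "cached": VolumeLimits(1, 32 * TB, 20, 150 * TB),
    "stored": VolumeLimits(1, 16 * TB, 12, 192 * TB),
}


def check_volume_plan(kind: str, sizes_gb: List[int]) -> List[str]:
    """Return a list of problems; an empty list means the plan fits."""
    limits = LIMITS[kind]
    problems = []
    if len(sizes_gb) > limits.max_volumes:
        problems.append(f"too many volumes: {len(sizes_gb)} > {limits.max_volumes}")
    for size in sizes_gb:
        if not limits.min_size_gb <= size <= limits.max_size_gb:
            problems.append(f"volume of {size} GB is outside the allowed range")
    if sum(sizes_gb) > limits.max_total_gb:
        problems.append(f"total of {sum(sizes_gb)} GB exceeds {limits.max_total_gb} GB")
    return problems


print(check_volume_plan("stored", [16 * TB] * 12))  # exactly at the limit
print(check_volume_plan("cached", [32 * TB] * 5))   # 160 TB in total
```

Note that the cached-volume total (150 TB) is reached long before 20 volumes of 32 TB each, which is why the check looks at the total separately from the per-volume size.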

Gateway-Virtual Tape Library (VTL)

This storage type provides a virtual tape infrastructure that scales seamlessly with your business needs and eliminates the operational burden of provisioning, scaling, and maintaining a physical tape infrastructure. Each gateway-VTL is preconfigured with a media changer and tape drives, which are available to the existing client backup applications as iSCSI devices. Tape cartridges can be added later as required to archive the data.

A few terms used in the architecture are explained below.

Virtual Tape − A virtual tape is similar to a physical tape cartridge, except that it is stored in the AWS cloud. We can create virtual tapes in two ways: by using the AWS Storage Gateway console or by using the AWS Storage Gateway API. The size of each virtual tape is from 100 GB to 2.5 TB. One gateway can hold up to 150 TB and a maximum of 1500 tapes at a time.

Virtual Tape Library (VTL) − Each gateway-VTL comes with one VTL. VTL is similar to a physical tape library available on-premises with tape drives. The gateway first stores data locally, then asynchronously uploads it to virtual tapes of VTL.

Tape Drive − A VTL tape drive is similar to a physical tape drive that can perform I/O operations on tape. Each VTL comes with 10 tape drives that are available to backup applications as iSCSI devices.

Media Changer − A VTL media changer is similar to a robot that moves tapes around in a physical tape library's storage slots and tape drives. Each VTL comes with one media changer that is available to backup applications as an iSCSI device.

Virtual Tape Shelf (VTS) − A VTS is used to archive tapes from a gateway VTL to the VTS and vice versa.

Archiving Tapes − When the backup software ejects a tape, the gateway moves the tape to the VTS for storage. This is used for data archiving and backups.

Retrieving Tapes − Tapes archived to the VTS cannot be read directly. To read an archived tape, we first need to retrieve it from the VTS back to a gateway VTL, either by using the AWS Storage Gateway console or by using the AWS Storage Gateway API.
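
The archive and retrieve rules can be summarized as a two-state model: a tape is either in the VTL, where it can be read and written, or on the VTS, where it cannot. The class and method names below are hypothetical and are not Storage Gateway API calls.

```python
from enum import Enum


class Location(Enum):
    VTL = "virtual tape library"
    VTS = "virtual tape shelf"


class VirtualTapeSketch:
    """Toy model of one virtual tape moving between the VTL and the VTS."""

    def __init__(self, barcode: str):
        self.barcode = barcode
        self.location = Location.VTL
        self.data = b""

    def _require_in_library(self, action: str) -> None:
        if self.location is not Location.VTL:
            raise RuntimeError(f"cannot {action} tape {self.barcode}: it is archived on the VTS")

    def write(self, payload: bytes) -> None:
        self._require_in_library("write")
        self.data += payload

    def read(self) -> bytes:
        self._require_in_library("read")
        return self.data

    def eject(self) -> None:
        """Backup software ejects the tape; the gateway archives it to the VTS."""
        self._require_in_library("eject")
        self.location = Location.VTS

    def retrieve(self) -> None:
        """Bring an archived tape back into the VTL so it can be read again."""
        if self.location is not Location.VTS:
            raise RuntimeError(f"tape {self.barcode} is not archived")
        self.location = Location.VTL


tape = VirtualTapeSketch("TAPE0001")
tape.write(b"nightly backup")
tape.eject()
try:
    tape.read()
except RuntimeError as error:
    print(error)
tape.retrieve()
print(tape.read())
```

Reading fails while the tape is on the shelf and succeeds again after `retrieve()`, mirroring the rule that archived tapes must be brought back into a VTL before they can be read.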
