File Systems on AWS: EFS vs FSx Selection Guide

Week 6 - AWS Storage Deep Dive Series


Choosing the right file system on AWS can significantly affect your application's performance, cost, and scalability. This guide compares Amazon Elastic File System (EFS) with Amazon FSx, explains the key features of each, and walks through setting up EFS for multiple EC2 instances.


Understanding AWS File System Options

AWS offers several managed file system services, each designed for specific use cases:

  • Amazon EFS: A fully managed NFS file system for Linux-based workloads

  • Amazon FSx for Windows File Server: A fully managed Windows-based file system

  • Amazon FSx for Lustre: A high-performance file system for compute-intensive workloads


Amazon EFS Deep Dive

EFS File System Modes

Amazon EFS operates in different modes to optimize for various use cases:


Performance Modes

General Purpose Mode (Default)

  • Lowest latency per operation

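  • Per-file-system operation limits apply (7,000 file operations per second in older documentation; current quotas are higher and vary by throughput mode)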

  • Best for latency-sensitive applications

  • Suitable for most use cases


Max I/O Mode

  • Higher levels of aggregate throughput and operations per second

  • Slightly higher latencies for file operations

  • Can scale to higher performance levels

  • Best for applications that need higher levels of aggregate throughput


Throughput Modes

Bursting Throughput (Default)

  • Throughput scales with file system size

  • Baseline performance of 50 MiB/s per TiB stored

  • Can burst to 100 MiB/s, or to 100 MiB/s per TiB stored on file systems larger than 1 TiB, for as long as burst credits last

  • Credit system similar to EBS GP2
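
The sizing rule above can be turned into a quick estimate. The following is a minimal sketch, not an official calculator: the function name `efs_bursting_estimate` is made up for this post, and the burst rule it encodes (100 MiB/s per TiB stored, never less than 100 MiB/s) is an assumption based on the published bursting behavior, so check current EFS quotas before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: estimate bursting-mode throughput from stored data.
# Assumes a 50 MiB/s baseline per TiB and a burst of 100 MiB/s per TiB
# with a 100 MiB/s floor. Integer math, so results are rounded down.
efs_bursting_estimate() {
    stored_gib=$1
    baseline=$(( stored_gib * 50 / 1024 ))
    burst=$(( stored_gib * 100 / 1024 ))
    if [ "$burst" -lt 100 ]; then
        burst=100
    fi
    echo "baseline=${baseline}MiB/s burst=${burst}MiB/s"
}

efs_bursting_estimate 2048   # 2 TiB stored
efs_bursting_estimate 100    # 100 GiB stored
```

For small file systems the baseline is only a few MiB/s, which is why sustained workloads on little data tend to run out of burst credits.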


Provisioned Throughput

  • Fixed throughput independent of storage amount

  • Can provision up to 4 GiB/s

  • Pay for provisioned throughput regardless of usage

  • Ideal for high-throughput applications with smaller storage needs


Elastic Throughput

  • Automatically scales throughput up and down based on workload

  • Pay only for the throughput you use

  • No need to pre-provision capacity

  • Recommended for unpredictable workloads
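
One way to read the three throughput modes together is as a simple decision rule. The sketch below is only a rule of thumb distilled from the bullets above; the function name, its arguments, and the "steady" and "spiky" labels are all hypothetical and not part of any AWS tool.

```shell
#!/bin/sh
# Hypothetical rule of thumb for picking an EFS throughput mode.
# Arguments: required MiB/s, stored GiB, workload pattern (steady|spiky).
choose_throughput_mode() {
    need_mibps=$1
    stored_gib=$2
    pattern=$3
    baseline=$(( stored_gib * 50 / 1024 ))   # bursting baseline in MiB/s
    if [ "$pattern" = "spiky" ]; then
        echo elastic        # unpredictable load: pay for what is used
    elif [ "$need_mibps" -le "$baseline" ]; then
        echo bursting       # stored data already earns enough throughput
    else
        echo provisioned    # steady need above what the data size earns
    fi
}

choose_throughput_mode 200 100 steady    # small data set, high steady need
choose_throughput_mode 40 2048 steady    # large data set, modest need
choose_throughput_mode 500 50 spiky      # unpredictable workload
```

An existing file system can be switched with `aws efs update-file-system --file-system-id $EFS_ID --throughput-mode elastic`, so the choice does not have to be final at creation time.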


Storage Classes

EFS offers multiple storage classes to optimize costs:

  • Standard: For frequently accessed files

  • Infrequent Access (IA): Lower-cost storage for files accessed less frequently

  • Archive: Lowest-cost storage for long-term archival

  • One Zone: Single AZ storage for non-critical workloads


Mount Targets and Security Groups

Mount Targets

  • Network interfaces that allow EC2 instances to connect to EFS

  • One mount target per Availability Zone

  • Must be created in each AZ where you want to access the file system

  • Each mount target has its own IP address and DNS name
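
The DNS names follow a predictable pattern, which is handy when scripting mounts. A small sketch, assuming the standard naming scheme; the helper names and the file system ID are placeholders invented for this example.

```shell
#!/bin/sh
# Build the DNS names that resolve to EFS mount targets.
# The file system name resolves to the mount target in the caller's AZ;
# the AZ-prefixed form addresses one specific mount target.
efs_dns_name() {       # args: file-system-id region
    echo "$1.efs.$2.amazonaws.com"
}
efs_az_dns_name() {    # args: availability-zone file-system-id region
    echo "$1.$2.efs.$3.amazonaws.com"
}

efs_dns_name fs-0123456789abcdef0 us-west-2
efs_az_dns_name us-west-2a fs-0123456789abcdef0 us-west-2
```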


Security Group Configuration

Mount targets require proper security group configuration:

# Allow NFS traffic (port 2049) from EC2 security groups
Type: NFS
Protocol: TCP
Port Range: 2049
Source: EC2 security group ID

Amazon FSx Options

FSx for Windows File Server


Key Features:

  • Fully managed Windows-based file system

  • Built on Windows Server

  • Supports SMB protocol and Windows NTFS

  • Active Directory integration

  • Deduplication and compression

  • VSS (Volume Shadow Copy Service) support


Use Cases:

  • Windows-based applications

  • Enterprise applications requiring SMB

  • Content repositories and web serving

  • Media workflows


Performance Characteristics:

  • SSD and HDD storage options

  • Throughput capacity: 64 MiB/s to 2 GiB/s per file system, provisioned independently of storage size

  • Sub-millisecond latencies

  • Multi-AZ deployments for high availability


FSx for Lustre

Key Features:

  • High-performance parallel file system

  • Optimized for fast processing of workloads

  • Tight integration with S3

  • POSIX-compliant

  • Scales to hundreds of GiB/s of throughput


Use Cases:

  • High Performance Computing (HPC)

  • Machine learning training

  • Electronic Design Automation (EDA)

  • Media processing workflows

  • Financial modeling


Performance Characteristics:

  • SSD and HDD storage options

  • Up to hundreds of GiB/s of throughput

  • Sub-millisecond latencies

  • Persistent and scratch file systems


EFS vs FSx: Use Case Comparisons

Choose Amazon EFS When:


Web Serving and Content Management

  • Multiple web servers need shared access to content

  • WordPress, Drupal, or other CMS applications

  • Static website assets shared across instances


Application Data Sharing

  • Multiple application instances need shared configuration

  • Log aggregation across multiple servers

  • Shared application state


Data Analytics

  • Analytics workloads requiring shared data access

  • ETL processes with shared intermediate data

  • Business intelligence applications


Development and Testing

  • Shared code repositories

  • Test data accessible across environments

  • CI/CD pipelines requiring shared artifacts


Choose FSx for Windows When:

Windows Enterprise Applications

  • Microsoft SQL Server file shares

  • .NET applications requiring SMB

  • Windows-based content management systems


File Shares and Collaboration

  • Department shared drives

  • User home directories

  • Document collaboration platforms


Choose FSx for Lustre When:

High Performance Computing

  • Scientific computing workloads

  • Weather modeling and simulation

  • Genomics analysis


Machine Learning

  • Large-scale ML training jobs

  • Deep learning with large datasets

  • Model serving with high throughput requirements


Media and Entertainment

  • Video rendering and processing

  • Visual effects workflows

  • Content distribution
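
The use-case lists above boil down to two questions: which operating system the clients run, and whether the workload is compute-intensive. The sketch below encodes that as a lookup. It is a deliberate simplification with made-up argument values, not a complete selection tool.

```shell
#!/bin/sh
# Hypothetical first-pass selector based on the use cases in this guide.
# Arguments: client OS (linux|windows), workload type (general|hpc).
choose_file_system() {
    case "$1:$2" in
        windows:*)  echo "FSx for Windows File Server" ;;
        linux:hpc)  echo "FSx for Lustre" ;;
        linux:*)    echo "Amazon EFS" ;;
        *)          echo "unknown"; return 1 ;;
    esac
}

choose_file_system linux general
choose_file_system linux hpc
choose_file_system windows general
```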


Cost Comparison Framework

EFS Pricing Factors:

  • Storage amount (GB-month)

  • Throughput mode (bursting, provisioned, or elastic)

  • Storage class (Standard, IA, Archive)

  • Data transfer costs
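
To see how the storage-class factor plays out, it can help to compute a blended storage bill. This is a rough sketch: the function name is hypothetical, and the prices in the example call are placeholders chosen for easy arithmetic, NOT current AWS pricing. Substitute the per-GB-month rates for your region.

```shell
#!/bin/sh
# Hypothetical estimate of monthly EFS storage cost across two classes.
# Arguments: Standard GB, IA GB, Standard price per GB-month, IA price per GB-month.
efs_storage_cost() {
    awk -v s="$1" -v ia="$2" -v ps="$3" -v pia="$4" \
        'BEGIN { printf "%.2f\n", s * ps + ia * pia }'
}

# Placeholder prices, not real AWS rates
efs_storage_cost 100 900 0.50 0.25
```

Throughput charges and data transfer are left out here, so treat the result as the storage component only.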


FSx Pricing Factors:

  • Storage capacity

  • Throughput capacity

  • Storage type (SSD vs HDD)

  • Backup storage

  • Data transfer costs


Performance Comparison

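Metric             | EFS       | FSx Windows     | FSx Lustre
-------------------|-----------|-----------------|----------------
Max Throughput     | 4+ GiB/s  | 2 GiB/s         | 1+ TiB/s
Latency            | Low       | Sub-millisecond | Sub-millisecond
IOPS               | 7,000+    | 100K+           | Millions
Protocol           | NFS v4.1  | SMB             | POSIX
Concurrent Clients | Thousands | Thousands       | Thousands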

Hands-on: Setting Up EFS for Multiple EC2 Instances

Let's walk through setting up an EFS file system and mounting it on multiple EC2 instances.


Step 1: Create the EFS File System

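# Using AWS CLI; capture the new file system ID for the later steps
EFS_ID=$(aws efs create-file-system \
    --creation-token my-efs-token-$(date +%s) \
    --performance-mode generalPurpose \
    --throughput-mode provisioned \
    --provisioned-throughput-in-mibps 100 \
    --backup \
    --tags Key=Name,Value=MySharedEFS \
    --query 'FileSystemId' --output text)
echo "Created file system: $EFS_ID"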
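
A new file system takes a short while to become usable, and mount targets cannot be created until it does. The following is a minimal polling sketch, assuming a configured AWS CLI; the function name, retry count, and delay are arbitrary choices for this example.

```shell
#!/bin/sh
# Poll until the file system reports the "available" lifecycle state.
# Arguments: file-system-id [max-attempts] [delay-seconds]
wait_for_efs() {
    fs_id=$1
    attempts=${2:-30}
    delay=${3:-5}
    i=0
    state=unknown
    while [ "$i" -lt "$attempts" ]; do
        state=$(aws efs describe-file-systems --file-system-id "$fs_id" \
            --query 'FileSystems[0].LifeCycleState' --output text)
        if [ "$state" = "available" ]; then
            return 0
        fi
        i=$(( i + 1 ))
        sleep "$delay"
    done
    echo "timed out waiting for $fs_id (last state: $state)" >&2
    return 1
}
```

Calling `wait_for_efs $EFS_ID` before Step 2 avoids errors from creating mount targets too early.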

Step 2: Create Mount Targets

# Get subnet IDs for each AZ
SUBNET_1=$(aws ec2 describe-subnets --filters "Name=availability-zone,Values=us-west-2a" --query 'Subnets[0].SubnetId' --output text)
SUBNET_2=$(aws ec2 describe-subnets --filters "Name=availability-zone,Values=us-west-2b" --query 'Subnets[0].SubnetId' --output text)

# Create security group for EFS
EFS_SG=$(aws ec2 create-security-group \
    --group-name efs-mount-target-sg \
    --description "Security group for EFS mount targets" \
    --query 'GroupId' --output text)

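# Allow NFS access from EC2 instances
# Note: $EC2_SG is the client security group created in Step 3;
# create that group first so the variable is set before adding this rule
aws ec2 authorize-security-group-ingress \
    --group-id $EFS_SG \
    --protocol tcp \
    --port 2049 \
    --source-group $EC2_SG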

# Create mount targets
aws efs create-mount-target \
    --file-system-id $EFS_ID \
    --subnet-id $SUBNET_1 \
    --security-groups $EFS_SG

aws efs create-mount-target \
    --file-system-id $EFS_ID \
    --subnet-id $SUBNET_2 \
    --security-groups $EFS_SG
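
Mount targets also need a minute or two before they accept connections. As a sketch (the function name is hypothetical), the check below succeeds only when at least one mount target exists and every one of them is in the available state.

```shell
#!/bin/sh
# Succeeds only if the file system has mount targets and all are available.
# Arguments: file-system-id
mount_targets_ready() {
    states=$(aws efs describe-mount-targets --file-system-id "$1" \
        --query 'MountTargets[].LifeCycleState' --output text)
    if [ -z "$states" ]; then
        return 1                      # no mount targets reported yet
    fi
    for s in $states; do
        if [ "$s" != "available" ]; then
            return 1                  # at least one is still being created
        fi
    done
    return 0
}
```

A simple loop such as `until mount_targets_ready $EFS_ID; do sleep 10; done` can gate the mount step on this check.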

Step 3: Launch EC2 Instances

# Create EC2 security group
EC2_SG=$(aws ec2 create-security-group \
    --group-name ec2-efs-client-sg \
    --description "Security group for EC2 instances accessing EFS" \
    --query 'GroupId' --output text)

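# Allow SSH access
# (0.0.0.0/0 opens port 22 to the whole internet; outside a short-lived
# demo, replace it with your own address in x.x.x.x/32 form)
aws ec2 authorize-security-group-ingress \
    --group-id $EC2_SG \
    --protocol tcp \
    --port 22 \
    --cidr 0.0.0.0/0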

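# Launch instances in different AZs
# Note: AMI IDs are region-specific; replace the ID below with an
# Amazon Linux 2 AMI that exists in your region (us-west-2 in this walkthrough)
aws ec2 run-instances \
    --image-id ami-0c02fb55956c7d316 \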
    --count 1 \
    --instance-type t3.micro \
    --key-name my-key \
    --security-group-ids $EC2_SG \
    --subnet-id $SUBNET_1 \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=EFS-Client-1}]'

aws ec2 run-instances \
    --image-id ami-0c02fb55956c7d316 \
    --count 1 \
    --instance-type t3.micro \
    --key-name my-key \
    --security-group-ids $EC2_SG \
    --subnet-id $SUBNET_2 \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=EFS-Client-2}]'
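
Because AMI IDs differ between regions, it can be safer to look the ID up than to hard-code it. A short sketch using the public Systems Manager parameter for Amazon Linux 2; the function name is made up, and the call assumes your credentials are allowed to read SSM parameters.

```shell
#!/bin/sh
# Look up the latest Amazon Linux 2 AMI ID for a region via the
# public SSM parameter that AWS maintains.
# Arguments: region
latest_al2_ami() {
    aws ssm get-parameter --region "$1" \
        --name /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 \
        --query 'Parameter.Value' --output text
}
```

The result can be passed straight to the launch command, for example `--image-id $(latest_al2_ami us-west-2)`.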

Step 4: Install EFS Utilities on EC2 Instances

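# SSH into each EC2 instance and run:

# For Amazon Linux 2
sudo yum update -y
sudo yum install -y amazon-efs-utils

# For Ubuntu/Debian: amazon-efs-utils is not in the default apt
# repositories, so build the package from source
sudo apt-get update
sudo apt-get install -y git binutils
git clone https://github.com/aws/efs-utils
cd efs-utils
./build-deb.sh
sudo apt-get install -y ./build/amazon-efs-utils*deb

# Recent efs-utils releases may need extra build dependencies, and other
# distributions have their own steps; see the repository README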
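
When the same bootstrap script has to run on mixed fleets, the package family can be detected from `/etc/os-release` first. This is a minimal sketch; the function name and the returned labels are made up for illustration, and only the distributions named in this step are covered.

```shell
#!/bin/sh
# Report which package family an instance uses, based on os-release.
# Arguments: [path to os-release file] (defaults to /etc/os-release)
efs_utils_pkg_family() {
    os_release=${1:-/etc/os-release}
    # Source the file in a subshell so its variables do not leak out
    id=$( . "$os_release" && echo "$ID" )
    case "$id" in
        amzn)           echo yum ;;
        ubuntu|debian)  echo apt ;;
        *)              echo other ;;
    esac
}
```

A script can then branch on the result, for example running the yum commands only when the function prints `yum`.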

Step 5: Mount the EFS File System

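# Set these on each instance (variables from the earlier steps do not exist here)
EFS_ID=fs-xxxxxxxx    # replace with your file system ID
EFS_DNS_NAME=$EFS_ID.efs.us-west-2.amazonaws.com

# Create mount point
sudo mkdir -p /mnt/efs

# Mount using EFS mount helper (recommended)
sudo mount -t efs $EFS_ID:/ /mnt/efs

# Or mount using traditional NFS
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
    $EFS_DNS_NAME:/ /mnt/efs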

# Verify mount
df -h /mnt/efs
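
In scripts, `df` output is awkward to check, so reading the mount table directly can be more reliable. A small sketch, with a hypothetical function name, that succeeds only if the given path is mounted with an NFS file system type:

```shell
#!/bin/sh
# Succeeds if the mount point appears in the mount table with an nfs type.
# Arguments: mount-point [mounts-file] (defaults to /proc/mounts)
is_nfs_mounted() {
    awk -v mp="$1" \
        '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}
```

For example, `is_nfs_mounted /mnt/efs || echo "EFS is not mounted"` gives a clear signal before an application starts writing to a local directory by mistake.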

Step 6: Configure Automatic Mounting

# Add to /etc/fstab for automatic mounting at boot
echo "$EFS_ID.efs.us-west-2.amazonaws.com:/ /mnt/efs efs defaults,_netdev" | sudo tee -a /etc/fstab

# Test fstab entry
sudo umount /mnt/efs
sudo mount -a
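
Appending with `tee -a` adds a new line every time the command runs, so repeated runs leave duplicate entries. The sketch below adds an entry only when the mount point is not already listed. The function name is hypothetical, and writing to the real `/etc/fstab` requires root.

```shell
#!/bin/sh
# Append an fstab line unless its mount point is already listed.
# Arguments: fstab-line [fstab-file] (defaults to /etc/fstab)
add_fstab_entry() {
    line=$1
    file=${2:-/etc/fstab}
    mount_point=$(echo "$line" | awk '{ print $2 }')
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { f = 1 } END { exit !f }' "$file"; then
        echo "entry for $mount_point already present, skipping" >&2
    else
        echo "$line" >> "$file"
    fi
}
```

Trying it against a scratch copy of the file first is a cheap way to confirm the behavior before touching the real one.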

Step 7: Test File Sharing

# On instance 1, create a test file
echo "Hello from instance 1" | sudo tee /mnt/efs/test-file.txt

# On instance 2, verify the file is accessible
cat /mnt/efs/test-file.txt
# Should output: Hello from instance 1

# Test concurrent write access (both instances append, so neither overwrites the other)
# Instance 1
echo "Data from instance 1" | sudo tee -a /mnt/efs/shared-data.txt

# Instance 2 (simultaneously)
echo "Data from instance 2" | sudo tee -a /mnt/efs/shared-data.txt

# Verify both writes (the line order depends on which write landed first)
cat /mnt/efs/shared-data.txt
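
Appends from several NFS clients are not guaranteed to be atomic, so writers that share one file usually coordinate with a lock. The following sketch uses the `flock` utility from util-linux. The function name is made up, and it assumes your NFS client maps `flock` onto the advisory locks that EFS supports, which is worth verifying in your environment.

```shell
#!/bin/sh
# Append one line to a shared file while holding an exclusive lock on it.
# Arguments: file text
append_locked() {
    # flock opens (and creates if needed) the file, takes the lock, runs
    # the command, and releases the lock when the command exits
    flock "$1" sh -c 'printf "%s\n" "$2" >> "$1"' sh "$1" "$2"
}
```

On each instance the call would look like `append_locked /mnt/efs/shared-data.txt "Data from instance 1"`, run by a user with write access to the file.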

Step 8: Performance Testing

# Install performance testing tools
sudo yum install -y fio

# Test sequential write performance
sudo fio --name=sequential-write \
    --directory=/mnt/efs \
    --rw=write \
    --bs=1M \
    --size=1G \
    --numjobs=1 \
    --time_based \
    --runtime=60 \
    --group_reporting

# Test random read/write performance
sudo fio --name=random-rw \
    --directory=/mnt/efs \
    --rw=randrw \
    --bs=4K \
    --size=100M \
    --numjobs=4 \
    --time_based \
    --runtime=60 \
    --group_reporting

Advanced EFS Configurations

Encryption in Transit

# Mount with encryption in transit
sudo mount -t efs -o tls $EFS_ID:/ /mnt/efs

# Add to fstab with encryption (use this in place of the Step 6 entry,
# not in addition to it, so /mnt/efs is listed only once)
echo "$EFS_ID.efs.us-west-2.amazonaws.com:/ /mnt/efs efs defaults,_netdev,tls" | sudo tee -a /etc/fstab

Access Points

# Create an access point for application-specific access
aws efs create-access-point \
    --file-system-id $EFS_ID \
    --posix-user Uid=1000,Gid=1000 \
    --root-directory Path=/app-data,CreationInfo="{OwnerUid=1000,OwnerGid=1000,Permissions=755}" \
    --tags Key=Name,Value=AppDataAccessPoint
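
Creating the access point is only half of the picture; clients then mount through it with the `accesspoint` option of the EFS mount helper. A small sketch that assembles the command follows. The function name and both IDs are placeholders for illustration.

```shell
#!/bin/sh
# Print the mount command for mounting through an EFS access point.
# Arguments: file-system-id access-point-id mount-point
efs_access_point_mount_cmd() {
    echo "mount -t efs -o tls,accesspoint=$2 $1:/ $3"
}

efs_access_point_mount_cmd fs-0123456789abcdef0 fsap-0123456789abcdef0 /mnt/app-data
```

Printing the command rather than running it makes it easy to review before executing it with sudo on the instance.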

Lifecycle Management

# Configure lifecycle policy to transition files to IA storage
# (each policy entry may hold only one transition, so pass them as separate items)
aws efs put-lifecycle-configuration \
    --file-system-id $EFS_ID \
    --lifecycle-policies "TransitionToIA=AFTER_30_DAYS" "TransitionToPrimaryStorageClass=AFTER_1_ACCESS"

Monitoring and Troubleshooting

CloudWatch Metrics

Key EFS metrics to monitor:

  • DataReadIOBytes - Total bytes read

  • DataWriteIOBytes - Total bytes written

  • MetadataIOBytes - Bytes for metadata operations

  • TotalIOBytes - Total bytes across data read, data write, and metadata operations

  • PercentIOLimit - Percentage of the I/O limit used (General Purpose mode only)
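
The byte metrics are reported as totals per period, so they need a small conversion before they can be compared with a throughput figure. A minimal sketch, with a made-up function name, that turns the Sum statistic of a byte metric into an average rate:

```shell
#!/bin/sh
# Convert a CloudWatch Sum of bytes over one period into average MiB/s.
# Arguments: sum-of-bytes period-seconds
io_bytes_to_mibps() {
    awk -v b="$1" -v p="$2" 'BEGIN { printf "%.1f\n", b / p / 1048576 }'
}

# Example: 31,457,280,000 bytes read during a 300-second period
io_bytes_to_mibps 31457280000 300
```

Comparing this figure with the file system's baseline or provisioned throughput shows how much headroom is left.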


Common Issues and Solutions

Mount Timeout Issues:

# Check security group rules
aws ec2 describe-security-groups --group-ids $EFS_SG

# Verify mount target state
aws efs describe-mount-targets --file-system-id $EFS_ID

# Test connectivity
telnet $EFS_DNS_NAME 2049
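
Many minimal images ship without telnet. As an alternative sketch, bash's built-in `/dev/tcp` redirection can probe the NFS port with no extra packages. The function name is hypothetical, and it assumes `bash` and the coreutils `timeout` command are installed.

```shell
#!/bin/sh
# Succeeds if a TCP connection to the host and port opens within the timeout.
# Arguments: host [port] [timeout-seconds]
nfs_port_open() {
    # Inside bash -c, $0 is the host and $1 is the port
    timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' \
        "$1" "${2:-2049}" 2>/dev/null
}
```

A timeout here usually points at security group rules, while an immediate refusal suggests the address is reachable but nothing is listening on the port.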

Performance Issues:

# Check throughput mode and utilization
aws efs describe-file-systems --file-system-id $EFS_ID

# Monitor CloudWatch metrics (last hour; date -d is GNU date syntax)
aws cloudwatch get-metric-statistics \
    --namespace AWS/EFS \
    --metric-name TotalIOBytes \
    --dimensions Name=FileSystemId,Value=$EFS_ID \
    --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
    --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
    --period 300 \
    --statistics Sum
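
For file systems in bursting mode, an alarm on the remaining burst credits can give warning before throughput drops to the baseline. The sketch below wraps the CloudWatch call in a function. The function name, alarm name, and evaluation settings are arbitrary choices for this example, and the threshold is supplied by the caller in bytes.

```shell
#!/bin/sh
# Create an alarm that fires when burst credits fall below a threshold.
# Arguments: file-system-id threshold-bytes
create_burst_credit_alarm() {
    aws cloudwatch put-metric-alarm \
        --alarm-name "efs-low-burst-credits-$1" \
        --namespace AWS/EFS \
        --metric-name BurstCreditBalance \
        --dimensions "Name=FileSystemId,Value=$1" \
        --statistic Average \
        --period 300 \
        --evaluation-periods 3 \
        --threshold "$2" \
        --comparison-operator LessThanThreshold
}
```

The alarm has no notification target as written; adding `--alarm-actions` with an SNS topic ARN would route it to your team.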

Best Practices and Recommendations

Security Best Practices

  1. Use VPC Security Groups: Configure restrictive security groups for mount targets

  2. Enable Encryption: Use encryption at rest and in transit for sensitive data

  3. Access Points: Use access points for fine-grained access control

  4. IAM Policies: Implement proper IAM policies for EFS access


Performance Optimization

  1. Choose the Right Mode: Select appropriate performance and throughput modes

  2. Local Mounting: Mount through the mount target in each instance's own Availability Zone to avoid cross-AZ latency and data transfer charges

  3. Concurrent Access: Design applications for concurrent file system access

  4. Caching: Implement application-level caching when appropriate


Cost Optimization

  1. Storage Classes: Use IA or Archive storage classes for infrequently accessed data

  2. Lifecycle Policies: Implement automatic transitions to lower-cost storage classes

  3. Throughput Right-sizing: Monitor and adjust provisioned throughput based on usage

  4. Regular Cleanup: Remove unnecessary files and directories


Conclusion

Choosing between EFS and FSx depends on your specific requirements:

  • Choose EFS for Linux-based workloads requiring shared NFS storage with high availability and scalability

  • Choose FSx for Windows for Windows-based applications requiring SMB protocol and Active Directory integration

  • Choose FSx for Lustre for high-performance computing workloads requiring extreme throughput and low latency


The hands-on setup demonstrates how EFS provides seamless file sharing across multiple EC2 instances with minimal configuration. By understanding the performance characteristics, use cases, and cost implications of each option, you can make informed decisions about which AWS file system best fits your architecture.


In the next blog post, we'll explore AWS backup strategies and disaster recovery patterns for your storage infrastructure.


This is part of our AWS Storage Deep Dive series. Follow along for comprehensive coverage of AWS storage services and best practices.
