This article discusses some of the common file size limitations for PCs, and gives some brief examples. The discussion is limited to Microsoft products, and does not include the HPFS file system. If you want to skip the technical discussion and just see the bottom line, there is a table of limits in the Summary.
1. The maximum disk drive size supported by the hardware and OS (Operating System).
2. The maximum volume size supported by the OS.
3. The maximum file size supported by the OS.
4. The maximum file size supported by the application program.
Each of these is discussed in detail below. In most cases the last two impose the smallest limits.
The maximum file size supported by the OS may also be limited by the method used to access the file, the number of files open, the amount of physical memory in the system, buffer and virtual memory settings, the number of processes running, and so on. On servers, some of these limits apply to the combined total for all active users. For simplicity, this article does not deal with limitations caused by individual installations or by multiple users on a server; it deals only with the fundamental limits of the system.
Unless noted, this article uses binary values for KB, MB, and GB. That is, 1 KB = 1024 bytes, 1 MB = 1,048,576 bytes and 1 GB = 1,073,741,824 bytes. When necessary to make the distinction between decimal and binary, we will use "GB" to refer to the binary value, and "gigabytes" to refer to the decimal value. See "What is a Gigabyte?" for a discussion of binary vs decimal notation.
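The distinction between binary and decimal units can be sketched in a few lines of Python. This is only an illustration of the convention stated above; the function names are our own and are not part of any product discussed here.

```python
# Binary units, as used throughout this article.
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3


def to_binary_gb(num_bytes):
    """Express a byte count in binary GB (2**30 bytes each)."""
    return num_bytes / GB


def to_decimal_gigabytes(num_bytes):
    """Express a byte count in decimal gigabytes (10**9 bytes each)."""
    return num_bytes / 10 ** 9


if __name__ == "__main__":
    size = 1 * GB
    print(f"{size:,} bytes = {to_binary_gb(size):.2f} GB "
          f"= {to_decimal_gigabytes(size):.2f} gigabytes")
```

The same byte count therefore reads about 7% larger when quoted in decimal gigabytes, which is why drive limits are often known by two different numbers.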
There have been many drive size limits over the years, for many different reasons -- too many to discuss in detail here. Common limits were 32 MB, 504 MB, 7.8 GB, 32 GB, and more recently 128 GB. In the past these drive size issues limited file size, but that's seldom the case today, and this consideration has all but vanished. Consequently, we will give only two brief examples: one of an OS limit and one of a hardware limit.
An example of an OS limit is the MSDOS 7.8 GB limit (commonly called the "8.4 gigabyte" limit). No matter how large a drive you connect, MSDOS won't see it as larger than 7.8 GB. This is due to the INT 13 CHS addressing used in all versions of MSDOS, and installing a new motherboard or BIOS won't help.
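As a rough check on the two figures quoted for this limit, the arithmetic can be sketched as follows. The geometry used here (1024 cylinders, 255 heads, 63 sectors per track, 512-byte sectors) is not stated in this article; it is the commonly cited INT 13 CHS maximum and should be read as an assumption.

```python
BYTES_PER_SECTOR = 512          # assumed sector size
GB = 1024 ** 3                  # binary GB, as used in this article


def chs_capacity(cylinders, heads, sectors_per_track):
    """Bytes addressable with the given cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * BYTES_PER_SECTOR


# Assumed INT 13 CHS maximum geometry (not taken from the article).
limit = chs_capacity(cylinders=1024, heads=255, sectors_per_track=63)

if __name__ == "__main__":
    print(f"{limit:,} bytes")                        # 8,422,686,720 bytes
    print(f"{limit / GB:.1f} GB (binary)")           # 7.8 GB
    print(f"{limit / 10 ** 9:.1f} gigabytes (decimal)")  # 8.4 gigabytes
```

Under that assumption, the "7.8 GB" and "8.4 gigabyte" names describe the same number of bytes.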
The only hardware boundary encountered on current systems is the 128 GB IDE limit caused by the 28 bits of LBA addressing of some BIOSes. Breaking this barrier requires a new 48 bit addressing method. Most hardware made since about 2000 can address IDE drives up to 128 GB, and most new (2003) designs can handle drives far larger than 128 GB, using 48 bit addressing. A firmware upgrade will allow many older controllers to access the new large drives.
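The 128 GB figure follows directly from the address width. A minimal sketch, assuming the usual 512-byte sector (a value this article does not state):

```python
BYTES_PER_SECTOR = 512   # assumed sector size
GB = 1024 ** 3
TB = 1024 ** 4


def lba_capacity(address_bits):
    """Bytes addressable when sectors are numbered with `address_bits` bits."""
    return (2 ** address_bits) * BYTES_PER_SECTOR


if __name__ == "__main__":
    print(f"28-bit LBA: {lba_capacity(28) / GB:.0f} GB")
    print(f"48-bit LBA: {lba_capacity(48) / TB:,.0f} TB")
```

With 28 bits the result is exactly 128 GB (about 137 decimal gigabytes); with 48 bits the addressable range is far beyond any drive available at the time of writing.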
For FAT volumes, the limit is determined by the maximum number of clusters that can be addressed, times the cluster size. A FAT-16 volume, which uses a 16 bit pointer for the File Allocation Table (FAT), can have almost 64K clusters (2^16 = 64K). The maximum cluster size is usually 32K, so the maximum volume size is 64K clusters times 32K bytes, or 2 GB. Some operating systems permit 64K clusters, so their FAT-16 disks can support 4 GB volumes. FAT-32 volumes use a 32 bit pointer, and can generally be up to 2 TB in size.
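The FAT-16 arithmetic above can be written out as a short sketch. It uses the full 2^16 cluster count as an upper bound, as the text does; the real number of usable clusters is slightly lower, which is why the text says "almost 64K". The function name is our own.

```python
KB = 1024
GB = 1024 ** 3


def fat_volume_limit(pointer_bits, cluster_bytes):
    """Upper bound on volume size: addressable clusters times cluster size."""
    return (2 ** pointer_bits) * cluster_bytes


if __name__ == "__main__":
    # 64K clusters of 32K bytes each
    print(fat_volume_limit(16, 32 * KB) // GB, "GB")
    # 64K clusters of 64K bytes each, on systems that permit them
    print(fat_volume_limit(16, 64 * KB) // GB, "GB")
```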
NTFS volumes are structured in such a way that they are not limited by a fixed-size map like the FAT volumes are. Microsoft claims an NTFS volume can be up to 16 exabytes, but in any case, it can be at least 2 TB in size. Some systems such as NT and W2K allow you to combine multiple disk drives to form one NTFS volume. So even if your disk controller BIOS has a 128 GB limit, you can combine, say, four 120 GB drives into one 480 GB volume.
1. Fundamental OS limits (the code in the kernel).
2. The disk formatting scheme.
Now things start to get more complex. The "fundamental OS limits" are the limits imposed by the code in the OS. These include such things as pointer size (which determines how much space can be addressed), access methods, buffering schemes, buffer sizes, etc. The "disk formatting scheme" refers to the disk's file structure, such as FAT or NTFS, which imposes various limits on both volume size and file size.
The technical details are lengthy, and far beyond the scope of this article, but we will present a couple of examples to illustrate these issues. Actual limits are shown in the Summary, below.
Example 1: MSDOS
MSDOS uses a file handle that has 32 bits available for addressing the file. This would limit MSDOS file size to 4 GB (2^32 = 4 GB), but because the FAT-16 volume size which MSDOS uses is limited to 2 GB, effectively MSDOS is limited to a 2 GB file. This is a case where the disk formatting scheme limits the file size. (In fact, it's possible for MSDOS to create 4 GB files on a network server.)
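The reasoning in this example, that the usable limit is the smallest of the limits that apply, can be sketched as follows. The function and variable names are hypothetical and chosen only for illustration.

```python
GB = 1024 ** 3


def effective_file_limit(*limits):
    """The usable maximum file size is the smallest limit that applies."""
    return min(limits)


handle_limit = 2 ** 32           # 32 bits of file addressing: 4 GB
fat16_volume_limit = 2 * GB      # FAT-16 volume size under MSDOS

msdos_limit = effective_file_limit(handle_limit, fat16_volume_limit)

if __name__ == "__main__":
    print(msdos_limit // GB, "GB")   # 2 GB
```

On a network server the FAT-16 volume limit drops out of the list, which is why the 4 GB addressing limit then becomes the effective one.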
(It should be noted that some third-party vendors found ways around some Microsoft limits, using tricks such as increasing the cluster size. This article deals only with unmodified Microsoft products.)
Example 2: NT & Windows 2000
The issues with NT become much more complex. NT supports both FAT and NTFS file systems. FAT file systems are limited to files of 2 or 4 GB, so the file system becomes the limiting factor when you use a FAT volume. But NTFS supports large files. The OS itself has various limits built in, and it becomes the limiting factor when using NTFS volumes.
Microsoft states for NTFS: "Volumes much larger than 2 terabytes (TB) are possible" and "File size limited only by size of volume." This implies you can have files "much larger than 2 TB" on NT. Although this is a true statement for the NTFS volume, files that large are not directly supported by the NT or Windows 2000 operating systems. When using normal disk I/O calls to NT 4, you are limited to about 92 GB for all open files, and about twice that for Windows 2000, due to hard-coded buffer limitations in the OS. (We suspect a Service Pack has increased the NT limit, but can't find any documentation on it.) To open larger files you have to bypass the OS disk I/O routines and use DLLs. In the extreme case you can only read and write raw sectors, and have to perform your own blocking and deblocking (basically write your own disk I/O routines). Although you can have a 1 TB file on an NT system, you can't do much with it. You can't even COPY it, or TYPE it, and you certainly can't open it with most normal applications.
It should be noted that some operations will work (or appear to work) past these limits, but others will fail. For instance, sequential reads or writes may work past the stated limit, but an attempt to reposition in the file will fail, and closing the file may fail as well.
We once tried creating a large file on NT 4 using QuickBasic. Since QB is an MSDOS program, we expected it to fail at 4 GB. While some operations did fail at 4 GB, we were quite surprised to find we were able to create files over 100 GB using a simple "print" statement. But when we tried to close the file, we got a "Bad record number..." error. However, as we expected, QuickBasic fails immediately with a "Path/File access error...." if you try to open a file over 4 GB. All I/O operations appear to work correctly up to 4 GB.
Visual Studio (at least the version we know) on NT can open any type of file in any mode, up to 2 GB. Over 2 GB some modes or operations fail, while others will work. You can read and write text files with the "OpenTextFile" method up to 8 GB, but cannot access binary files with that method. Over 8 GB for text files, or 2 GB for binary files, you need to use DLL calls for file I/O.
With the understanding that this is a complex issue, and there are many exceptions to such statements, here is what we can say in general:
* When we say "applications will work up to 2 GB", this doesn't mean, for example, that your favorite word processor can edit files of 2 GB. It means the file I/O will work, but other code in the word processor may limit how large a file you can actually edit.
Our experience with programs that claim to handle "files of unlimited size" is that many have failed at either 2 or 4 GB, and most have failed over 8 GB. And as mentioned in the section above, NT and Windows 2000 have hard-coded buffer limits.
There has recently (2003) been a trend towards supporting larger files. PKZIP and WinZip have just introduced new versions that will handle big files. We have successfully used PKZIP on files over 180 GB and expect WinZip would work as well. But keep in mind those files cannot be unzipped to a Windows 98 system or any drive with a FAT file structure.
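One way to guard against that last problem is to inspect an archive before extracting it to a FAT drive. The sketch below uses Python's standard `zipfile` module; the 4 GB threshold is taken from the Summary table, and the exact boundary (whether a file of exactly 4 GB fits) is an assumption. The function name is hypothetical.

```python
import io
import zipfile

FAT_MAX_FILE_BYTES = 4 * 1024 ** 3   # 4 GB, from the Summary table (boundary assumed)


def oversized_members(zip_source, max_file_bytes=FAT_MAX_FILE_BYTES):
    """Return names of archive members whose unzipped size exceeds the limit."""
    with zipfile.ZipFile(zip_source) as archive:
        return [info.filename
                for info in archive.infolist()
                if info.file_size > max_file_bytes]


if __name__ == "__main__":
    # Small in-memory archive, checked against a tiny limit for demonstration.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("small.txt", b"x" * 10)
        archive.writestr("big.bin", b"x" * 100)
    buffer.seek(0)
    print(oversized_members(buffer, max_file_bytes=50))
```

Any name the function returns identifies a file that should be extracted to an NTFS volume instead.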
All Microsoft operating systems from MSDOS 5 to XP can access files up to 2 GB without restriction. Systems using FAT-32 can access files up to 4 GB if the application program supports it. Systems using NTFS can access files of up to 2 to 4 GB via normal disk I/O calls under all conditions, and can access larger files under certain conditions. Over 4 or 8 GB, however, NTFS systems usually require the use of DLLs for file I/O. More details can be found in the section "Maximum OS File Size", above.
Many applications and even some languages have trouble accessing files over 2 GB, but the main barrier is at 4 GB because 32 bit pointers are common, and 2^32 = 4 GB. Only programs specifically written to access large files will break the 4 GB barrier. And, of course, only on NTFS volumes, not on FAT volumes.
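The 2 GB and 4 GB barriers correspond to signed and unsigned 32-bit values. A minimal sketch using Python's standard `struct` module shows which file offsets fit in each width; the helper name `fits` is our own.

```python
import struct


def fits(fmt, offset):
    """True if `offset` can be stored in the fixed-width integer format `fmt`."""
    try:
        struct.pack(fmt, offset)
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    five_billion = 5_000_000_000
    print("signed 32-bit:  ", fits("<i", five_billion))   # False
    print("unsigned 32-bit:", fits("<I", five_billion))   # False
    print("signed 64-bit:  ", fits("<q", five_billion))   # True
```

A program that keeps file positions in a signed 32-bit variable fails at 2 GB, one that uses an unsigned 32-bit variable fails at 4 GB, and only a 64-bit value covers larger files.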
Failure modes and messages vary. Most often you will get a simple read or write failure message, but you may see a less meaningful message like "Bad record number", "Error positioning in file", or something similar. Sometimes a write will report "No space left on device", and sometimes the application will appear to terminate normally, but the file size will be truncated when writing, or you will be missing records on input.
You should also be aware that accessing large files over a network has additional considerations. For example, when Windows 98 accesses an NT directory via the network, it sees file size modulo 4 GB, so a file of 5,000,000,000 bytes on the NT system will appear as a file of 705,032,704 bytes (5,000,000,000 - 4 GB). The application program also sees this size, and in many cases it will read either 705,032,704 bytes, or 4 GB, then terminate "normally", giving no clue that data was lost.
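The modulo effect described above is easy to reproduce. This sketch simply models a size field that keeps only the low 32 bits; it does not reflect any particular network protocol, and the function name is hypothetical.

```python
def apparent_size(true_size, bits=32):
    """Size as seen through a field that keeps only the low `bits` bits."""
    return true_size % (1 << bits)


if __name__ == "__main__":
    print(f"{apparent_size(5_000_000_000):,}")   # 705,032,704
```

Note that a file of exactly 4 GB would appear to be empty under the same rule.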
The best advice we can give when working with large files is to check everything you do. Check each file you create for the correct size, and check that the number of records you read from a file is correct. If you're writing a program, remember to use data types that can handle the large values required.
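As an illustration of this advice, the sketch below counts bytes and records while writing, then compares both against what is actually on disk. The record format (one record per line) and the function names are assumptions made for the example. Python integers do not overflow, but in languages with fixed-width types the counters would need to be 64-bit.

```python
import os
import tempfile


def write_records(path, records):
    """Write one record per line; return (bytes_written, record_count)."""
    total_bytes = 0
    count = 0
    with open(path, "wb") as out:
        for record in records:
            line = record + b"\n"
            out.write(line)
            total_bytes += len(line)
            count += 1
    return total_bytes, count


def verify_file(path, expected_bytes, expected_records):
    """Check the size on disk and the number of records read back."""
    actual_bytes = os.path.getsize(path)
    with open(path, "rb") as src:
        actual_records = sum(1 for _ in src)
    return actual_bytes == expected_bytes and actual_records == expected_records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.dat")
        nbytes, nrecords = write_records(path, [b"alpha", b"beta", b"gamma"])
        print("file is intact:", verify_file(path, nbytes, nrecords))
```

A mismatch in either number is the kind of silent truncation described above, and is much easier to catch at this point than after the data has been passed along.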
The table below lists the maximum file and volume sizes for various disk structures and operating systems. Notice that the volume size of the FAT structure is also dependent on the operating system. As noted above, there are many exceptions and complexities that could alter these limits, most notably the application program used.
| Disk Structure | Operating Systems | Maximum Volume Size | Maximum File Size |
|---|---|---|---|
| FAT-16 | MSDOS 5-up, Win-95, 98, ME | 2 GB (Note 1) | 2 GB (Note 2) |
| FAT-16 | NT 3.51, NT 4, W2K, XP | 4 GB | 4 GB |
| FAT-32 | Win 95 OSR2, 98, ME, XP | 2 TB | 4 GB |
| FAT-32 | W2K | 32 GB (Note 3) | 4 GB |
| NTFS | NT 4, W2K, XP | Over 2 TB (Note 4) | See Note 5 |
Notes:
For more articles on data conversion, see our TechTalk Index.
Author's note: The original draft of this article attempted to explain many more technical issues, but became unbearably long and complex. This revised version, on the other hand, is superficial in many ways. I would appreciate your feedback on what is most useful to you; would you like to see more technical details, or just a summary of the limits? Please email your comments to: Thank you.
Our Large File Conversion Services
Disc Interchange Service Company has experience handling large files. We can perform most normal operations on files up to 100 GB, and can perform limited operations on files up to 1 TB. Large files can be split into smaller segments to meet your needs. The data can be written to a USB or Firewire drive, or to DVDs if you prefer.
Disc Interchange Service Company, Inc.
Media Conversion Specialists
15 Stony Brook Road
Westford, MA 01886
(978) 692-0050