In any computer system, there are two main places where data can be stored for later retrieval: RAM and hard disk space.
The difference between data in RAM and data stored on a disk is that RAM is volatile. If you were to open up, say, a stick of RAM, you could not point to any part of it and say "Here be data", because the data exists only as electrical charge: it is lost when the power goes off, and much of it is overwritten long before that during its short, volatile existence. This data gives applications the "short-term" memory they need to function, specifically so that they can remember where they were up to, so to speak.
However, a computer has only so much of this volatile memory. When it runs out, the operating system makes more room by using the hard drive: data that is not currently needed is written to disk, read back when necessary, and discarded automatically once it is no longer needed.
The major problem with any form of disk-backed virtual memory is that a disk is far slower than RAM. It does, however, allow more applications to run side by side than would fit in physical memory alone, and is particularly useful for memory-hungry programs such as image and video editors.
Swapping data out in this manner is made possible by virtual memory. The MemoryManagementUnit of a processor allows an OperatingSystem Kernel to put each process into its own virtual address space, so that it appears to the process that it has access to all of the physical memory of the machine. The kernel maintains a mapping of these virtual addresses to the real physical ones, and may also swap data out of physical memory onto disk. When a program attempts to access memory that has been swapped out, the access generates a PageFault, which causes the processor to call an interrupt handler installed by the kernel. The kernel then finds the memory page that was swapped out to disk and puts it back into physical memory, updating the virtual-to-physical mapping.
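The fault-and-reload cycle can be illustrated with a small simulation. This is only a sketch under stated assumptions: the ToyPager class, the 4096-byte page size, and the least-recently-used eviction rule are all invented here for illustration, and a real kernel does this work in cooperation with the hardware rather than in a Python dictionary.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical, for illustration)


class ToyPager:
    """Toy model of one address space with demand paging (not a real kernel API)."""

    def __init__(self, num_frames):
        # Virtual page -> physical frame, ordered from least to most recently used.
        self.page_table = OrderedDict()
        self.free_frames = list(range(num_frames))
        self.swapped_out = set()  # virtual pages whose contents currently live on disk
        self.page_faults = 0

    def translate(self, virtual_address):
        """Return the physical address for a virtual address, faulting if needed."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.page_table:
            self.page_table.move_to_end(page)  # mark as most recently used
        else:
            self._handle_page_fault(page)
        return self.page_table[page] * PAGE_SIZE + offset

    def _handle_page_fault(self, page):
        """Stand-in for the kernel's handler: find a frame and map the page into it."""
        self.page_faults += 1
        if self.free_frames:
            frame = self.free_frames.pop(0)
        else:
            # No free frame: evict the least recently used page to "disk".
            victim, frame = self.page_table.popitem(last=False)
            self.swapped_out.add(victim)
        self.swapped_out.discard(page)  # the page is back in physical memory
        self.page_table[page] = frame


pager = ToyPager(num_frames=2)
for address in (0, PAGE_SIZE + 10, 2 * PAGE_SIZE + 5, 0):
    print(hex(address), "->", hex(pager.translate(address)))
print("page faults:", pager.page_faults)
```

With only two frames, the third access forces the first page out to disk, and touching that page again triggers another fault that brings it back in, at the cost of evicting something else.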
This also has the nice side effect of making processes unable to access the memory of other programs running on the same system, unless the kernel provides a way for applications to share memory. Even then, the programs must explicitly request that some chunk of memory be shared between them, which makes this more secure than operating in a flat memory space.
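The isolation described above follows from each process having its own mapping. The following sketch assumes a hypothetical ToyKernel class with made-up method names (spawn, map_private, share, translate); it models only the bookkeeping, not any real operating system interface.

```python
class ToyKernel:
    """Toy model of per-process page tables (hypothetical, for illustration only)."""

    def __init__(self):
        self.next_frame = 0
        self.page_tables = {}  # process id -> {virtual page: physical frame}

    def spawn(self, pid):
        """Give a new process its own, initially empty, address space."""
        self.page_tables[pid] = {}

    def map_private(self, pid, page):
        """Back a virtual page with a fresh physical frame owned by one process."""
        frame = self.next_frame
        self.next_frame += 1
        self.page_tables[pid][page] = frame
        return frame

    def share(self, owner_pid, owner_page, other_pid, other_page):
        """Explicitly map one of the owner's frames into another process."""
        frame = self.page_tables[owner_pid][owner_page]
        self.page_tables[other_pid][other_page] = frame
        return frame

    def translate(self, pid, page):
        """Return the frame behind a virtual page, or None if nothing is mapped."""
        return self.page_tables[pid].get(page)


kernel = ToyKernel()
kernel.spawn("editor")
kernel.spawn("browser")
kernel.map_private("editor", 0)
kernel.map_private("browser", 0)

# The same virtual page number refers to different physical frames in each process.
print(kernel.translate("editor", 0), kernel.translate("browser", 0))

# Sharing happens only when it is requested explicitly.
kernel.share("editor", 0, "browser", 7)
print(kernel.translate("browser", 7) == kernel.translate("editor", 0))
```

Because a process can only name addresses that appear in its own table, it has no way to refer to another process's frames unless the kernel has deliberately mapped them in.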
