NVIDIA Unveils Next Generation "Fermi" GPU Architecture


NVIDIA has begun to disclose some information regarding its next generation GPU architecture, codenamed "Fermi". Actual product names and specifications were not disclosed, nor was performance in 3D games, but NVIDIA did discuss high-level details of the architecture, its strong focus on compute performance, and its broader compatibility with computational applications.

The GPU codenamed Fermi will feature over 3 billion transistors and be produced using TSMC's 40nm process. For comparison, AMD's new RV870 has 2.15 billion transistors and is also manufactured at 40nm, so Fermi will be significantly larger and more expensive to produce. Fermi will be outfitted with 512 cores, more than double the number in the GT200. It will also offer 8x the GT200's peak double-precision compute performance. In addition, Fermi is the first GPU architecture to support ECC, so it can compensate for some errors and potentially scale to higher densities, and it will be able to execute C++ code.

As the foundation for NVIDIA's family of next generation GPUs, namely GeForce, Quadro and Tesla, "Fermi" features a host of new technologies that are "must-have" features for the computing space, including:

*C++, complementing existing support for C, Fortran, Java, Python, OpenCL and DirectCompute.

*ECC, a critical requirement for datacenters and supercomputing centers deploying GPUs on a large scale

*512 CUDA Cores featuring the new IEEE 754-2008 floating-point standard, surpassing even the most advanced CPUs

*8x the peak double precision arithmetic performance over NVIDIA's last generation GPU. Double precision is critical for high-performance computing (HPC) applications such as linear algebra, numerical simulation, and quantum chemistry

*NVIDIA Parallel DataCache - the world's first true cache hierarchy in a GPU that speeds up algorithms such as physics solvers, raytracing, and sparse matrix multiplication where data addresses are not known beforehand

*NVIDIA GigaThread Engine with support for concurrent kernel execution, where different kernels of the same application context can execute on the GPU at the same time (e.g. PhysX fluid and rigid body solvers)

*Nexus - the world's first fully integrated heterogeneous computing application development environment within Microsoft Visual Studio
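
The IEEE 754-2008 item in the list above can be made a little more concrete. One operation that the 2008 revision added is fused multiply-add, which computes a*b + c with a single rounding instead of two. The announcement does not say which parts of the standard Fermi implements, so the following is only a minimal host-side C++ sketch of what that operation does, not Fermi code. The function names mul_then_add and fused_mul_add are hypothetical and chosen for illustration.

```cpp
#include <cmath>

// Illustration only: runs on the CPU and says nothing about Fermi hardware.

// Computes a*b + c with two roundings: the product is rounded to double
// first, then the sum is rounded again.
double mul_then_add(double a, double b, double c) {
    // volatile keeps the compiler from contracting this into a fused operation
    volatile double product = a * b;
    return product + c;
}

// Computes a*b + c with a single rounding, as the fused multiply-add
// operation of IEEE 754-2008 specifies.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-30 and c = -(1 + 2^-29), the exact value of a*a + c is 2^-60. The two-step version returns 0 because the 2^-60 term is lost when the product is rounded to double, while the fused version returns 2^-60. Differences of this kind are one reason double-precision work such as linear algebra benefits from the newer standard.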

Source
