
NVIDIA Doubles Performance for Deep Learning Training



New Releases of DIGITS, cuDNN to Deliver 2x Faster Neural Network Training; cuDNN to Enable More Sophisticated Models

SINGAPORE — July 8, 2015 — NVIDIA today announced updates to its GPU-accelerated deep learning software that will double deep learning training performance. 

The new software will empower data scientists and researchers to supercharge their deep learning projects and product development work by creating more accurate neural networks through faster model training and more sophisticated model design.

The NVIDIA® DIGITS™ Deep Learning GPU Training System version 2 (DIGITS 2) and NVIDIA CUDA® Deep Neural Network library version 3 (cuDNN 3) provide significant performance enhancements and new capabilities.

For data scientists, DIGITS 2 now delivers automatic scaling of neural network training across multiple high-performance GPUs. This can double the speed of deep neural network training for image classification compared to a single GPU.

For deep learning researchers, cuDNN 3 features optimized data storage in GPU memory for the training of larger, more sophisticated neural networks. cuDNN 3 also provides higher performance than cuDNN 2, enabling researchers to train neural networks up to two times faster on a single GPU.

The new cuDNN 3 library is expected to be integrated into forthcoming versions of the deep learning frameworks Caffe, Minerva, Theano and Torch, which are widely used to train deep neural networks.

“High-performance GPUs are the foundational technology powering deep learning research and product development at universities and major web-service companies,” said Ian Buck, vice president of Accelerated Computing at NVIDIA. “We’re working closely with data scientists, framework developers and the deep learning community to apply the most powerful GPU technologies and push the bounds of what is possible.”

DIGITS 2 – Up to 2x Faster Training with Automatic Multi-GPU Scaling
DIGITS 2 is the first all-in-one graphical system that guides users through the process of designing, training and validating deep neural networks for image classification.

The new automatic multi-GPU scaling capability in DIGITS 2 maximises the available GPU resources by automatically distributing the deep learning training workload across all of the GPUs in the system. Using DIGITS 2, NVIDIA engineers trained the well-known AlexNet neural network model more than two times faster on four NVIDIA Maxwell™ architecture-based GPUs, compared to a single GPU.1 Early customers are reporting even better results.

“Training one of our deep nets for auto-tagging on a single NVIDIA GeForce GTX TITAN X takes about sixteen days, but using the new automatic multi-GPU scaling on four TITAN X GPUs the training completes in just five days,” said Simon Osindero, A.I. architect at Yahoo's Flickr. “This is a major advantage and allows us to see results faster, as well as letting us more extensively explore the space of models to achieve higher accuracy.”
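The multi-GPU scaling described above is a form of data parallelism: each GPU processes a slice of the batch, and the per-slice gradients are combined before the shared weights are updated. The sketch below illustrates the idea with plain NumPy (simulated workers and a hypothetical linear model, not DIGITS' actual implementation):

```python
import numpy as np

# Illustrative data-parallel training step. Each "GPU" (simulated here
# as a loop iteration) computes the gradient on its shard of the batch;
# the shard gradients are then averaged. All names and shapes are
# hypothetical, chosen only to demonstrate the technique.
rng = np.random.default_rng(0)
n_gpus, batch, dim = 4, 64, 8
X = rng.normal(size=(batch, dim))
y = rng.normal(size=batch)
w = np.zeros(dim)  # shared model weights

def gradient(w, X_shard, y_shard):
    """Gradient of mean-squared error for a linear model."""
    residual = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ residual / len(y_shard)

# Split the batch into one equal shard per GPU and average the shard
# gradients; with equal shard sizes this exactly equals the gradient
# a single GPU would compute over the whole batch.
shards = np.array_split(np.arange(batch), n_gpus)
grads = [gradient(w, X[idx], y[idx]) for idx in shards]
multi_gpu_grad = np.mean(grads, axis=0)
single_gpu_grad = gradient(w, X, y)
print(np.allclose(multi_gpu_grad, single_gpu_grad))  # → True
```

Because the averaged gradient matches the single-GPU gradient, the model converges the same way while each device touches only a quarter of the data per step, which is where the speedup comes from.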

cuDNN 3 – Train Larger, More Sophisticated Models Faster
cuDNN is a GPU-accelerated library of mathematical routines for deep neural networks that developers integrate into higher-level machine learning frameworks.

cuDNN 3 adds support for 16-bit floating point data storage in GPU memory, doubling the amount of data that can be stored and optimising memory bandwidth. With this capability, cuDNN 3 enables researchers to train larger and more sophisticated neural networks.

“We believe FP16 GPU storage support in NVIDIA’s libraries will enable us to scale our models even further, since it will increase effective memory capacity of our hardware and improve efficiency as we scale training of a single model to many GPUs,” said Bryan Catanzaro, senior researcher at Baidu Research. “This will lead to further improvements in the accuracy of our models.”
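The memory effect of FP16 storage is straightforward to verify: a half-precision value occupies 2 bytes instead of 4, so the same buffer holds twice as many parameters or activations. A quick NumPy check (illustrative only; cuDNN manages this storage on the GPU):

```python
import numpy as np

# FP16 storage halves the memory footprint of weights and activations,
# which is what lets a fixed amount of GPU memory hold a larger model.
weights_fp32 = np.ones(1_000_000, dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4000000 bytes (4 bytes per element)
print(weights_fp16.nbytes)  # 2000000 bytes, half the storage
```

The same halving applies to memory bandwidth: each transfer moves twice as many values, which is the "improve efficiency" aspect Catanzaro refers to above.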

cuDNN 3 also delivers significant performance speedups compared to cuDNN 2 for training neural networks on a single GPU. It enabled NVIDIA engineers to train the AlexNet model two times faster on a single NVIDIA GeForce® GTX™ TITAN X GPU.2

Availability
The DIGITS 2 Preview release is available today as a free download for NVIDIA registered developers. To learn more or download, visit the DIGITS website.

The cuDNN 3 library is expected to be available in major deep learning frameworks in the coming months. To learn more, visit the cuDNN website.

G.Skill OC WR Stage 2015 – MSI Day Extreme Overclockers break 7 Global Top Scores on the brand new X99A GODLIKE GAMING!


MSI, a world-leading manufacturer of motherboards and graphics cards, is proud to officially announce that the internationally recognized overclockers Benchbros, Wizerty, Vivi and Zzolio broke 7 Global Top Scores and hardware records during the G.Skill Overclocking World Record Stage 2015. To achieve this astonishing performance during the 4th annual G.Skill OC Event, these professional extreme overclockers used the brand new high-end MSI X99A GODLIKE GAMING and the multi-award-winning MSI X99A XPOWER AC motherboards.

One Board to Rule Them All

MSI X99A GODLIKE GAMING

Thanks to MSI hero products and G.Skill DDR4 memory, top overclockers Wizerty, Benchbros, Vivi and Zzolio, from France, Germany, South Africa and Denmark respectively, were able to push their skills and successfully set record-breaking scores live on the very first day of Computex, proving once again that MSI products not only deliver the best of both the OC and GAMING worlds to users, but also bring exclusive features, overclocking headroom and performance to every budget. The brand new MSI high-end X99A GODLIKE GAMING motherboard alone achieved no fewer than 7 Global Top Scores in 2D and 3D with MSI GeForce GTX Titan X graphics cards, while the X99A XPOWER AC established a new milestone in memory overclocking as the first motherboard ever to break the DDR4-4400 barrier.

Check the links below for more information about those astonishing scores: 

BenchBros
http://hwbot.org/submission/2881141_benchbros_hwbot_prime_core_i7_5960x_9761.78_pps

http://hwbot.org/submission/2881138_benchbros_cinebench___r11.5_core_i7_5960x_25.82_points

http://hwbot.org/submission/2881139_benchbros_unigine_heaven___xtreme_preset_2x_geforce_gtx_titan_x_8813.06_dx11_marks


Wizerty
http://hwbot.org/submission/2881124_wizerty_gpupi_for_cpu___1b_core_i7_5960x_2min_16sec_938ms

http://hwbot.org/submission/2881132_wizerty_catzilla___1440p_2x_geforce_gtx_titan_x_32408_marks


Vivi
http://hwbot.org/submission/2881125_


Zzolio
http://hwbot.org/submission/2881159_zzolio_cinebench___r15_core_i7_5960x_2381_cb

http://hwbot.org/submission/2881149_zzolio_catzilla___720p_2x_geforce_gtx_titan_x_81947_marks