The specialized ASIC from Google for machine learning is tens of times faster than the GPU

MVB Staff
Last updated: April 25, 2024 3:37 pm
Published: April 7, 2017

Four years ago, Google recognized the real potential of neural networks for its applications and began deploying them everywhere: text translation, voice search with speech recognition, and so on. It quickly became clear, however, that neural networks greatly increase the load on Google's servers. Roughly speaking, if every person used voice search on Android (or dictated text with speech recognition) for just three minutes a day, Google would have to double its number of data centers simply to let the neural networks process that much voice traffic.

Something had to be done, and Google found a solution. In 2015 it developed its own hardware architecture for machine learning, the Tensor Processing Unit (TPU), which exceeds traditional GPUs and CPUs by 70 times and by up to 196 times in calculations per watt. "Traditional GPU and CPU" here means the general-purpose Xeon E5 v3 (Haswell) processors and the Nvidia Tesla K80 graphics processors.

The TPU architecture is described for the first time this week in a scientific paper (PDF) that will be presented at the 44th International Symposium on Computer Architecture (ISCA) on June 26, 2017, in Toronto. The paper has more than 70 authors. Its lead author, distinguished engineer Norman Jouppi, known as one of the creators of the MIPS processor, explained in an interview with The Next Platform the features of the unique TPU architecture, which is in fact a specialized ASIC, that is, an application-specific integrated circuit.

Unlike conventional FPGAs or highly specialized ASICs, the TPU is programmed in much the same way as a GPU or CPU; it is not a narrow-purpose device for a single neural network. Norman Jouppi says that the TPU supports CISC instructions for different types of neural networks: convolutional neural networks, LSTM models, and large fully connected models. It therefore remains programmable, but uses the matrix as its primitive rather than vector or scalar primitives.

Google emphasizes that while other developers optimize their chips for convolutional neural networks, such networks account for only 5% of the load in Google's data centers. Most Google applications use Rumelhart-style multilayer perceptrons, which is why it was so important to create a more general architecture that is not tailored only to convolutional neural networks.


One element of the architecture is the systolic data flow engine, a 256 × 256 array that receives activations from the neurons on the left. Everything then shifts step by step, with each cell multiplying by the weight it holds. As a result, the systolic array performs 65,536 calculations per cycle. This architecture is ideal for neural networks.
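The data flow described above can be illustrated with a small cycle-by-cycle simulation. This is only a sketch under stated assumptions: it models a generic weight-stationary systolic array in which activations move right and partial sums move down, and it is not taken from Google's design. The function name `systolic_matmul` and the register layout are hypothetical.

```python
def systolic_matmul(X, W):
    """Multiply X (B x N) by W (N x M) on a simulated systolic array.

    Assumption: a generic weight-stationary array. Cell (i, j) holds
    weight W[i][j]; activations flow left to right, partial sums flow
    top to bottom. This is an illustration, not the actual TPU design.
    """
    B, N, M = len(X), len(W), len(W[0])
    act = [[0] * M for _ in range(N)]    # activation register per cell
    psum = [[0] * M for _ in range(N)]   # partial-sum register per cell
    out = [[0] * M for _ in range(B)]

    # The last result leaves the bottom-right cell at cycle B + N + M - 3.
    for t in range(B + N + M - 2):
        new_act = [[0] * M for _ in range(N)]
        new_psum = [[0] * M for _ in range(N)]
        for i in range(N):
            for j in range(M):
                if j == 0:
                    # Row i is fed from the left, delayed by i cycles.
                    b = t - i
                    a_in = X[b][i] if 0 <= b < B else 0
                else:
                    a_in = act[i][j - 1]
                p_in = psum[i - 1][j] if i > 0 else 0
                new_act[i][j] = a_in
                new_psum[i][j] = p_in + a_in * W[i][j]  # one multiply-add
        act, psum = new_act, new_psum

        # Input vector b completes in column j at cycle b + (N - 1) + j.
        for j in range(M):
            b = t - (N - 1) - j
            if 0 <= b < B:
                out[b][j] = psum[N - 1][j]
    return out


# A 256 x 256 array has this many cells, each doing one multiply-add per cycle.
CELLS_256 = 256 * 256
```

Every cell performs one multiply-add per cycle, which is where the figure of 65,536 calculations per cycle for a 256 × 256 array comes from.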

According to Jouppi, the TPU architecture is more like an FPU coprocessor than a conventional GPU: although it has numerous matrix multiplication units, it does not store any program itself and simply executes the instructions received from the host.


The entire TPU architecture, except for the DDR3 memory. Instructions are sent from the host (left) into a queue. The control logic can then, depending on the instruction, execute each of them repeatedly.

It is not yet known how well this architecture scales. Jouppi says that a system with this kind of host will always have some kind of bottleneck.

Compared to conventional CPUs and GPUs, Google's architecture outperforms them by a factor of dozens. For example, the 18-core Haswell Xeon E5-2699 v3 processor at 2.3 GHz, using 64-bit floating point, performs 1.3 tera-operations per second (TOPS) and has a memory bandwidth of 51 GB/s. The chip itself consumes 145 W, and a complete system built on it with 256 GB of memory consumes 455 W.

For comparison, the TPU, using 8-bit operations with 256 GB of external memory and 32 GB of internal memory, has a memory bandwidth of 34 GB/s, but the card performs 92 TOPS, approximately 71 times more than the Haswell processor (92 / 1.3 ≈ 71). The TPU-based server consumes 384 W.
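As a quick check of the arithmetic, the peak figures quoted above can be divided directly. The sketch below uses only the numbers given in the text (1.3 and 92 TOPS, 455 W and 384 W per server); the variable names are illustrative. Note that the two chips are compared at different precisions (64-bit floating point versus 8-bit operations), and that these are peak ratios, not the measured application results.

```python
# Peak figures as quoted in the text.
cpu_tops = 1.3          # Haswell Xeon E5-2699 v3, 64-bit floating point
cpu_server_watts = 455  # whole CPU-based server
tpu_tops = 92           # TPU, 8-bit operations
tpu_server_watts = 384  # whole TPU-based server

# Raw throughput ratio: 92 / 1.3, roughly 71.
speedup = tpu_tops / cpu_tops

# Peak operations per watt at the server level, TPU relative to CPU.
cpu_tops_per_watt = cpu_tops / cpu_server_watts
tpu_tops_per_watt = tpu_tops / tpu_server_watts
perf_per_watt_ratio = tpu_tops_per_watt / cpu_tops_per_watt

print(f"peak throughput ratio: {speedup:.1f}x")
print(f"peak performance-per-watt ratio: {perf_per_watt_ratio:.1f}x")
```

The peak throughput ratio comes out at roughly 71 and the server-level performance-per-watt ratio at roughly 84.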

The following graph compares performance per watt relative to the CPU-based server for the GPU-based server (blue column) and the TPU-based server (red). It also shows the performance per watt of the TPU-based server relative to the GPU-based server (orange), and of an improved version of the TPU relative to the CPU-based server (green) and the GPU-based server (lilac).

It should be noted that Google ran its comparisons in TensorFlow application tests against a relatively old version of the Haswell Xeon. In the newer Broadwell Xeon E5 v4, the number of instructions per cycle increased by 5% thanks to architectural improvements, and in the Skylake Xeon E5 v5, expected this summer, instructions per cycle may increase by another 9-10%. Together with the increase in core count from 18 to 28 in Skylake, the overall performance of Intel processors in Google's tests could improve by 80%. Even so, a huge performance gap with the TPU would remain. In the 32-bit floating-point version of the test, the TPU's advantage over the CPU shrinks to about 3.5 times, but most models quantize well to 8 bits.
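To make the idea of 8-bit quantization concrete, here is a minimal sketch. The article does not say which quantization scheme Google uses, so the symmetric linear scheme below is an assumption chosen for simplicity, and the function names `quantize` and `quantized_dot` are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers with one shared scale factor.

    Assumption: symmetric linear quantization, picked for illustration.
    """
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0      # avoid a zero scale
    scale = max_abs / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantized_dot(a, b):
    """Dot product computed on 8-bit integers, rescaled to a float at the end."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    acc = sum(x * y for x, y in zip(qa, qb))          # integer multiply-adds
    return acc * scale_a * scale_b


activations = [0.5, -1.0, 0.25]
weights = [2.0, 0.1, -0.7]
exact = sum(x * y for x, y in zip(activations, weights))
approx = quantized_dot(activations, weights)
print(f"exact={exact:.4f} quantized={approx:.4f}")
```

The multiply-adds run entirely on small integers, and only the final rescaling uses floating point. The result differs slightly from the exact value because of rounding.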

Google had been considering how to use GPUs, FPGAs, and ASICs in its data centers since 2006, but found no use for them until recently, when it introduced machine learning for a number of practical tasks and the load on these neural networks began to grow with billions of user requests. Now the company has no choice but to move away from the traditional CPU.

The company does not plan to sell its processors to anyone, but hopes that the paper on its 2015 ASIC will allow others to improve the architecture and create better ASICs that “will raise the bar even higher.” Google itself is probably already working on a new version of the ASIC.
