
DeepMind's AlphaFold now contains structures of 200M proteins. What does that mean for drug R&D?

  • 28th July, 2022
  • Media Highlights
  • External Writer

When AlphaFold first came out, it promised to increase the number of protein structures that could be predicted, a feat that had eluded researchers for decades and could prove a crucial step in taking drug development to new heights.

Now, DeepMind says AlphaFold’s database of protein structures has been massively expanded.

Google’s AI outfit and the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) announced Thursday that DeepMind’s AlphaFold database now contains the structures of more than 200 million proteins. It’s a substantial jump from a year ago, when DeepMind announced that it had predicted the structures of only about 350,000 proteins.

And according to the pair, the range of scientific possibilities extends far beyond diseases and drug development to areas such as sustainability and food insecurity.

The two companies said in a statement announcing the database expansion that it now contains the structure of essentially every protein that has been sequenced, and that it is designed to work much like a Google search.

On top of that, the companies are keeping it free to use for the scientific community at large.

“Our hope is that this ex­pand­ed data­base will aid count­less more sci­en­tists and their im­por­tant work and open up com­plete­ly new av­enues of sci­en­tif­ic dis­cov­ery,” Deep­Mind CEO Demis Has­s­abis told re­porters ear­li­er this week.

Protein structure, and the ability to predict it, had historically been elusive to researchers. Because a protein’s function is, for the most part, dictated by its structure, people were left in the dark about what exactly certain proteins can and cannot do in relation to human health and bodily function. Before AlphaFold’s launch, experts estimated that only a third of human protein structures were known, limiting drug development options.

For biologists, a database of this size could reveal new pockets or avenues for potential drug targets, whether the goal is to go after cancer tumors, correct gene mutations or better understand antibiotic resistance.


1910 Genetics CEO Jen Nwankwo told Endpoints News that the database expansion could be a game changer for certain targets in the drug discovery space. However, the question remains of how well it will handle protein motion and more difficult, more elusive targets.

Here are her thoughts:

This latest update that expands it to over 200 million proteins is truly a heroic moment for science, and the open-source nature of it would enable even broader adoption of the tool. From our perspective, we think there’s two ways to go to improve the accuracy and expand the capability of AlphaFold 2. Because while it performs well in predicting the 3D structure of monomeric proteins, it still requires significant research and domain knowledge to apply it to complex structures like membrane proteins, allosteric regions, conformational dynamics, etc. But we suspect that for challenging disease targets, AlphaFold perhaps isn’t the answer just yet.

And while the 200 mil­lion size is im­pres­sive, it’s im­por­tant to note that the fine de­tails, the small de­tails are im­por­tant.

Nwankwo elab­o­rat­ed that “beg­gars aren’t choosers,” re­it­er­at­ing that Google, from her view, is do­ing the en­tire world a big ser­vice by ex­pand­ing the li­brary. How­ev­er, “It can get bet­ter to get us to some of these fin­er minu­tia around the bio­physics of pro­tein struc­ture that we re­al­ly need, to un­lock some of these cryp­tic pock­ets for nov­el drug dis­cov­ery,” the CEO fur­ther not­ed.

Ex­sci­en­tia’s Chris Radoux, an as­so­ciate di­rec­tor in struc­tur­al bioin­for­mat­ics, agreed with Nwankwo, telling End­points in an email:

We no longer have to ask the ques­tion “Is there a struc­ture?” but rather “How use­ful is the struc­ture we have?” Right now, not every AF2 mod­el has high enough con­fi­dence to be used in struc­ture-based drug de­sign. As a com­mu­ni­ty, we can now fo­cus on strate­gi­cal­ly solv­ing ex­per­i­men­tal struc­tures to feed Al­phaFold2 the da­ta it re­quires to pre­dict all known pro­tein struc­tures with high con­fi­dence.

Radoux added that “This data­base…al­lows us to ex­pand the known drug­gable genome through po­ten­tial­ly re­veal­ing pre­vi­ous­ly un­known drug bind­ing sites — put an­oth­er way, it could sig­nif­i­cant­ly ex­pand the op­tions sci­en­tists have to find new, nov­el med­i­cines for pre­vi­ous­ly un­solved med­ical chal­lenges.”

DeepMind and the EBI also pointed to several case studies, one of which focuses on antibiotic resistance research at the University of Colorado, Boulder. According to the companies, two researchers had been trying, without much luck, to verify protein structures using crystallography, a method for defining protein structures experimentally. Part of the issue, according to the case study, was that no similar protein structure was available to use as a starting point.

Af­ter Al­phaFold pro­vid­ed a mod­el that crys­tal­log­ra­phy could ver­i­fy, the re­searchers “were able to iden­ti­fy a bac­te­r­i­al pro­tein struc­ture in half an hour that had been elu­sive for 10 years,” the com­pa­nies said.

Er­ic Topol of Scripps Re­search said in a state­ment that the newest de­vel­op­ment with Al­phaFold will now al­low “more bi­o­log­i­cal mys­ter­ies to be solved each day.”

By: Paul Schloesser
