MIT engineers develop AI model to predict the solubility of molecules in organic solvents

A team of chemical engineers at MIT has developed a new machine-learning model that could streamline the synthesis of chemical compounds, including pharmaceuticals. The tool predicts how well a molecule will dissolve in a range of organic solvents, a key step in drug development. Beyond speeding up the production of new treatments, it may also help chemists find safer and more sustainable alternatives to common industrial solvents.

Optimizing solvent selection

Choosing the right solvent is a central decision in chemical synthesis. Many organic solvents are available, such as ethanol and acetone, and they differ in both effectiveness and environmental impact, so an accurate and efficient way to choose among them is valuable. This is where the new model, led by graduate students Lucas Attia and Jackson Burns, comes in. “Predicting solubility really is a rate-limiting step in synthetic planning and manufacturing of chemicals, especially drugs. There’s been a longstanding interest in being able to make better predictions of solubility,” explains Attia.

The model, named FastSolv, is freely available and is already being used by several research labs and companies. One significant benefit is that it can help identify less hazardous substitutes for commonly used industrial solvents. “There are some solvents which are known to dissolve most things. They’re useful, but they’re damaging to the environment and to people. Our model is extremely useful in identifying the next-best solvent, which is hopefully much less damaging,” Burns explains.
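The "next-best solvent" idea can be illustrated with a minimal Python sketch. It is not FastSolv's actual interface: the `Solvent` class, the `hazard` score, the `next_best_solvent` function, and the `predict_log_s` callable are all hypothetical names introduced here, and the numbers are placeholders rather than measurements or model output. The sketch assumes that some predictor returns a log-solubility for a given solute, solvent, and temperature, then picks the most promising solvent among those under a hazard threshold.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Solvent:
    name: str
    hazard: int  # hypothetical score: 0 = benign, higher = more hazardous


def next_best_solvent(
    solute: str,
    solvents: Iterable[Solvent],
    predict_log_s: Callable[[str, str, float], float],
    temperature_k: float,
    max_hazard: int,
) -> Optional[Solvent]:
    """Return the acceptable solvent with the highest predicted solubility.

    `predict_log_s` stands in for any solubility predictor; it is an
    assumption of this sketch, not a real library call.
    """
    allowed = [s for s in solvents if s.hazard <= max_hazard]
    if not allowed:
        return None  # no solvent meets the hazard threshold
    return max(
        allowed,
        key=lambda s: predict_log_s(solute, s.name, temperature_k),
    )


# Placeholder predictions (made-up values, for illustration only).
PLACEHOLDER_LOG_S = {"solvent_A": -0.5, "solvent_B": -1.0, "solvent_C": -2.5}


def fake_predictor(solute: str, solvent: str, temperature_k: float) -> float:
    # Ignores solute and temperature; a real model would use both.
    return PLACEHOLDER_LOG_S[solvent]


candidates = [
    Solvent("solvent_A", hazard=3),  # dissolves best, but most hazardous
    Solvent("solvent_B", hazard=1),
    Solvent("solvent_C", hazard=0),
]

choice = next_best_solvent(
    "solute_X", candidates, fake_predictor, temperature_k=298.15, max_hazard=1
)
print(choice.name if choice else "no acceptable solvent")  # solvent_B
```

Here the strongest solvent is excluded by the hazard threshold, so the sketch falls back to the best of the remaining candidates. In practice the threshold and the hazard scale would come from a lab's own safety and sustainability criteria.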

Origins and development of the model

Interestingly, the model grew out of a course project at MIT that combined machine learning with chemical engineering. Until then, the Abraham solvation model had been the method of choice for estimating solubility from molecular structure, even though its accuracy was limited.

To overcome these limitations, MIT’s Green Lab introduced SolProp in 2022. Although it used thermodynamic properties to predict solubility, it struggled with molecules it had not seen before, which are common in drug development. A turning point came in 2023 with the release of BigSolDB, a comprehensive dataset that compiles solubility data from nearly 800 scientific papers and covers close to 800 molecules and more than 100 solvents. Drawing on this resource, Attia and Burns trained two models, FastProp and ChemProp, on more than 40,000 data points that also capture the effect of temperature.

Impressive results and future potential

Both models performed well, producing predictions two to three times more accurate than SolProp’s and, notably, capturing how solubility changes with temperature. “We were blown away to see that the static and learned embeddings were statistically indistinguishable in performance. That indicates the data quality is the main bottleneck, not the model architecture,” Burns said.

More consistent experimental data could improve results further, because differences in how labs run solubility tests introduce noise into the compiled dataset. “One of the big limitations of using these kinds of compiled datasets is that different labs use different methods and experimental conditions,” Attia noted. Despite these limitations, FastSolv is fast and easy to use, and it is already finding applications in pharmaceutical development, materials science, and green chemistry. Burns adds, “There are applications throughout the drug discovery pipeline. We’re also excited to see, outside of formulation and drug discovery, where people may use this model.”

This development was made possible by funding from the U.S. Department of Energy. A more detailed account is available in the original article at MIT News.

Max Krawiec
