Biomedical Sciences Model Scientific Experimental

About TamGen

TamGen is a transformer-based chemical language model for target-aware drug design. Conditioned on a protein target, it generates novel small-molecule candidates or optimizes existing ones using target-aware molecular fragments. It is trained on large curated chemical databases and tuned to produce synthesizable candidates with desirable potency profiles and reduced off-target liabilities. It has been applied to high-impact infectious diseases, including tuberculosis.
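
To make "chemical language model" concrete: models of this kind generally treat a molecule as a text string and split it into tokens before a transformer sees it. The sketch below assumes SMILES notation and a simple regex tokenizer. The source does not state TamGen's input representation or tokenizer, so `SMILES_TOKEN` and `tokenize_smiles` are hypothetical names used only for illustration.

```python
import re

# Illustrative SMILES tokenizer (an assumption, not TamGen's actual code).
# Alternatives are tried left to right, so multi-character tokens come first.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms such as [nH] or [N+]
    r"|Br|Cl"                # two-letter atoms in the organic subset
    r"|%\d{2}"               # two-digit ring-closure labels such as %10
    r"|[A-Za-z]"             # one-letter atoms (aromatic atoms are lowercase)
    r"|\d"                   # single-digit ring-closure labels
    r"|[=#\-+\\/().:~@*$]"   # bonds, branches, and other symbols
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so verify nothing was lost.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # Isoniazid, a first-line tuberculosis drug, written as a SMILES string.
    print(tokenize_smiles("NNC(=O)c1ccncc1"))
```

A token sequence like this is what a sequence model would consume or emit one step at a time. How the protein target is encoded and fed in as the conditioning signal is model-specific and not covered by this sketch.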

TamGen addresses the molecule-optimization bottleneck in drug discovery, where most of the time and cost lie not in screening known compounds but in iteratively exploring chemical space around promising scaffolds. Conditioning generation on the target steers the search toward compounds with the right pharmacology while preserving synthetic feasibility. As part of Microsoft AI for Science’s biomedical portfolio, TamGen is explicitly aimed at diseases of high global burden, including those underserved by traditional pharma R&D, and pairs naturally with structural predictors such as RosettaFold3 and ensemble models such as BioEmu.

Key capabilities

  • Target-aware molecular fragment generation
  • Transformer-based chemical language model
  • Generates novel compounds or optimizes existing molecules
  • Accelerates drug discovery for infectious diseases like tuberculosis
  • Foundry-hosted with an open PyTorch implementation

Technology Stack

  • PyTorch
  • Chemical Language Models