Changelog

2024-09-26

Workers AI Birthday Week 2024 announcements
  • Meta Llama 3.2 1B, 3B, and 11B Vision are now available on Workers AI
  • `@cf/black-forest-labs/flux-1-schnell` is now available on Workers AI
  • Workers AI is fast! Powered by new GPUs and optimizations, you can expect faster inference on Llama 3.1, Llama 3.2, and FLUX models.
  • No more neurons. Workers AI is moving towards unit-based pricing
  • Model pages get a refresh with better documentation on parameters, pricing, and model capabilities
  • Closed beta for our Run Any* Model feature; sign up here
  • Check out the product announcements blog post for more information
  • And the technical blog post if you want to learn about how we made Workers AI fast

2024-07-23

Meta Llama 3.1 now available on Workers AI

Workers AI now supports Meta Llama 3.1.

2024-07-11

New community-contributed tutorial

2024-06-27

Introducing embedded function calling
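A minimal sketch of what embedded function calling can look like with the `runWithTools` helper from the `@cloudflare/ai-utils` package; the model ID and the example tool are illustrative, not part of this announcement:

```ts
import { runWithTools } from "@cloudflare/ai-utils";

export default {
  async fetch(request: Request, env: { AI: Ai }): Promise<Response> {
    // runWithTools runs the model, executes any tool the model chooses to
    // call, and feeds the tool result back to the model automatically.
    const response = await runWithTools(
      env.AI,
      "@hf/nousresearch/hermes-2-pro-mistral-7b", // illustrative function-calling model
      {
        messages: [{ role: "user", content: "What is the weather in Austin?" }],
        tools: [
          {
            name: "getWeather",
            description: "Return the current weather for a city",
            parameters: {
              type: "object",
              properties: {
                city: { type: "string", description: "Name of the city" },
              },
              required: ["city"],
            },
            // The "embedded" part: the tool implementation lives next to its schema.
            function: async ({ city }) => `The weather in ${city} is 72F and sunny.`,
          },
        ],
      },
    );
    return Response.json(response);
  },
};
```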

2024-06-19

Added support for traditional function calling
  • Function calling is now supported on enabled models (see the sketch after this list)
  • Properties added on the models page to show which models support function calling
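A minimal sketch of traditional function calling, where you pass a `tools` array and handle any returned tool calls yourself; the model ID and tool schema are illustrative:

```ts
// Inside a Worker with an AI binding: describe the available tools and let
// the model decide whether to call one.
const response = await env.AI.run("@hf/nousresearch/hermes-2-pro-mistral-7b", {
  messages: [{ role: "user", content: "What is the weather in Seattle?" }],
  tools: [
    {
      name: "getWeather",
      description: "Get the current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "Name of the city" },
        },
        required: ["city"],
      },
    },
  ],
});

// With traditional function calling, the response may contain tool calls
// (tool name plus arguments) that your own code is responsible for executing.
console.log(response);
```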

2024-06-18

Native support for AI Gateways

Workers AI now natively supports AI Gateway.
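A minimal sketch of routing a Workers AI request through an AI Gateway via the binding's options; the gateway ID, model, and option values are placeholders:

```ts
const response = await env.AI.run(
  "@cf/meta/llama-3-8b-instruct",
  { prompt: "What is Cloudflare?" },
  {
    // Route this request through an existing AI Gateway so it picks up
    // gateway features such as logging and caching.
    gateway: {
      id: "my-gateway", // placeholder: the name of your AI Gateway
      skipCache: false,
      cacheTtl: 3600,
    },
  },
);
```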

2024-06-11

Deprecation announcement for `@cf/meta/llama-2-7b-chat-int8`

We will be deprecating `@cf/meta/llama-2-7b-chat-int8` on 2024-06-30.

Replace the model ID in your code with a new model of your choice.
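For example, a minimal sketch of swapping the deprecated model for the suggested fallback, assuming the native AI binding; the prompt is illustrative:

```ts
// Before: the deprecated model
// const resp = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", { prompt });

// After: any current text generation model, for example the fallback model
const resp = await env.AI.run("@cf/meta/llama-3-8b-instruct-awq", {
  prompt: "Explain what a Cloudflare Worker is in one sentence.",
});
```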

If you do not switch to a different model by June 30th, we will automatically start returning inference from `@cf/meta/llama-3-8b-instruct-awq`.

2024-05-29

Add new public LoRAs and note on LoRA routing
  • Added documentation on new public LoRAs.
  • Noted that you can now run LoRA inference with the base model rather than explicitly calling the `-lora` version (see the sketch below).
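A minimal sketch of the new routing, assuming a LoRA-capable base model and a public LoRA adapter name, both of which are illustrative placeholders here:

```ts
// Pass the LoRA adapter by name on the base model instead of calling the
// separate -lora model variant. Model and adapter names are placeholders.
const response = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
  prompt: "Summarize the following article: ...",
  lora: "cf-public-cnn-summarization",
});
```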

2024-05-17

Add OpenAI-compatible API endpoints

Added OpenAI-compatible API endpoints for `/v1/chat/completions` and `/v1/embeddings`. For more details, refer to Configurations.
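A minimal sketch of calling the chat completions endpoint over HTTP; the account ID, API token, and model are placeholders:

```ts
const ACCOUNT_ID = "<your-account-id>"; // placeholder
const API_TOKEN = "<your-api-token>";   // placeholder

// Send a standard OpenAI-style request body to the compatible endpoint.
const resp = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1/chat/completions`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "@cf/meta/llama-3-8b-instruct",
      messages: [{ role: "user", content: "Hello, Workers AI!" }],
    }),
  },
);
const completion = await resp.json();
```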

2024-04-11

Add AI native binding
  • Added new AI native binding; you can now run models with `const resp = await env.AI.run(modelName, inputs)` (see the sketch below)
  • Deprecated the `@cloudflare/ai` npm package. While existing solutions using the `@cloudflare/ai` package will continue to work, no new Workers AI features will be supported. Moving to the native AI binding is highly recommended.
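A minimal sketch of a Worker using the native binding, assuming an `AI` binding configured in `wrangler.toml`, the `Ai` type from `@cloudflare/workers-types`, and a catalog model chosen purely for illustration:

```ts
export interface Env {
  AI: Ai; // the native AI binding, e.g. [ai] binding = "AI" in wrangler.toml
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Run a catalog model directly through the binding; no npm package needed.
    const resp = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      prompt: "What is the origin of the phrase Hello, World?",
    });
    return Response.json(resp);
  },
};
```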