Streaming Output

Following the QuickStart tutorial, instead of waiting for the whole reasoning to complete, you can stream intermediate tokens as they are generated.

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.activeloop.ai/",
    api_key=os.getenv('ACTIVELOOP_TOKEN')
)

stream = client.chat.completions.create(
    model="activeloop-l0",
    messages=[{"role": "user", "content": "what is the AIME score of DeepSeek R1?"}],
    stream=True,  # yield incremental chunks instead of a single final response
)

for chunk in stream:
    # Each chunk carries a delta with the newly generated tokens.
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
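
If you also need the complete text once the stream finishes (for example, to log or post-process it), collect the deltas as they arrive. A minimal sketch, reusing the client and request from above; the chunk fields follow the OpenAI Python SDK's streaming format:

chunks = []
stream = client.chat.completions.create(
    model="activeloop-l0",
    messages=[{"role": "user", "content": "what is the AIME score of DeepSeek R1?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        chunks.append(delta)

full_reply = "".join(chunks)  # the complete assistant message
print(full_reply)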
