Cerebras has the world’s fastest inference product, which runs on its Wafer-Scale Engine (WSE) chips. Cerebras claims it is up to 20 times faster than GPU-based solutions built on Nvidia chips.
Cerebras provides an SDK for its inference solution. To use the JavaScript SDK, you need an API key from Cerebras, which you can get from the Cerebras Inference site.
Create a new JavaScript project and install the SDK.
npm install @cerebras/cerebras_cloud_sdk
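If you are starting from an empty folder, run npm init -y first so there is a package.json to record the dependency:
npm init -y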
Create a new JavaScript file, index.js, and copy the following code into it.
const Cerebras = require('@cerebras/cerebras_cloud_sdk');

// Create a client with your API key (see below for reading it from an environment variable).
const cerebras = new Cerebras({
  apiKey: 'your api key'
});

// A system prompt plus a user question, in the usual chat-completion format.
const messages = [
  {
    role: 'system',
    content: 'Provide helpful answers'
  },
  {
    role: 'user',
    content: 'what is the meaning of life and the universe?'
  }
];

async function main() {
  // Request a single, non-streamed completion.
  const resp = await cerebras.chat.completions.create({
    messages,
    model: 'llama3.1-70b',
    stream: false,
    max_completion_tokens: 1024,
    temperature: 0.2,
    top_p: 1
  });

  // Print the assistant's message.
  console.log(resp.choices[0].message || '');
}

main();
The SDK is OpenAI-compatible, so the code above should look familiar. It builds a prompt from the messages array, requests a chat completion, and prints the response. Run it with the command below.
node index.js
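Hard-coding the key is fine for a quick test, but you probably don't want it in source control. A minimal sketch, assuming you export the key as CEREBRAS_API_KEY in your shell (the variable name is my choice here), is to read it with process.env:
// Sketch: read the API key from an environment variable instead of hard-coding it.
// CEREBRAS_API_KEY is assumed to be set in the shell before running the script.
const cerebras = new Cerebras({
  apiKey: process.env.CEREBRAS_API_KEY
});
and then run the script as:
CEREBRAS_API_KEY=your-api-key node index.js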
Here is one output I liked, after tweaking the prompt and max_completion_tokens:
{
  content: "The meaning of life and the universe is a subject of ongoing debate and inquiry. Many theories exist, but no consensus. Some believe it's to seek happiness, while others think it's to find purpose or fulfill a higher calling.",
  role: 'assistant'
}
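The request above sets stream: false, so the whole answer arrives at once. Because the SDK follows the OpenAI interface, streaming should just be a matter of setting stream: true and iterating over the returned chunks; the snippet below is a sketch based on that assumption, reusing the cerebras client and messages from index.js.
// Sketch: stream the completion token by token, assuming an OpenAI-style
// streaming interface where stream: true returns an async iterable of chunks.
async function streamAnswer() {
  const stream = await cerebras.chat.completions.create({
    messages,
    model: 'llama3.1-70b',
    stream: true,
    max_completion_tokens: 1024,
    temperature: 0.2,
    top_p: 1
  });

  for await (const chunk of stream) {
    // Each chunk carries an incremental delta rather than a full message.
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
}

streamAnswer();
Streaming is where the inference speed is most noticeable, since tokens start appearing almost immediately.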
If you want to try out Cerebras inference without an API key, visit the Cerebras chat app. While you are at it, try the voice mode as well.