Bolt-on governance with Pulumi¶
When DataRobot refers to "bolt-on governance", it means wrapping a large language model endpoint in a DataRobot deployment so that you can call it with a DataRobot API token. In exchange for creating a deployment around the LLM, you get all of the benefits of DataRobot MLOps, such as text drift monitoring, request history, and usage statistics.
This notebook outlines how to use Pulumi to create a deployment endpoint that interfaces with a large language model.
Initialize the environment¶
As a preliminary step, initialize your environment and verify that the LLM credentials work.
import os

import datarobot as dr
from openai import AzureOpenAI

os.environ["PULUMI_CONFIG_PASSPHRASE"] = "default"

assert (
    "DATAROBOT_API_TOKEN" in os.environ
), "Please set the DATAROBOT_API_TOKEN environment variable"
assert "DATAROBOT_ENDPOINT" in os.environ, "Please set the DATAROBOT_ENDPOINT environment variable"
assert "OPENAI_API_BASE" in os.environ, "Please set the OPENAI_API_BASE environment variable"
assert "OPENAI_API_KEY" in os.environ, "Please set the OPENAI_API_KEY environment variable"
assert "OPENAI_API_VERSION" in os.environ, "Please set the OPENAI_API_VERSION environment variable"
# Used below in the credential test and later as a runtime parameter
assert (
    "OPENAI_API_DEPLOYMENT_ID" in os.environ
), "Please set the OPENAI_API_DEPLOYMENT_ID environment variable"

dr_client = dr.Client()
def test_azure_openai_credentials():
    """Test the provided Azure OpenAI credentials with a minimal completion."""
    model_name = os.getenv("OPENAI_API_DEPLOYMENT_ID")
    try:
        client = AzureOpenAI(
            api_key=os.getenv("OPENAI_API_KEY"),
            azure_endpoint=os.getenv("OPENAI_API_BASE"),
            api_version=os.getenv("OPENAI_API_VERSION"),
        )
        client.chat.completions.create(
            messages=[{"role": "user", "content": "hello"}],
            model=model_name,  # type: ignore[arg-type]
        )
    except Exception as e:
        raise ValueError(
            f"Unable to run a successful test completion against model '{model_name}' "
            "with provided Azure OpenAI credentials. Please validate your credentials."
        ) from e


test_azure_openai_credentials()
Set up a project¶
The helper functions below create and bring up, or destroy, the Pulumi stack.
from typing import Callable

from pulumi import automation as auto


def stack_up(project_name: str, stack_name: str, program: Callable) -> auto.Stack:
    # create (or select if one already exists) a stack that uses our inline program
    stack = auto.create_or_select_stack(
        stack_name=stack_name, project_name=project_name, program=program
    )
    stack.refresh(on_output=print)
    stack.up(on_output=print)
    return stack
def destroy_project(stack: auto.Stack):
    """Destroy the Pulumi project and remove its stack."""
    stack_name = stack.name
    stack.destroy(on_output=print)
    stack.workspace.remove_stack(stack_name)
    print(f"stack {stack_name} in project removed")
Declarative LLM deployment¶
Deploying a bolt-on governance model isn't complicated, but it is more involved than creating a custom deployment for a standard classification model, for three reasons. First, you need to set runtime parameters on the deployment specifying the LLM endpoint and other metadata. Second, you need to create and apply a credential for model metadata that should stay hidden, such as the API token; the hidden credential you create actually becomes one of the runtime parameters. Finally, you need a special environment, called a Serverless Prediction Environment, that works well for sending API calls through a deployment, and you must set one up specifically for this model.
Once you set up your credentials and runtime parameters, upload your source code to DataRobot, register the model, and then initialize the deployment. Note that each runtime parameter key set below must also be declared in model_package/model-metadata.yaml so the custom model knows to expect it.
import pulumi
import pulumi_datarobot as datarobot


def setup_runtime_parameters(
    credential: datarobot.ApiTokenCredential,
) -> list[datarobot.CustomModelRuntimeParameterValueArgs]:
    """Set up runtime parameters for the bolt-on governance deployment.

    Each runtime parameter is a (key, type, value) tuple.

    Args:
        credential (datarobot.ApiTokenCredential):
            The DataRobot credential representing the LLM API token
    """
    return [
        datarobot.CustomModelRuntimeParameterValueArgs(
            key=key,
            type=type_,
            value=value,  # type: ignore[arg-type]
        )
        for key, type_, value in [
            ("OPENAI_API_KEY", "credential", credential.id),
            ("OPENAI_API_BASE", "string", os.getenv("OPENAI_API_BASE")),
            ("OPENAI_API_VERSION", "string", os.getenv("OPENAI_API_VERSION")),
            (
                "OPENAI_API_DEPLOYMENT_ID",
                "string",
                os.getenv("OPENAI_API_DEPLOYMENT_ID"),
            ),
        ]
    ]
def make_bolt_on_governance_deployment():
    """
    Deploy a trained model onto DataRobot's prediction environment.

    Upload source code to create a custom model version.
    Then create a registered model and deploy it to a prediction environment.
    """
    # ID for the Python 3.11 Moderations Environment
    python_environment_id = "65f9b27eab986d30d4c64268"
    custom_model_name = "App Template Minis - OpenAI LLM"
    registered_model_name = "App Template Minis - OpenAI Registered Model"
    deployment_name = "App Template Minis - Bolt-on Governance Deployment"

    prediction_environment = datarobot.PredictionEnvironment(
        resource_name="App Template Minis - Serverless Environment",
        platform=dr.enums.PredictionEnvironmentPlatform.DATAROBOT_SERVERLESS,
    )
    llm_credential = datarobot.ApiTokenCredential(
        resource_name="App Template Minis - OpenAI LLM Credentials",
        api_token=os.getenv("OPENAI_API_KEY"),
    )
    runtime_parameters = setup_runtime_parameters(llm_credential)

    deployment_files = [
        ("./model_package/requirements.txt", "requirements.txt"),
        ("./model_package/custom.py", "custom.py"),
        ("./model_package/model-metadata.yaml", "model-metadata.yaml"),
    ]
    custom_model = datarobot.CustomModel(
        resource_name=custom_model_name,
        runtime_parameter_values=runtime_parameters,
        files=deployment_files,
        base_environment_id=python_environment_id,
        target_type=dr.enums.TARGET_TYPE.TEXT_GENERATION,
        target_name="content",
        language="python",
        replicas=2,
    )
    registered_model = datarobot.RegisteredModel(
        resource_name=registered_model_name,
        custom_model_version_id=custom_model.version_id,
    )
    deployment = datarobot.Deployment(
        resource_name=deployment_name,
        label=deployment_name,
        registered_model_version_id=registered_model.version_id,
        prediction_environment_id=prediction_environment.id,
    )

    pulumi.export("serverless_environment_id", prediction_environment.id)
    pulumi.export("custom_model_id", custom_model.id)
    pulumi.export("registered_model_id", registered_model.id)
    pulumi.export("deployment_id", deployment.id)
Run the stack¶
Running the stack takes the files in the model_package directory, puts them onto DataRobot as a custom model, registers that model, and deploys the result.
project_name = "AppTemplateMinis-BoltOnGovernance"
stack_name = "MarshallsExtraSpecialLargeLanguageModel"
stack = stack_up(project_name, stack_name, program=make_bolt_on_governance_deployment)
Interact with outputs¶
Now that you have a bolt-on governance deployment, you can interact with it directly through the OpenAI SDK. The only difference is that you pass your DataRobot API token instead of your LLM credentials.
from pprint import pprint

from openai import OpenAI

deployment_id = stack.outputs().get("deployment_id").value
deployment_chat_base_url = dr_client.endpoint + f"/deployments/{deployment_id}/"
client = OpenAI(api_key=dr_client.token, base_url=deployment_chat_base_url)

messages = [
    {"role": "user", "content": "Why are ducks called ducks?"},
]
response = client.chat.completions.create(messages=messages, model="gpt-4o")
pprint(response.choices[0].message.content)
Clear your work¶
Use the following cell to shut down the stack, thereby deleting any assets created in DataRobot.
destroy_project(stack)
How does scoring code work?¶
The following cell displays the code that is uploaded so that DataRobot knows how to interact with the model. A bolt-on governance model only requires you to define hooks for load_model and chat, but you can add others too.
from IPython.display import Code
Code(filename="./model_package/custom.py", language="python")
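If you don't have the repository checked out, the sketch below shows roughly what a minimal custom.py could look like. It is an illustrative sketch, not the actual file: it assumes DRUM's load_model and chat hook contract, the RuntimeParameters helper from datarobot_drum, and that the credential-type parameter resolves to a dictionary with an apiToken field.
# Hypothetical sketch of a bolt-on governance custom.py; see the real file in
# model_package for the authoritative implementation.
from datarobot_drum import RuntimeParameters
from openai import AzureOpenAI


def load_model(code_dir):
    """Build an Azure OpenAI client from the deployment's runtime parameters."""
    # Credential-type runtime parameters resolve to a dict; "apiToken" is the
    # field an ApiTokenCredential exposes (assumption based on the setup above).
    api_key = RuntimeParameters.get("OPENAI_API_KEY")["apiToken"]
    return AzureOpenAI(
        api_key=api_key,
        azure_endpoint=RuntimeParameters.get("OPENAI_API_BASE"),
        api_version=RuntimeParameters.get("OPENAI_API_VERSION"),
    )


def chat(completion_create_params, model):
    """Forward an OpenAI-style chat request to the upstream LLM."""
    # Route every request to the Azure deployment named in the runtime parameters.
    completion_create_params["model"] = RuntimeParameters.get("OPENAI_API_DEPLOYMENT_ID")
    return model.chat.completions.create(**completion_create_params)
With hooks like these, the deployment relays each chat completion request to the upstream LLM while DataRobot records the traffic for monitoring.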