Most LLM integrations start with a prompt string in the code. You write the instructions, define a JSON schema for structured output, parse the response, and map the fields to your domain objects. It works fine — until you need to change what the LLM looks at.
In our system, cameras analyze different things depending on their environment. A restaurant camera needs to detect table status, cleanliness, and staff presence. A retail camera tracks foot traffic and shelf stock levels. An office camera monitors desk occupancy.
Our first version had all of this hardcoded. The prompt was a string literal. The JSON schema was a hand-written map. The response parser had a case statement that knew about each observation type. It worked for the first three customers, but we had new types coming in weekly as different environments onboarded. Every new observation meant:
1. Adding the field to the prompt template
2. Adding the field to the JSON schema
3. Adding parsing logic for the new field type
4. Adding a database column or schema for the output
5. Deploying
That’s five places to update for every new observation type. More concerning, steps 1–3 had to stay perfectly in sync — the prompt tells the LLM what to look for, the schema constrains the output, and the parser maps the result. Any drift between them meant silent failures: the LLM returns a field the parser doesn’t expect, or the schema constrains a field the prompt doesn’t mention.
We needed to make all three derive from the same source of truth.
Observation blocks as data
The idea is to encode observation types in the database instead of the code. The core unit is an ObservationBlock — a reusable component that defines what to look for:
schema "observation_blocks" do
  field(:name, :string) # e.g., "stock_levels", "queue_status"
  field(:description, :string)
  field(:instruction_template, :string) # Prompt instructions for the LLM
  belongs_to(:organization, Organization)
  has_many(:outputs, ObservationOutput, on_delete: :delete_all)
end
A block’s instruction_template is the natural language instruction — “Count the number of people visible in the frame” or “Assess the cleanliness of visible surfaces on a scale from 1-5.”
Each block has one or more ObservationOutputs — the individual data points to extract, each with a JSON Schema definition:
schema "observation_outputs" do
  field(:name, :string) # e.g., "person_count", "cleanliness_score"
  field(:description, :string)
  field(:schema, :map) # JSON Schema: {"type": "integer", "minimum": 0}
  belongs_to(:observation_block, ObservationBlock)
end
The schema field is a standard JSON Schema map, validated at write time with ExJsonSchema.Schema.resolve/1. A stock_levels block might have outputs like:
- stock_count: {"type": "integer", "minimum": 0, "maximum": 100}
- stock_status: {"type": "string", "enum": ["critical", "low", "adequate", "full"]}
Blocks are composed into AnalysisProfiles through a join table with ordering:
schema "analysis_profiles" do
  field(:name, :string)
  field(:active, :boolean, default: true)
  has_many(:profile_blocks, ProfileObservationBlock, preload_order: [asc: :position])
  has_many(:observation_blocks, through: [:profile_blocks, :observation_block])
end
A camera is assigned one profile. Every 30 seconds when a new capture arrives, the pipeline reads the profile’s blocks and generates everything from them — the prompt, the schema, and the response mapping. The same block definitions drive all three, so they can’t drift.
Building the JSON schema
This was the first piece we built — SchemaBuilder converts a profile’s blocks into the JSON Schema that constrains the LLM’s structured output. The key decision here was to derive everything from the block and output definitions, rather than maintaining a separate schema:
def build_gemini_request_schema(%AnalysisProfile{} = profile) do
  profile = Repo.preload(profile, profile_blocks: [observation_block: :outputs])
  observation_properties = build_profile_properties(profile.profile_blocks)

  properties =
    observation_properties
    |> Map.put("summary", %{"type" => "string", "maxLength" => 500, ...})
    |> Map.put("health_score", %{"type" => "integer", "minimum" => 0, "maximum" => 100, ...})

  schema = %{
    "type" => "object",
    "properties" => properties,
    "required" => required_blocks(profile.profile_blocks) ++ ["summary", "health_score"]
  }

  {:ok, schema}
end
Each observation block becomes a nested object in the schema, with its outputs as required properties:
defp build_profile_properties(profile_blocks) do
  Enum.into(profile_blocks, %{}, fn pb ->
    block = pb.observation_block

    {block.name,
     %{
       "type" => "object",
       "description" => block.instruction_template,
       "properties" => build_output_properties(block.outputs),
       "required" => required_outputs(block.outputs)
     }}
  end)
end

defp build_output_properties(outputs) do
  Enum.into(outputs, %{}, fn output ->
    {output.name, strip_scout_extensions(output.schema)}
  end)
end
For a profile with person_detection and activity_tracking blocks, the generated schema looks like:
{
  "type": "object",
  "properties": {
    "person_detection": {
      "type": "object",
      "description": "Detect if any people are present in the image",
      "properties": {
        "person_present": {"type": "boolean"},
        "person_count": {"type": "integer", "minimum": 0}
      },
      "required": ["person_present", "person_count"]
    },
    "activity_tracking": {
      "type": "object",
      "description": "Track the primary activity being performed",
      "properties": {
        "primary_activity": {"type": "string", "enum": ["working", "idle", "away", "meeting"]},
        "activity_details": {"type": "string"}
      },
      "required": ["primary_activity", "activity_details"]
    },
    "summary": {"type": "string", "maxLength": 500},
    "health_score": {"type": "integer", "minimum": 0, "maximum": 100}
  },
  "required": ["person_detection", "activity_tracking", "summary", "health_score"]
}
The block names become the JSON keys. The output schemas become the nested properties. The LLM is constrained to return exactly this structure via Gemini’s structured output mode — no parsing ambiguity.
One detail worth noting: observation outputs can carry Scout-specific metadata in x-scout-* schema extensions (used for scoring rules downstream). strip_scout_extensions/1 recursively removes these before sending the schema to Gemini, since the LLM API would reject unknown schema keys. We went back and forth on whether to keep this metadata in the same schema or in a separate column — having it inline is convenient for the downstream consumers, but the stripping step feels a bit messy. It’s worked so far though.
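The post doesn’t show strip_scout_extensions/1 itself, so here is a minimal sketch of what a recursive strip could look like. The module name and the x-scout-weight / x-scout-rule keys in the usage note are made up for illustration; only the x-scout- prefix comes from the description above.

```elixir
defmodule SchemaExtensionsSketch do
  # Hypothetical stand-in for strip_scout_extensions/1, not the real implementation.
  @prefix "x-scout-"

  # Plain maps: drop keys that start with the extension prefix,
  # then recurse into the values that remain.
  def strip(%{} = schema) when not is_struct(schema) do
    schema
    |> Enum.reject(fn {key, _value} -> is_binary(key) and String.starts_with?(key, @prefix) end)
    |> Map.new(fn {key, value} -> {key, strip(value)} end)
  end

  # Lists (for example "enum" or "required"): recurse into each element.
  def strip(list) when is_list(list), do: Enum.map(list, &strip/1)

  # Scalars pass through unchanged.
  def strip(other), do: other
end
```

Calling SchemaExtensionsSketch.strip(%{"type" => "integer", "x-scout-weight" => 2}) returns %{"type" => "integer"}. One caveat of a prefix-based strip like this: an output property that is itself named x-scout-something would be removed too, so the prefix has to stay reserved for metadata.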
Generating the prompt
PromptGenerator assembles the full prompt from the same block definitions. This is where things get interesting — the prompt isn’t just block instructions, it also includes the camera’s active alert monitors:
def generate_profile_prompt(%AnalysisProfile{} = profile, opts \\ []) do
  profile_blocks = Observations.get_profile_blocks_internal(profile.id)
  alert_monitors = Keyword.get(opts, :alert_monitors, [])

  prompt_parts = [
    generate_header(profile, opts),
    generate_monitored_conditions(alert_monitors),
    generate_instructions(profile_blocks),
    generate_output_format(profile_blocks),
    generate_footer()
  ]

  {:ok, Enum.join(Enum.reject(prompt_parts, &(&1 == "" or is_nil(&1))), "\n\n")}
end
The instructions section iterates blocks and formats them with output names:
defp generate_instructions(profile_blocks) do
  profile_blocks
  |> Enum.with_index(1)
  |> Enum.map(fn {pb, index} ->
    block = pb.observation_block
    output_names = Enum.map_join(block.outputs, ", ", & &1.name)
    "#{index}. **#{humanize_block_name(block.name)}**: #{block.instruction_template} (Output: #{output_names})"
  end)
  |> Enum.join("\n")
end
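To see what that produces, here is a self-contained version that runs against plain maps instead of Ecto structs. humanize_block_name/1 isn’t shown in the post, so the version below is a guess at its behavior (snake_case to title case), and the module name is invented.

```elixir
defmodule InstructionsSketch do
  # Assumed behavior: "stock_levels" becomes "Stock Levels".
  def humanize_block_name(name) do
    name
    |> String.split("_", trim: true)
    |> Enum.map_join(" ", &String.capitalize/1)
  end

  # Same shape as generate_instructions/1 above, but public and
  # working on plain maps so it can run without a database.
  def generate_instructions(profile_blocks) do
    profile_blocks
    |> Enum.with_index(1)
    |> Enum.map_join("\n", fn {pb, index} ->
      block = pb.observation_block
      output_names = Enum.map_join(block.outputs, ", ", & &1.name)
      "#{index}. **#{humanize_block_name(block.name)}**: #{block.instruction_template} (Output: #{output_names})"
    end)
  end
end

profile_blocks = [
  %{
    observation_block: %{
      name: "person_detection",
      instruction_template: "Detect if any people are present in the image",
      outputs: [%{name: "person_present"}, %{name: "person_count"}]
    }
  }
]

IO.puts(InstructionsSketch.generate_instructions(profile_blocks))
# 1. **Person Detection**: Detect if any people are present in the image (Output: person_present, person_count)
```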
The alert monitor feedback loop is one of the things I’m most pleased with in this design. If a customer has set up an alert like “trigger HIGH severity when person_count > 5”, the prompt tells the LLM about it:
MONITORED CONDITIONS (these are scored as problems when they occur):
- person_count: Alert when is greater than 5 → HIGH severity
When these monitored conditions are present, they should heavily influence
the health_score downward.
This means the LLM knows which observations matter most for this specific camera’s business context — it weights its health score accordingly. The alert monitors are themselves data-driven, referencing observation outputs by ID, so they automatically work with any block type. It’s data all the way down.
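generate_monitored_conditions/1 isn’t shown here either. As a rough sketch, assuming each monitor can be reduced to a map with output_name, operator, threshold, and severity (field names invented for this example; the real monitors reference outputs by ID), the section could be assembled like this:

```elixir
defmodule MonitoredConditionsSketch do
  # Hypothetical operator vocabulary; the real set is not shown in the post.
  @operators %{gt: "is greater than", lt: "is less than", eq: "equals"}

  # No monitors: return "" so the caller's Enum.reject/2 drops the section.
  def generate_monitored_conditions([]), do: ""

  def generate_monitored_conditions(monitors) do
    lines =
      Enum.map_join(monitors, "\n", fn m ->
        severity = m.severity |> Atom.to_string() |> String.upcase()
        comparison = Map.fetch!(@operators, m.operator)
        "- #{m.output_name}: Alert when value #{comparison} #{m.threshold} → #{severity} severity"
      end)

    """
    MONITORED CONDITIONS (these are scored as problems when they occur):
    #{lines}
    When these monitored conditions are present, they should heavily influence
    the health_score downward.
    """
    |> String.trim_trailing()
  end
end
```

Returning an empty string for a camera without monitors matters: generate_profile_prompt/2 rejects empty parts before joining, so the prompt simply omits the section instead of printing an empty heading.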
Parsing the response back to blocks
This was the part that validated the whole approach for me. ResponseParser maps the LLM’s structured output back to observation records without any block-specific code. It uses the same block definitions that generated the schema:
def parse_llm_response(parsed_json, profile, analysis_id) when is_map(parsed_json) do
  with {:ok, schema} <- SchemaBuilder.build_gemini_request_schema(profile),
       :ok <- validate_with_schema(parsed_json, schema),
       {:ok, observations} <- build_observations(parsed_json, profile, analysis_id) do
    summary = Map.get(parsed_json, "summary")
    health_score = Map.get(parsed_json, "health_score")
    {:ok, %{observations: observations, summary: summary, health_score: health_score}}
  end
end
First, it rebuilds the schema and validates the response against it using ExJsonSchema. Then build_observations iterates through the profile’s blocks and looks up each value by name:
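defp build_observations(validated_json, profile, analysis_id) do
  # SchemaBuilder preloads its own copy of the profile, so preload here too.
  # This is a no-op if the caller already loaded the associations
  # (assumes Repo is aliased in this module, as in SchemaBuilder).
  profile = Repo.preload(profile, profile_blocks: [observation_block: :outputs])

  observations =
    profile.profile_blocks
    |> Enum.flat_map(fn pob ->
      block = pob.observation_block
      block_results = validated_json[block.name] || %{}

      Enum.map(block.outputs, fn output ->
        raw_value = Map.get(block_results, output.name)

        %{
          observation_block_id: block.id,
          observation_output_id: output.id,
          value: raw_value,
          metadata: %{"raw_value" => raw_value, "extracted_at" => DateTime.utc_now()}
        }
      end)
    end)

  {:ok, observations}
end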
The key insight: block.name is used to find the block’s data in the JSON response, and output.name is used to find each value within it. The same names that generated the schema are used to parse the response. No mapping table, no case statements, no block-specific parser code.
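The name-based lookup is easy to try in isolation. The sketch below strips the function down to plain maps (no Ecto, no metadata, invented module name and IDs) and shows what happens when a block is missing from the response: its outputs come back with a nil value rather than raising.

```elixir
defmodule NameLookupSketch do
  # Simplified stand-in for build_observations/3: block.name selects the
  # block's object in the response, output.name selects the value inside it.
  def build_observations(json, blocks) do
    Enum.flat_map(blocks, fn block ->
      block_results = Map.get(json, block.name) || %{}

      Enum.map(block.outputs, fn output ->
        %{
          observation_block_id: block.id,
          observation_output_id: output.id,
          value: Map.get(block_results, output.name)
        }
      end)
    end)
  end
end

blocks = [
  %{id: 1, name: "person_detection",
    outputs: [%{id: 10, name: "person_present"}, %{id: 11, name: "person_count"}]},
  %{id: 2, name: "stock_levels", outputs: [%{id: 20, name: "stock_status"}]}
]

json = %{"person_detection" => %{"person_present" => true, "person_count" => 3}}

NameLookupSketch.build_observations(json, blocks)
|> Enum.map(& &1.value)
|> IO.inspect()
# [true, 3, nil]
```

In the real pipeline the schema validation step runs first, so a missing required block would normally be rejected before this code sees it.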
Each observation record stores foreign keys to both the observation_block and observation_output it came from, plus metadata with the raw value and extraction timestamp. This makes it trivial to query downstream — “show me all stock_status observations for camera X in the last hour.”
Adding a new observation type
This is the payoff. Say a retail customer needs shelf stock monitoring:
- Create the block:
%ObservationBlock{
  name: "stock_levels",
  instruction_template: "Assess the stock levels of visible shelves. Look for empty sections, sparse areas, and fully stocked displays.",
  organization_id: org.id
}
- Create the outputs:
%ObservationOutput{
  observation_block_id: block.id,
  name: "stock_status",
  schema: %{"type" => "string", "enum" => ["critical", "low", "adequate", "full"]}
}

%ObservationOutput{
  observation_block_id: block.id,
  name: "empty_shelf_count",
  schema: %{"type" => "integer", "minimum" => 0}
}
- Attach to the camera’s profile:
%ProfileObservationBlock{
  profile_id: profile.id,
  observation_block_id: block.id,
  position: 3
}
That’s it. No code changes, no deployment. The next capture from any camera using this profile will automatically:
- Include “stock levels” in the generated prompt
- Include stock_status and empty_shelf_count in the JSON schema
- Validate that the LLM response includes them
- Create observation records linking back to the block and outputs
- Evaluate against any alert monitors set on those outputs
The output’s JSON Schema ({"type": "string", "enum": [...]}) constrains the LLM response at the API level — Gemini’s structured output mode enforces it. If you need a new output type tomorrow (a float, an array, a nested object), you define the schema in the database record and the pipeline handles it.
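To make the “no code changes” claim concrete, here is a stripped-down schema builder that works on plain maps. It is a sketch rather than the real SchemaBuilder: it skips the summary and health_score fields and the extension stripping, and the fill_ratio output is a made-up example of a float-typed output.

```elixir
defmodule SchemaFromDataSketch do
  # Every block becomes a required nested object; every output becomes a
  # required property carrying whatever JSON Schema the data defines.
  def build(blocks) do
    properties =
      Map.new(blocks, fn block ->
        {block.name,
         %{
           "type" => "object",
           "description" => block.instruction_template,
           "properties" => Map.new(block.outputs, &{&1.name, &1.schema}),
           "required" => Enum.map(block.outputs, & &1.name)
         }}
      end)

    %{
      "type" => "object",
      "properties" => properties,
      "required" => Enum.map(blocks, & &1.name)
    }
  end
end

blocks = [
  %{
    name: "stock_levels",
    instruction_template: "Assess the stock levels of visible shelves.",
    outputs: [
      %{name: "stock_status",
        schema: %{"type" => "string", "enum" => ["critical", "low", "adequate", "full"]}},
      # Hypothetical float output, added purely as data.
      %{name: "fill_ratio", schema: %{"type" => "number", "minimum" => 0, "maximum" => 1}}
    ]
  }
]

IO.inspect(SchemaFromDataSketch.build(blocks)["properties"]["stock_levels"]["required"])
# ["stock_status", "fill_ratio"]
```

The builder never inspects the output’s type, which is why a new kind of output needs no new code. Whether the LLM API accepts every JSON Schema keyword you put in that record is a separate question worth checking per provider.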
When things go wrong
Even with structured output constraints, the LLM occasionally returns something unexpected. Our first approach was to reject the entire response on any validation failure — but that meant a single bad output in a 10-output analysis threw away the other 9 valid observations. That felt too aggressive.
ResponseParser now validates the complete response against the generated JSON Schema before extracting observations:
defp validate_with_schema(json, schema) do
  resolved_schema = ExJsonSchema.Schema.resolve(schema)

  case ExJsonSchema.Validator.validate(resolved_schema, json) do
    :ok -> :ok
    {:error, validation_errors} -> {:error, {:validation_failed, validation_errors}}
  end
end
If validation fails, the error includes the specific field paths that didn’t match — "person_detection.person_count: Expected integer, got string" — making it straightforward to diagnose whether the issue is a bad schema definition or an LLM hallucination.
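Turning the validator’s errors into dotted field paths takes a small formatting step. The sketch below assumes the errors arrive as {message, path} tuples with paths like "#/person_detection/person_count", which is my understanding of ExJsonSchema’s default error format; check it against the version you run. The module name is invented.

```elixir
defmodule ValidationErrorSketch do
  # Assumed input: a list of {message, path} tuples, path in "#/a/b" form.
  def format_errors(errors) do
    Enum.map(errors, fn {message, path} -> "#{to_field_path(path)}: #{message}" end)
  end

  # "#/person_detection/person_count" becomes "person_detection.person_count".
  defp to_field_path(path) do
    segments =
      path
      |> String.trim_leading("#")
      |> String.split("/", trim: true)

    case segments do
      [] -> "(root)"
      _ -> Enum.join(segments, ".")
    end
  end
end

ValidationErrorSketch.format_errors([
  {"Type mismatch. Expected Integer but got String.", "#/person_detection/person_count"}
])
|> IO.inspect()
# ["person_detection.person_count: Type mismatch. Expected Integer but got String."]
```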
For partial success, the observation saving layer is resilient — if some observations fail changeset validation, it saves the valid ones and logs the failures rather than rejecting the entire analysis.
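The saving layer isn’t shown in this post, so the following is only a sketch of the idea under simple assumptions: validate and insert are passed in as functions (stand-ins for changeset validation and Repo.insert), and failures are printed to stderr where the real code would log.

```elixir
defmodule PartialSaveSketch do
  # validate: fn obs -> :ok | {:error, reason}   (stand-in for changeset checks)
  # insert:   fn obs -> saved_record             (stand-in for the database insert)
  def save_valid(observations, validate, insert) do
    {saved, failed} =
      Enum.reduce(observations, {[], []}, fn obs, {saved, failed} ->
        case validate.(obs) do
          :ok -> {[insert.(obs) | saved], failed}
          {:error, reason} -> {saved, [{obs, reason} | failed]}
        end
      end)

    # Report each failure instead of aborting the whole analysis.
    Enum.each(failed, fn {obs, reason} ->
      IO.puts(:stderr, "skipping observation #{inspect(obs)}: #{inspect(reason)}")
    end)

    {:ok, %{saved: Enum.reverse(saved), failed: Enum.reverse(failed)}}
  end
end
```

Returning the failures alongside the saved records lets the caller decide whether a high failure ratio should still mark the analysis as degraded.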
The full pipeline
The Executor orchestrates everything in prepare_and_execute_llm/1:
defp prepare_and_execute_llm(%Analysis{} = analysis) do
  with {:ok, profile} <- get_analysis_profile(analysis),
       {:ok, camera} <- get_camera(analysis),
       {:ok, image_binary} <- get_analysis_image_binary(analysis),
       {:ok, prompt} <-
         PromptGenerator.generate_profile_prompt(profile,
           camera: camera,
           alert_monitors: Alerts.list_alert_monitors(camera.id, active_only: true)
         ),
       {:ok, schema} <- SchemaBuilder.build_gemini_request_schema(profile),
       {:ok, response} <- AI.analyze_image(image_binary, prompt, schema: schema),
       {:ok, %{observations: observations, summary: summary, health_score: health_score}} <-
         ResponseParser.parse_llm_response(response, profile, analysis.id) do
    {:ok, observations, summary, health_score}
  end
end
The flow:
Profile (with blocks + outputs)
→ SchemaBuilder generates JSON Schema from block definitions
→ PromptGenerator assembles prompt from block instructions + alert context
→ AI.analyze_image sends image + prompt + schema to Gemini
→ Gemini returns structured JSON matching schema
→ ResponseParser validates response, maps values back to blocks by name
→ Executor saves observations + summary + health_score
→ Alert Evaluator checks observations against monitors
Each step in the pipeline is generic. SchemaBuilder doesn’t know what “person detection” means — it just knows how to turn blocks with outputs into a JSON Schema. ResponseParser doesn’t know what a “stock level” is — it just iterates blocks, looks up values by name, and creates observation records.
Trade-offs
This approach has some honest downsides:
- The instruction templates are hard to iterate on. When you’re tuning a prompt, you want to tweak wording and immediately see the effect. Having the instructions in the database means you need a UI or a console session to edit them — not a text file you can quickly change and redeploy. We built an admin UI for this, but it’s still more friction than editing a string in your codebase.
- Schema complexity is limited by JSON Schema. Some observation types would benefit from conditional logic (“if person_present is true, require person_count”). JSON Schema supports this with if/then/else, but it gets unwieldy fast. We’ve kept our schemas simple and relied on the LLM’s structured output mode to handle most of the constraint enforcement.
- Debugging is harder. When an observation comes back wrong, you need to check the block definition, the generated prompt, the generated schema, and the LLM response to figure out where the issue is. With hardcoded prompts, you’d just look at the code. The data-driven approach trades visibility for flexibility.
- Performance overhead of generating schemas on every call. We rebuild the schema and prompt for every analysis rather than caching them. The generation is fast (sub-millisecond), but if profiles changed rarely, caching would be more efficient. We chose simplicity — always regenerate — so we never have to worry about cache invalidation when a block is updated.
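For a sense of what the conditional case from the schema-complexity point looks like, here is the “if person_present is true, require person_count” rule written as a draft-07 JSON Schema map. This is an illustration, not something we run: the module name is invented, and I haven’t verified that Gemini’s structured output mode accepts if/then, so treat it as a rule for validating the response after the fact.

```elixir
defmodule ConditionalSchemaSketch do
  # Block-level schema: person_count is only required when person_present is true.
  def person_detection do
    %{
      "type" => "object",
      "properties" => %{
        "person_present" => %{"type" => "boolean"},
        "person_count" => %{"type" => "integer", "minimum" => 0}
      },
      "required" => ["person_present"],
      # The "required" inside "if" keeps the condition from matching
      # vacuously when person_present is absent.
      "if" => %{
        "properties" => %{"person_present" => %{"const" => true}},
        "required" => ["person_present"]
      },
      "then" => %{"required" => ["person_count"]}
    }
  end
end
```

Even this single rule nearly doubles the size of the block’s schema, and in this design it would have to live at the block level rather than on an individual output, which is part of why we have avoided it.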
The only place that knows the semantics is the database: the block names, the instruction templates, the output schemas. Change the data, change the behavior.
The entire pipeline (schema generation, prompt assembly, LLM API calls, response parsing, observation saving, alert evaluation) runs in a single Elixir application. There’s no Python microservice for the LLM integration, no polyglot bridge, no separate prompt management tool. Ecto gives us the data layer, Elixir’s with expressions give us the pipeline composition, and the LLM client is just another HTTP call via Req. It’s one codebase, one deploy, one set of patterns. For our use case, where new observation types are a weekly occurrence, the flexibility has been worth the trade-offs.