Roles in AI-Native Engineering
Understanding the key roles and responsibilities in building trustworthy AI-native systems
We Are Not AI Engineers. We Are Expectation & Integration Engineers.
This statement captures the mindset shift we want in the team: our primary job isn’t inventing new AI models, but engineering the expectations, context, and integrations around AI to solve real problems. Each team member will wear multiple hats to ensure the system is successful and reliable. Here are the key roles (often overlapping) that team members will take on, with checklists for onboarding into each role:
Expectations Engineer
- Gather and Document User Requirements: Work with stakeholders to understand what problems the AI should solve. Write these in clear user story format (e.g., “As a customer support rep, I want to quickly find the refund policy details by asking the AI”).
- Define Acceptance Criteria: For each requirement, specify how we’ll know if the AI solution works. Write UAT test cases or “given-when-then” scenarios before implementation (see the sketch after this list).
- Use AI to Brainstorm Edge Cases: Leverage GPT-4 or similar to suggest additional scenarios or test questions users might ask, to ensure completeness. Review and refine these suggestions – don’t accept them blindly, but use them to catch hidden requirements you might otherwise miss.
- Collaborate with QA/Users: Ensure non-developers review the expectations document and UAT plan. Facilitate feedback sessions to adjust expectations to what end-users actually need.
- Prioritize Clarity Over Technicality: Write all expectation documents in simple language (easy for non-English speakers to understand), focusing on outcomes. Avoid or explain jargon.
- Continuous Revalidation: Whenever scope changes or new insights are gained during development, update the expectations and UAT docs and get buy-in again.
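To make the “given-when-then” idea concrete, here is a minimal sketch of how a UAT scenario could be captured as structured data in TypeScript; the interface and field names are illustrative assumptions, not a prescribed format.

```ts
// Hypothetical shape for a UAT scenario; field names are illustrative, not a required format.
interface UatScenario {
  id: string;
  userStory: string; // "As a <role>, I want <goal>"
  given: string;     // precondition
  when: string;      // user action (the question asked)
  then: string;      // expected, verifiable outcome
}

const refundPolicyScenario: UatScenario = {
  id: "UAT-001",
  userStory: "As a customer support rep, I want to quickly find the refund policy details by asking the AI",
  given: "The refund policy document is loaded into the knowledge base",
  when: "The user asks 'What is the refund window for online orders?'",
  then: "The answer states the correct window and cites the refund policy section it came from",
};
```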
Source Document Preparation Engineer
- Master Markdown and Conversion Tools: Be comfortable with Markdown syntax and tools like MarkItDown. Practice converting various file types to Markdown and cleaning up the output.
- Master semantic HTML and Conversion Tools: Be comfortable with HTML syntax and typical CSS selectors for querying HTML. Practice converting various file types to HTML and cleaning up the output.
- Establish Document Structure: For each source, determine the logical sections (headings, subheadings, lists). Ensure the Markdown or HTML reflects this hierarchy accurately. If unsure, ask domain experts about the document’s structure.
- Apply YAML Frontmatter (or HTML `<meta>` tags): Create a consistent template for metadata (e.g., always include `title`, `category`, `version`, `last_updated`). Ensure every Markdown or HTML file starts with this and is filled in properly (see the validation sketch after this list).
- Manual Cleanup and Verification: After conversion, manually read through the Markdown or HTML (in a browser). Fix any broken lists, misrecognized characters, or formatting issues. Confirm that important information (like tables or images) is preserved in text form (e.g., images might need alt-text or transcribed text).
- Chunking Strategy: Decide how to break documents into referenceable chunks. This could be by heading (each top-level section is a chunk) or by paragraph. Ensure each chunk can stand alone to some extent and has an identifier (like a slug or number).
- Quality Check Each Document: Before adding to the knowledge base, double-check that the document is complete (no missing pages or sections), correct (no OCR errors or typos introduced), and concise (remove any pages not relevant, like a blank appendix). Get a second pair of eyes if possible (peer review the Markdown or HTML).
- Document the Process: Keep notes on how each type of document was processed (e.g., “Playbooks: used custom script to split steps into list items.”). This builds institutional knowledge and helps onboarding others.
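As a concrete aid for the frontmatter item above, here is a minimal sketch of a metadata check, assuming a Node/TypeScript script and the gray-matter package; the file path in the usage note is hypothetical.

```ts
// Minimal frontmatter check for prepared Markdown files (sketch; required keys mirror our template).
import { readFileSync } from "node:fs";
import matter from "gray-matter"; // parses YAML frontmatter from a Markdown string

const REQUIRED_KEYS = ["title", "category", "version", "last_updated"];

export function checkFrontmatter(path: string): string[] {
  const { data } = matter(readFileSync(path, "utf8"));
  // Report any required metadata keys that are missing or empty.
  return REQUIRED_KEYS.filter((key) => data[key] === undefined || data[key] === "");
}

// Example usage (hypothetical file):
// const missing = checkFrontmatter("docs/refund-policy.md");
// if (missing.length > 0) console.warn(`refund-policy.md is missing: ${missing.join(", ")}`);
```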
Chunking & Sectioning Expert
- Design Deterministic Chunking Rules: Create rules for how large a text chunk should be (e.g., ~200 tokens) and how to split content (by sentences, by semantic unit, etc.). Ensure the same input always yields the same chunks – this aids caching and citation stability.
- Implement Chunking Scripts: Write scripts (in Python/Deno/Node) to automate chunking according to the rules. For example, a script might read a Markdown or HTML file and output JSON with each section and subsection as a separate entry (see the sketch after this list).
- Include Section References: Make sure each chunk carries metadata linking it to the source document and section (e.g., a file name and section heading). This could be an ID like “PolicyX_sec_3.2”. These IDs will be used in citations.
- Optimize for Retrieval: Consider how the chunks will be retrieved. They should be neither too large (to avoid irrelevant info in a chunk) nor too small (to avoid losing context). Fine-tune the size by testing: e.g., if a question about a section yields two half-overlapping chunks, maybe they should have been one.
- Prevent Overlap & Gaps: Ensure chunks cover the whole document without omissions, and consider slight overlaps if needed for context (but be careful with overlapping text – it can confuse the AI if duplicates appear). A common strategy is overlapping windows for text, which you might implement if necessary.
- Manual Review of Chunks: Even after scripting, spot-check the chunk outputs for a few documents. Do the chunk boundaries make sense? Would a user understand each chunk on its own? Adjust rules as needed and rerun.
- Logging and Traceability: When integrating with the backend, ensure that if a chunk is fetched, you can trace it back to the source easily. For example, keep a mapping of chunk ID to original document file and line range. This is important when debugging why a certain piece of text was retrieved.
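Here is a minimal sketch of a deterministic, heading-based chunker in TypeScript, as referenced in the list above; the chunk shape, the ID scheme, and the choice to split on heading levels 1–3 are assumptions to adapt to our actual rules.

```ts
// Deterministic heading-based chunking sketch; same input always yields the same chunks and IDs.
interface Chunk {
  id: string;      // e.g. "refund-policy_sec_2" (illustrative ID scheme)
  source: string;  // source document file name
  heading: string; // section heading the chunk belongs to
  text: string;    // section body
}

export function chunkMarkdown(source: string, markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = "Introduction"; // label for any text before the first heading
  let body: string[] = [];
  let section = 0;

  const flush = () => {
    const text = body.join("\n").trim();
    if (text) {
      chunks.push({ id: `${source}_sec_${section}`, source, heading, text });
      section += 1;
    }
    body = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^(#{1,3})\s+(.*)/.exec(line); // treat heading levels 1-3 as chunk boundaries
    if (match) {
      flush();
      heading = match[2].trim();
    } else {
      body.push(line);
    }
  }
  flush();
  return chunks;
}
```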
Ontology Engineer
- Identify Key Entities and Relationships: As you familiarize with documents, note important entities (people, products, processes, etc.) and how they relate. E.g., “Product A is part of Category X”, “Department Y handles Process Z”.
- Cross-Reference Content: Build a simple ontology or knowledge graph that links related information across documents. For example, if “Client Onboarding Process” is mentioned in both a sales guide and a support manual, that’s a link.
- Use AI to Suggest Connections: You can prompt an LLM to extract entities and relations from text, but verify them yourself. The result might be a set of triples like (Process Z – defined_in – Doc3 Section 2); see the sketch after this list.
- Map Ontology to Implementation: If feasible, incorporate this structure in the system. This could be as simple as tags in the YAML frontmatter (“tags: [onboarding, sales]”) that connect documents on similar topics. Or, if using a graph database, input these relations there.
- Leverage Ontology in Context Engineering: Use the ontology to improve retrieval. For instance, if a question mentions “Client Onboarding”, your system could know to pull not just the section on onboarding from the policy, but also a related FAQ from another doc due to the relationship. Plan how the MCP agent or retrieval logic can utilize these links (this might come later as an enhancement).
- Guardrail via Knowledge Graph: Recognize that a knowledge graph can act as a fact-check layer. For example, if the AI says “Product A was launched in 2020”, and your ontology knows launch dates, you can verify and correct the AI. Think about where such validations might be critical for your domain (dates, numbers, hierarchical info).
- Document the Ontology: Clearly document the schema of your ontology (even if informal). List the entity types and relation types you considered. This helps future team members and also in discussions with domain experts to validate that your understanding of the domain is correct.
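A minimal sketch of how such triples could be stored and queried in TypeScript; the entities, relations, and document names are invented for illustration.

```ts
// Tiny ontology sketch: store (subject, relation, object) triples and use them to find related sources.
type Triple = { subject: string; relation: string; object: string };

const triples: Triple[] = [
  { subject: "Client Onboarding Process", relation: "defined_in", object: "sales-guide.md#onboarding" },
  { subject: "Client Onboarding Process", relation: "mentioned_in", object: "support-manual.md#faq" },
  { subject: "Product A", relation: "part_of", object: "Category X" },
];

// Given an entity detected in a question, list the documents/sections linked to it.
export function relatedSources(entity: string): string[] {
  return triples
    .filter((t) => t.subject === entity && (t.relation === "defined_in" || t.relation === "mentioned_in"))
    .map((t) => t.object);
}

// relatedSources("Client Onboarding Process")
// -> ["sales-guide.md#onboarding", "support-manual.md#faq"]
```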
Context Engineer
- Master Prompt Design: Be skilled at writing effective system and user prompts. Include clear instructions, role definitions, and formatting requirements in the system prompt. E.g., “You are a corporate Q&A assistant. Always answer with a polite tone and provide a source citation for each fact you state.”
- Define Few-Shot Examples if Needed: Sometimes giving the model an example QA pair improves reliability. Design a few, if appropriate, using real document content. For instance, provide an example question about a document and a well-structured answer with a citation as a guide.
- Manage Context Window Usage: Understand token limits and plan what to include. As a context engineer, you decide, for each query, which document chunks (and how many) to include. Develop strategies that help the model, such as always including a document’s introduction section for context or always including the document title in the prompt (see the sketch after this list).
- Avoid Irrelevant Context: Too much context can confuse the model. Ensure your retrieval (or manual selection) is precise. If you see the model outputting information from an unrelated chunk, tighten the retrieval filters or adjust the prompt to ignore certain content.
- Prompt Guardrails: Include instructions to handle uncertainties (e.g., “If you do not find an answer in the provided content, say you don’t know”). This is vital to prevent hallucinations when context is insufficient.
- Iterative Prompt Tuning: Test the prompts with various sample questions (from UAT) and refine. Change one thing at a time (like adding a sentence “Answer in bullet points if applicable.”) and observe. Keep prompts as general as possible but as specific as necessary.
- Stay Updated on LLM Behavior: Different models (GPT-4, Claude, local LLMs) have quirks. As we might switch providers for cost or privacy, ensure prompts are compatible or easily tweakable for different LLMs. Document any model-specific prompt adjustments.
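A minimal sketch of prompt assembly under a token budget, combining the system prompt and guardrail wording from the items above; the budget value, the rough token estimate, and the message shape are assumptions, not tuned values or any specific provider’s API.

```ts
// Sketch: assemble a grounded prompt from retrieved chunks without exceeding a context budget.
interface RetrievedChunk { id: string; title: string; text: string }

const SYSTEM_PROMPT =
  "You are a corporate Q&A assistant. Answer politely and provide a source citation (chunk ID) for each fact. " +
  "If you do not find an answer in the provided content, say you don't know.";

const roughTokens = (s: string) => Math.ceil(s.length / 4); // crude token estimate (assumption)

export function buildMessages(question: string, chunks: RetrievedChunk[], budget = 3000) {
  const context: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const block = `[${chunk.id}] ${chunk.title}\n${chunk.text}`;
    if (used + roughTokens(block) > budget) break; // stop before blowing the context budget
    context.push(block);
    used += roughTokens(block);
  }
  return [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: `Context:\n${context.join("\n\n")}\n\nQuestion: ${question}` },
  ];
}
```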
Trust Engineer
- Establish Verification Steps: Design how the system will verify its answers. For instance, after generating an answer, programmatically check that each citation actually supports the sentence it’s attached to (perhaps by re-matching keywords). This could be a simple regex or a semantic similarity check between answer and source (see the sketch after this list).
- Implement “Citation Needed” Checks: If the AI returns an answer without a citation or with an obviously incorrect citation, catch that in the backend. Decide on a policy: either refuse to answer (“I’m sorry, I don’t have sufficient information.”) or attempt a second pass.
- Prevent Out-of-Scope Answers: Use a content filter for the AI’s own responses. If the user asks something unrelated to the documents (e.g., “Who won the game last night?”), the system should not answer with made-up information. A trust engineer might implement a check: if retrieval finds nothing and the question is out-of-domain, the AI should politely decline.
- Add Rule-Based Corrections: Some facts can be verified via external APIs or known constants. For example, if internal policy documents have specific versions, ensure the AI cites the correct version number (maybe cross-check the YAML metadata). Build small rules or use existing guardrail libraries to handle these.
- User Feedback Loop: Provide a mechanism for users to flag answers as incorrect or unhelpful. This could be as simple as a thumbs-down button on the UI. As trust engineer, ensure this feedback is logged and results in an action (like adding the question to a test case regression suite, or notifying an admin to update documents).
- Auditing and Logs: Keep comprehensive logs of AI outputs and the source content used. Periodically review a sample for accuracy. This is especially important initially to catch any hallucination the model might slip in. (For example, if you see an answer with a citation that doesn’t actually support the answer, that's a red flag to tighten the system.)
- Transparency to Users: Implement features that increase user trust, such as highlighting which part of the document the answer came from when they hover on a citation. Also, consider a brief disclaimer in UI (e.g., “Answers are based on company documentation; always verify critical decisions with the source.”). As trust engineer, you advocate for these transparency measures in the product.
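A minimal sketch of the citation check described above, using keyword overlap; it assumes citations appear inline as chunk IDs in square brackets, and the keyword heuristic is deliberately crude.

```ts
// Sketch: every cited chunk must exist in the retrieved set, and the cited chunk should share
// at least one keyword with the sentence that cites it. Thresholds and regexes are assumptions.
interface SourceChunk { id: string; text: string }

// Crude keyword extraction: lowercase words of 4+ letters.
const keywords = (s: string) => new Set(s.toLowerCase().match(/[a-z]{4,}/g) ?? []);

export function citationsLookSupported(answer: string, sources: SourceChunk[]): boolean {
  const byId = new Map<string, SourceChunk>(sources.map((s) => [s.id, s]));
  for (const sentence of answer.split(/(?<=[.!?])\s+/)) {
    // Assume citations appear inline as [chunk-id], e.g. [PolicyX_sec_3.2].
    for (const [, id] of sentence.matchAll(/\[([^\]]+)\]/g)) {
      const source = byId.get(id);
      if (!source) return false; // cites a chunk we never retrieved
      const sourceWords = keywords(source.text);
      const overlaps = [...keywords(sentence)].some((w) => sourceWords.has(w));
      if (!overlaps) return false; // citation shares no keywords with the claim it supports
    }
  }
  return true;
}
```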
Integration Engineer
- Master the Toolchain: Be familiar with all moving parts – LLM APIs, AnythingLLM platform, vector DB, front-end components, etc. Your job is to make them talk to each other smoothly.
- Use APIs, Don’t Reinvent: Wherever possible, use existing APIs or SDKs. For example, use AnythingLLM’s REST API to query documents, use Vanna’s SDK for SQL, etc. Avoid writing low-level code if an integration exists.
- Orchestrate Workflows: Write the glue code that connects user input to the AI pipeline. For example, ensure the sequence: UI call -> backend receives -> calls AnythingLLM (or our agent function) -> gets answer -> returns to UI. Handle async or streaming if we want token streaming to the UI for a better experience.
- Error Handling and Retries: Integrate robust error handling. If the LLM API fails or times out, have a strategy (retry once, then give the user a graceful error message). If the vector DB is down, maybe fall back to keyword search as a degraded mode. Users should see a friendly message rather than a crash (see the sketch after this list).
- Monitoring and Metrics: Integrate logging of performance (response times, etc.) and usage metrics (number of queries per day, which docs are referenced most). This might involve linking to an analytics tool or just printing logs that can be parsed. It’s important for scaling and demonstrating value.
- Security and Access Control: Since this is for enterprise internal use, ensure the integrations respect any access controls. For instance, if some documents are confidential and only certain users can query them, enforce that either at retrieval time or by separating workspaces. Implement API authentication for the backend if needed (even simple API keys or OAuth if integrating with corporate SSO).
- Configuration Management: Use config files or environment variables for things like API keys, model choices, etc., rather than hardcoding. Integrators should set up a .env or config system and document how to switch (say from OpenAI to Azure OpenAI).
- Stay Modular: Make each integration loosely coupled. E.g., the UI calls a backend route – whether that route uses AnythingLLM or some other library inside is abstracted from the UI. This modularity means if we swap a component, other parts don’t break. The integration engineer ensures clean interfaces between components (maybe following MCP standards or RESTful principles).
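A minimal sketch of the retry-then-degrade behavior described above; askLLM is a hypothetical placeholder for whatever client call we actually use (AnythingLLM’s REST API, an OpenAI SDK call, etc.).

```ts
// Sketch: retry the LLM call once, then return a friendly fallback instead of surfacing a raw error.
async function withRetry<T>(fn: () => Promise<T>, retries = 1): Promise<T> {
  try {
    return await fn();
  } catch (err) {
    if (retries <= 0) throw err;
    return withRetry(fn, retries - 1); // one more attempt, then give up
  }
}

export async function answerQuestion(question: string, askLLM: (q: string) => Promise<string>) {
  try {
    return { ok: true, answer: await withRetry(() => askLLM(question)) };
  } catch {
    // Surface a friendly message to the UI instead of a stack trace.
    return { ok: false, answer: "Sorry, the assistant is temporarily unavailable. Please try again." };
  }
}
```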
User Interface Integrator
- Apply UI Kits Correctly: Get hands-on with the assistant-ui or Kibo UI documentation. Practice adding the components to a sample React project. Ensure you know how to customize things like the theme or minor layout tweaks without hacking the library (usually they allow className overrides or have context providers).
- Connect UI to Backend API: Implement the API calls in the frontend. Likely using fetch/Axios to call our ask endpoint. Handle loading states – e.g., disable the input and show “Thinking…” while waiting for response.
- Render AI Responses Safely: The AI will return markdown. Use a safe markdown renderer to display it (to avoid any malicious HTML if it ever came through). Many UI kits or libraries like React Markdown can do this. Enable the features we need (e.g., rendering footnote links).
- Implement Citation UI: Make the citations clickable. For example, if the answer text has `[1]` and `[2]` as footnotes, you can map those to a list of source links below or a popover. This likely involves capturing the sources from the backend JSON (the backend might return the answer and an array of sources). Ensure that when the user clicks a citation, it either opens the document or shows the snippet (see the sketch after this list).
- Responsive and Accessible Design: Ensure the chat UI works on different screen sizes (people may use it on their phones or small laptop screens). Use the responsive classes from Tailwind or the UI kit’s grid system. Also ensure basic accessibility – e.g., screen reader labels and focus order (this is usually handled by shadcn/ui best practices, but keep an eye out).
- Test with Real Users: As the UI integrator, do some UAT yourself from the front-end perspective. Does the flow make sense? Is it clear when the system is loading? Are errors handled (e.g., if the backend returns a 500, show “Sorry, something went wrong” in the chat)? Let a few users try it and incorporate their feedback on UI/UX.
- Iterate UI Features: Maybe users want the ability to upload a file for context, or save a conversation, etc. Gather these enhancement requests and see if our chosen UI kit supports them (for example, assistant-ui might support multi-conversation sessions). Plan these for future sprints without bloating the initial release.
- Maintain Clean UI Code: Even though we rely on a kit, our own UI code (pages, integration logic) should be clean and commented. Follow the project conventions (if using Next.js, follow its file structuring, etc.). This makes it easier for new team members to jump in, especially since we might scale the team to maintain the UI separately.
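A minimal sketch of the ask flow in React/TypeScript, tying together the loading state, safe Markdown rendering via react-markdown, and a clickable source list; the /api/ask route and the response shape are assumptions about our backend contract.

```tsx
// Sketch: call the backend, show a loading state, render the Markdown answer, list sources as links.
import { useState } from "react";
import ReactMarkdown from "react-markdown";

type AskResponse = { answer: string; sources: { id: string; url: string; title: string }[] };

export function AskPanel() {
  const [question, setQuestion] = useState("");
  const [result, setResult] = useState<AskResponse | null>(null);
  const [loading, setLoading] = useState(false);

  async function ask() {
    setLoading(true);
    try {
      const res = await fetch("/api/ask", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ question }),
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      setResult(await res.json());
    } catch {
      // Friendly in-chat error rather than a crash.
      setResult({ answer: "Sorry, something went wrong.", sources: [] });
    } finally {
      setLoading(false);
    }
  }

  return (
    <div>
      <input value={question} onChange={(e) => setQuestion(e.target.value)} disabled={loading} />
      <button onClick={ask} disabled={loading}>{loading ? "Thinking…" : "Ask"}</button>
      {result && (
        <div>
          <ReactMarkdown>{result.answer}</ReactMarkdown>
          <ol>
            {result.sources.map((s) => (
              <li key={s.id}><a href={s.url}>{s.title}</a></li>
            ))}
          </ol>
        </div>
      )}
    </div>
  );
}
```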
By rotating through or combining these roles, our small team (3-5 people) covers all necessary skills to deliver the product. One person might be primarily backend (MCP, integration) and another frontend, but everyone should understand the whole picture. The list above serves as a checklist for each area of responsibility – it can be used to onboard new team members or to self-assess if we’ve covered all bases.
As AI-native systems require precision in prompting and context design, refer to the Everyone Is an Individual Contributor Manifesto for key principles on context engineering and how every role becomes an IC in the AI-native world.