
Bots

Bots are the foundational object of the Patterns platform and can be broken down into two layers:

  1. The brain (aka the LLM)

  2. The body (aka the data)

The brain is an LLM powered by OpenAI's GPT-4, with a context management system responsible for retrieving the relevant bits of information needed to complete its task, such as generating correct SQL to answer a question. When asked a question about a table, the Analyst retrieves the table's schema, a sample of the data, and the table's documentation readme.

The body is all of the data and metadata connected to the Analyst. This includes the connected database; the database's metadata such as schemas, data samples, and definitions; and any added documents in the form of markdown files.

You can create multiple Analysts with different data and context to provide different behaviors and functionality. For example, a company may have an Analyst that specializes in analytics for marketing data, while the finance team may have a separate Analyst that only has access to sensitive financial data and specializes in queries against that dataset.

System Prompt

The base Analyst has a system prompt that instructs it how to execute database analytics within the environment we provide.

The platform works exactly as described above: the AI Analyst generates code, which is sent to our proprietary execution system to retrieve data and generate data visualizations.

Context Metadata

When you add extra context, your Analyst will generate correct answers more often.

Implied Table Context

When you connect data and load tables, Patterns automatically adds each table's schema and a sample of its data to the Analyst's context.

Depending on the structure of your data, implied context alone should give your Analyst good performance for databases with up to ~30 tables. If your data is in a raw format and hasn't yet been modeled for analytics, you may need to add more context and/or model more interpretable tables.
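As a rough illustration of what implied table context contains, the sketch below collects a table's schema and a small data sample, here from an in-memory SQLite database. The `implied_context` helper and the `payments` table are hypothetical examples, not part of the Patterns platform itself.

```python
import sqlite3

def implied_context(conn: sqlite3.Connection, table: str, sample_rows: int = 3) -> dict:
    """Collect the schema and a small data sample for one table,
    roughly mirroring the implied context an Analyst receives."""
    cur = conn.execute(f"PRAGMA table_info({table})")
    schema = [(row[1], row[2]) for row in cur.fetchall()]  # (column name, declared type)
    sample = conn.execute(f"SELECT * FROM {table} LIMIT {sample_rows}").fetchall()
    return {"table": table, "schema": schema, "sample": sample}

# Demo with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL, is_recurring INTEGER)")
conn.executemany("INSERT INTO payments (amount, is_recurring) VALUES (?, ?)",
                 [(10.0, 1), (99.0, 0), (10.0, 1)])
ctx = implied_context(conn, "payments")
print(ctx["schema"])  # [('id', 'INTEGER'), ('amount', 'REAL'), ('is_recurring', 'INTEGER')]
```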

  • Schemas are automatically pulled from the database, including column names, types, and foreign key relationships.

  • Table Readmes include optional column definitions, or any tips and tricks someone should know when querying a table. For example, you can tell your Analyst that when querying the funding_rounds table, round values are seed, series_a, series_b, or series_c. Or, if you were calculating MRR and needed to filter out one-time payments, be sure to only select payments where is_recurring=1.
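Drawing on the examples above, a Table Readme might look like the following. The column names and descriptions are illustrative assumptions, not a required format.

```markdown
# funding_rounds

One row per funding event.

## Columns
- `round_type` — one of `seed`, `series_a`, `series_b`, or `series_c`.
- `raised_usd` — amount raised, in US dollars.

## Query tips
- Filter rounds by `round_type`, not by free-text name matching.
```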

Documents as Context

You can add arbitrary text in the form of markdown documents to the Agent's Context. Define metrics, document APIs, or load an entire knowledge base. Patterns will index the context to retrieve relevant Documents when answering a question.
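To give a feel for how document retrieval can work, here is a deliberately naive keyword-overlap ranker over a set of markdown documents. Patterns' actual indexing is not public; the `score` and `retrieve` helpers and the sample docs are assumptions for illustration only.

```python
def score(doc: str, question: str) -> int:
    """Count question keywords (3+ characters) that appear in the document."""
    words = {w.strip(".,?!").lower() for w in question.split() if len(w.strip(".,?!")) >= 3}
    text = doc.lower()
    return sum(w in text for w in words)

def retrieve(docs: dict[str, str], question: str, k: int = 1) -> list[str]:
    """Return the names of the k highest-scoring documents."""
    ranked = sorted(docs, key=lambda name: score(docs[name], question), reverse=True)
    return ranked[:k]

docs = {
    "metrics.md": "MRR is the sum of recurring payments where is_recurring = 1.",
    "api.md": "The companies endpoint returns category_tags for each company.",
}
print(retrieve(docs, "How do we calculate MRR from payments?"))  # ['metrics.md']
```

A production system would use embeddings rather than keyword overlap, but the shape is the same: rank documents by relevance to the question and inject the top hits into the context.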

Custom Global Context

  • Custom Global Context is a custom prompt that is appended to every conversation. For example, here you can tell Patterns:

    • about your business — we're a VC fund and you are a venture capital data analyst

    • tell it how you prefer it to write SQL — order all of your data lists and charts in descending order with the biggest first, always exclude nulls.

    • hints for how to answer analytical questions like — To search for similar companies within an industry or sector, use ILIKE with category_tags from the companies table. Use either/or to generate a union of matches to any of the category_tags.

    • give it hints for how you prefer it to retrieve data such as — when users ask generic questions like "what investments did a16z make this year" in your response provide a comprehensive list of relevant columns

    • or even give it personality and tell it how to respond to users — respond like yoda
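One way to picture how global context combines with the system prompt is simple string assembly, as sketched below. `BASE_PROMPT` and `build_prompt` are placeholders, not Patterns' real prompt or internals.

```python
# Placeholder standing in for the Analyst's real system prompt
BASE_PROMPT = "You are a data analyst. Generate SQL to answer the user's question."

def build_prompt(global_context: list[str], question: str) -> str:
    """Append each custom global-context rule to the base system prompt."""
    rules = "\n".join(f"- {r}" for r in global_context)
    return f"{BASE_PROMPT}\n\nGlobal context:\n{rules}\n\nQuestion: {question}"

prompt = build_prompt(
    ["We're a VC fund; you are a venture capital data analyst.",
     "Order all lists and charts in descending order; always exclude nulls."],
    "What investments did a16z make this year?",
)
print(prompt)
```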

DBT and other metadata repositories

Patterns works alongside metadata tools like DBT: you can copy/paste DBT models and other similar assets into Table Readmes and Documents. In the future, you will be able to automatically sync Patterns with your DBT Cloud or Core project.

Analyses as Context

When answering a question, the system consults previously saved queries to provide more efficient and contextually relevant solutions. For instance, to calculate a specific customer's Lifetime Value (LTV), the system can refer to an older query that performed a similar calculation, and then modify it by updating the 'where' clause to focus on the desired customer.

An Analysis is a comprehensive package that includes a SQL file, a data table, a Vega-Lite chart, and a descriptive file. Storing these Analyses boosts the system's performance, especially when users ask questions that pertain to already saved Analyses.
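The LTV example above amounts to taking a saved query and tightening its `WHERE` clause. A minimal sketch of that reuse pattern, assuming a hypothetical `payments` table with `customer_id`, `amount`, and `is_recurring` columns:

```python
# A previously saved Analysis query (hypothetical schema)
saved_sql = (
    "SELECT customer_id, SUM(amount) AS ltv "
    "FROM payments WHERE is_recurring = 1 "
    "GROUP BY customer_id"
)

def scope_to_customer(sql: str, customer_id: int) -> str:
    """Narrow a saved LTV query to one customer by extending its WHERE clause."""
    return sql.replace(
        "WHERE is_recurring = 1",
        f"WHERE is_recurring = 1 AND customer_id = {customer_id}",
    )

print(scope_to_customer(saved_sql, 42))
```

In practice the Analyst rewrites the SQL itself rather than doing literal string substitution, but the saved Analysis gives it a correct, tested starting point.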

