The AI Analyst is the foundation of the Patterns data platform and can be broken down into two layers:
1. The brain (aka the LLM)
2. The body (aka the data)
The brain is an LLM powered by OpenAI's GPT-4 with a context management system that retrieves relevant bits of information to help it complete its task, such as generating correct SQL to answer a question. When asked a question about a table, the Analyst retrieves the table's schema, a sample of the data, and the table's documentation readme.
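As a rough illustration of the retrieval step just described, the sketch below gathers a table's schema, a few sample rows, and its readme into one block of text. It is not the Analyst's actual implementation: the `build_table_context` helper, the `payments` table, and the use of SQLite are all assumptions made for the example.

```python
import sqlite3


def build_table_context(conn, table, readme="", sample_rows=3):
    """Collect a table's schema, a few sample rows, and its readme as one text block."""
    # PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    schema = ", ".join(f"{name} {col_type}" for _, name, col_type, *_ in columns)
    rows = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)).fetchall()
    sample = "\n".join(str(row) for row in rows)
    return f"Table: {table}\nSchema: {schema}\nSample rows:\n{sample}\nReadme: {readme}"


# Hypothetical table and data, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, amount REAL, is_recurring INTEGER)")
conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", [(1, 50.0, 1), (2, 200.0, 0)])

context = build_table_context(
    conn, "payments", readme="is_recurring=1 marks subscription payments."
)
print(context)
```

Text assembled this way could then be placed in front of the user's question before it is sent to the model.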
The body is all of the data and metadata connected to the Analyst. This includes the connected database; the database's metadata, such as schemas, data samples, and definitions; and any added documents in the form of markdown files.
You can create multiple Analysts with different data and context, giving each one different behaviors and functionality. For example, a company may have an Analyst that specializes in analytics for marketing data, while the finance team may have a separate Analyst that only has access to sensitive financial data and specializes in queries against that dataset.
The base Analyst has a system prompt that instructs it how to execute database analytics within the environment we provide for it:
The platform works exactly as described above. The AI Analyst generates code that is sent to our proprietary execution system, which retrieves data and generates data visualizations.
When you add extra context, your Analyst will generate correct answers more often.
When you connect data and load tables, your Analyst automatically adds the table schema and a sample of the data to the Analyst's Context.
Depending on the structure of your data, with implied context alone, your Analyst should have good performance for databases with up to ~30 tables. If your data is in a raw format and hasn't yet been modeled for analytics, you may need to add more context and/or model more interpretable tables.
- Schemas are automatically pulled from the database, including column names, types, and foreign key relationships.
- Table Readmes include optional column definitions, or any tips and tricks that someone should know when querying a table. For example, you can tell your Analyst that when querying the `funding_rounds` table to use `series_c`. Or, if you are calculating MRR and need to filter out one-time payments, tell it to only select payments where `is_recurring=1`.
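To make the MRR tip concrete, here is a small sketch of the kind of query such a readme note steers the Analyst toward. Only `is_recurring` comes from the text above. The `payments` table, its other columns, and the sample rows are made up for illustration, and all rows are assumed to fall in the same month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL, is_recurring INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [(1, 100.0, 1), (2, 250.0, 1), (2, 900.0, 0)],  # the last row is a one-time payment
)

# Following the readme tip: count only recurring payments toward MRR.
mrr = conn.execute(
    "SELECT SUM(amount) FROM payments WHERE is_recurring = 1"
).fetchone()[0]
print(mrr)
```

Without the filter, the one-time payment of 900.0 would be counted as recurring revenue.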
You can add arbitrary text in the form of markdown documents to the Agent's Context. Define metrics, document APIs, or load an entire knowledge base. Atoma will index the context to retrieve relevant Documents when answering a question.
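How the index works internally is not described here. As a stand-in, the sketch below ranks documents by how many words they share with the question, which is one very simple way to pick a relevant Document. The `tokenize` and `retrieve` helpers and the two sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the names of the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda name: len(question_words & tokenize(documents[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical markdown documents added to the context.
documents = {
    "mrr.md": "MRR is monthly recurring revenue: the sum of payments where is_recurring=1.",
    "funding.md": "The funding_rounds table lists each investment round per company.",
}

best = retrieve("How do we define monthly recurring revenue?", documents)
print(best)
```

A production system would more likely use embeddings or a search index, but the shape is the same: score each document against the question and pass the best matches to the model.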
- A custom prompt that is appended to every conversation. For example, here you can tell Atoma:
  - about your business: `we're a VC fund and you are a venture capital data analyst`
  - how you prefer it to write SQL: `order all of your data lists and charts in descending order with the biggest first, always exclude nulls.`
  - hints for how to answer analytical questions: `To search for similar companies within an industry or sector, use ILIKE with category_tags from the companies table. Use either/or to generate a union of matches to any of the category_tags.`
  - hints for how you prefer it to retrieve data: `when users ask generic questions like "what investments did a16z make this year" in your response provide a comprehensive list of relevant columns`
  - or even a personality and how to respond to users: `respond like yoda`
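The `ILIKE` hint in the list above could lead to SQL along the following lines. This is a sketch under assumptions: `ILIKE` is PostgreSQL syntax, the `%s` placeholders follow the psycopg convention, and the `name` column and the `similar_companies_query` helper are invented for the example. Only `category_tags` and `companies` come from the hint itself.

```python
def similar_companies_query(tags):
    """Build a parameterized query matching companies whose category_tags contain any tag."""
    if not tags:
        raise ValueError("at least one tag is required")
    # One ILIKE condition per tag, joined with OR so a match on any tag qualifies.
    conditions = " OR ".join("category_tags ILIKE %s" for _ in tags)
    sql = f"SELECT name FROM companies WHERE {conditions}"
    # Wrap each tag in wildcards so it can match anywhere inside category_tags.
    params = [f"%{tag}%" for tag in tags]
    return sql, params


sql, params = similar_companies_query(["fintech", "payments"])
print(sql)
print(params)
```

Joining the conditions with `OR` is what produces the union of matches: a company qualifies if any one of the tags appears in its `category_tags`.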
We support native integrations with tools like dbt. You can copy and paste dbt models and other similar assets into Table Readmes and Documents. In the future, you will be able to automatically sync Patterns with your dbt Cloud or dbt Core project.
When answering a question, the system consults previously saved queries to provide more efficient and contextually relevant solutions. For instance, to calculate a specific customer's Lifetime Value (LTV), the system can refer to an older query that performed a similar calculation, and then modify it by updating the 'where' clause to focus on the desired customer.
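The LTV example can be pictured as a saved query whose filter value is the only part that changes. The sketch below assumes a very simple definition of LTV (total payments per customer) and a hypothetical `payments` table. Neither is specified by the platform.

```python
import sqlite3

# Hypothetical saved query: LTV as total payments for one customer, with the
# customer left as a parameter so only the WHERE clause value changes on reuse.
SAVED_LTV_QUERY = "SELECT SUM(amount) FROM payments WHERE customer_id = ?"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 300.0)]
)


def lifetime_value(conn, customer_id):
    """Re-run the saved LTV query for a different customer."""
    return conn.execute(SAVED_LTV_QUERY, (customer_id,)).fetchone()[0]


print(lifetime_value(conn, 1))
print(lifetime_value(conn, 2))
```

Reusing a query that is already known to be correct leaves less room for error than generating the whole calculation from scratch.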
An Analysis is a comprehensive package that includes a SQL file, a data table, a Vega-Lite chart, and a descriptive file. Storing these Analyses boosts the system's performance, especially when users ask questions that pertain to already saved Analyses.
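One way to picture the four parts of an Analysis is as a single record. The `Analysis` dataclass below is a hypothetical container, not the platform's storage format, and the rows are made-up sample data. The chart is a minimal Vega-Lite bar chart specification.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Analysis:
    """Hypothetical container for the four parts of a saved Analysis."""

    sql: str          # contents of the SQL file
    rows: list        # the resulting data table, one dict per row
    chart: dict       # a Vega-Lite chart specification
    description: str  # the descriptive text


# Made-up sample rows, used only to populate the example.
rows = [{"month": "2023-01", "mrr": 350.0}, {"month": "2023-02", "mrr": 410.0}]

analysis = Analysis(
    sql="SELECT month, SUM(amount) AS mrr FROM payments WHERE is_recurring = 1 GROUP BY month",
    rows=rows,
    chart={
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "month", "type": "ordinal"},
            "y": {"field": "mrr", "type": "quantitative"},
        },
    },
    description="Monthly recurring revenue by month.",
)

# Serialize the whole package so it can be stored and looked up later.
serialized = json.dumps(asdict(analysis), indent=2)
print(serialized)
```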
The system also benefits from hints on how to approach analytical questions. For instance, if you're looking to identify companies within a specific sector, the hint would suggest using the `ILIKE` SQL operator in conjunction with `category_tags` from the 'companies' table. The use of 'either/or' is recommended to create a union, thereby expanding the scope of matches based on any of the specified