
Data Model

The NAICS MCP Server uses DuckDB with six tables supporting semantic search, hierarchical navigation, and classification workflows.

naics_nodes

The core table. It contains all 2,125 NAICS 2022 codes, along with references to each code's ancestors in the hierarchy and the source text used to generate embeddings.

| Column | Type | Description |
| --- | --- | --- |
| node_code | varchar | Primary key. 2-6 digit NAICS code |
| level | varchar | sector, subsector, industry_group, naics_industry, national_industry |
| title | varchar | Official classification title |
| description | text | Full description (nullable) |
| sector_code | varchar | FK to parent sector (2-digit) |
| subsector_code | varchar | FK to parent subsector (3-digit) |
| industry_group_code | varchar | FK to parent industry group (4-digit) |
| naics_industry_code | varchar | FK to parent NAICS industry (5-digit) |
| raw_embedding_text | text | Concatenated text used for embedding generation |
| change_indicator | varchar | Changes from NAICS 2017 version |
| is_trilateral | boolean | True if code is common across US, Canada, Mexico |
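
The parent columns make it possible to walk the hierarchy without parsing codes. The sketch below is one way this could look. It uses an in-memory SQLite database as a stand-in for DuckDB so that it runs with only the Python standard library, includes just a subset of the columns, and assumes that a row's parent columns are NULL at and above its own level (the source does not say how those are filled). The helper names `ancestors` and `children` are hypothetical.

```python
import sqlite3

# Stand-in for the DuckDB database: in-memory SQLite with a subset of columns.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE naics_nodes (
        node_code           TEXT PRIMARY KEY,
        level               TEXT,
        title               TEXT,
        sector_code         TEXT,
        subsector_code      TEXT,
        industry_group_code TEXT,
        naics_industry_code TEXT
    )
""")
# Small sample of rows; parent columns above each row's level are filled in,
# the rest are NULL (an assumption about how the real table is populated).
conn.executemany(
    "INSERT INTO naics_nodes VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("72", "sector", "Accommodation and Food Services", None, None, None, None),
        ("722", "subsector", "Food Services and Drinking Places", "72", None, None, None),
        ("7225", "industry_group", "Restaurants and Other Eating Places", "72", "722", None, None),
        ("72251", "naics_industry", "Restaurants and Other Eating Places", "72", "722", "7225", None),
        ("722511", "national_industry", "Full-Service Restaurants", "72", "722", "7225", "72251"),
        ("722513", "national_industry", "Limited-Service Restaurants", "72", "722", "7225", "72251"),
    ],
)


def ancestors(code):
    """Return (node_code, level, title) for each ancestor, broadest first."""
    row = conn.execute(
        "SELECT sector_code, subsector_code, industry_group_code, naics_industry_code "
        "FROM naics_nodes WHERE node_code = ?",
        (code,),
    ).fetchone()
    if row is None:
        return []
    found = []
    for parent_code in row:
        if parent_code is None:
            continue
        parent = conn.execute(
            "SELECT node_code, level, title FROM naics_nodes WHERE node_code = ?",
            (parent_code,),
        ).fetchone()
        if parent is not None:
            found.append(parent)
    return found


def children(naics_industry_code):
    """Return the 6-digit codes directly under a 5-digit NAICS industry."""
    rows = conn.execute(
        "SELECT node_code FROM naics_nodes "
        "WHERE naics_industry_code = ? ORDER BY node_code",
        (naics_industry_code,),
    ).fetchall()
    return [r[0] for r in rows]
```

Calling `ancestors("722511")` walks from the sector down to the 5-digit industry, and `children("72251")` lists the national industries beneath it.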

naics_embeddings

384-dimensional vector embeddings for semantic similarity search, generated using sentence-transformers (all-MiniLM-L6-v2).

| Column | Type | Description |
| --- | --- | --- |
| node_code | varchar | Primary key, FK to naics_nodes |
| embedding | float[384] | 384-dimensional vector |
| embedding_text | varchar | Text that was embedded |
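
To illustrate how stored vectors can support similarity search, here is a minimal sketch that ranks codes by cosine similarity. The source does not state which distance measure the server uses, so cosine similarity is an assumption here, as are the function names. The vectors are made-up 3-dimensional toys standing in for the real 384-dimensional embeddings; in practice the query vector would come from the same all-MiniLM-L6-v2 model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for rows of naics_embeddings: node_code -> embedding.
# Real embeddings have 384 dimensions; these values are invented.
embeddings = {
    "722511": [0.9, 0.1, 0.0],
    "722513": [0.8, 0.3, 0.1],
    "111110": [0.0, 0.2, 0.9],
}


def top_matches(query_vec, k=2):
    """Return the k (node_code, score) pairs most similar to query_vec."""
    scored = [
        (code, cosine_similarity(query_vec, vec))
        for code, vec in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With the toy data, a query vector pointing along the first axis ranks the two restaurant codes ahead of the farming code.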

naics_index_terms

20,398 official index terms from the NAICS specification, enabling broad search coverage.

| Column | Type | Description |
| --- | --- | --- |
| term_id | integer | Primary key |
| naics_code | varchar | FK to naics_nodes |
| index_term | varchar | Official index term (e.g., “Pizza delivery services”) |
| term_normalized | varchar | Lowercase version for case-insensitive search |
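
The `term_normalized` column lets a lookup lowercase the query once and compare it against pre-lowercased terms. A minimal sketch of that idea follows, again using in-memory SQLite as a stand-in for DuckDB. The `search_terms` helper is hypothetical, and the term-to-code pairings in the sample rows are illustrative only, not checked against the official index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE naics_index_terms (
        term_id         INTEGER PRIMARY KEY,
        naics_code      TEXT,
        index_term      TEXT,
        term_normalized TEXT
    )
""")

# Illustrative sample rows; the code assignments are NOT verified.
sample_terms = [
    ("722513", "Pizza delivery services"),
    ("722511", "Full-service restaurants"),
]
conn.executemany(
    "INSERT INTO naics_index_terms (naics_code, index_term, term_normalized) "
    "VALUES (?, ?, ?)",
    [(code, term, term.lower()) for code, term in sample_terms],
)


def search_terms(query):
    """Substring match of the lowercased query against term_normalized.

    LIKE wildcards (% and _) inside the query are not escaped in this sketch.
    """
    pattern = "%" + query.strip().lower() + "%"
    return conn.execute(
        "SELECT naics_code, index_term FROM naics_index_terms "
        "WHERE term_normalized LIKE ? ORDER BY term_id",
        (pattern,),
    ).fetchall()
```

Because both sides are lowercased, `search_terms("PIZZA")` and `search_terms("pizza")` return the same rows regardless of how the database engine treats case in `LIKE`.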

naics_cross_references

4,601 cross-references linking related codes through exclusions, see-also links, and inclusions.

| Column | Type | Description |
| --- | --- | --- |
| ref_id | integer | Primary key |
| source_code | varchar | FK to naics_nodes (code containing reference) |
| reference_type | varchar | excludes, see_also, includes |
| reference_text | text | Original reference text |
| target_code | varchar | FK to naics_nodes (nullable) |
| excluded_activity | varchar | Specific activity that is excluded |
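
A typical use of this table is checking what a candidate code explicitly excludes and where the excluded activity belongs instead. The sketch below shows one possible query, with in-memory SQLite standing in for DuckDB. The `exclusions` helper is hypothetical, and the reference text in the sample rows is paraphrased for illustration rather than copied from the NAICS manual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE naics_cross_references (
        ref_id            INTEGER PRIMARY KEY,
        source_code       TEXT,
        reference_type    TEXT,
        reference_text    TEXT,
        target_code       TEXT,
        excluded_activity TEXT
    )
""")
# Illustrative rows; the wording is invented for this example.
conn.executemany(
    "INSERT INTO naics_cross_references VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "722513", "excludes",
         "Table-service establishments are classified in 722511.",
         "722511", "table service with payment after eating"),
        # target_code is nullable, so some references point nowhere specific.
        (2, "722513", "see_also",
         "See also snack and nonalcoholic beverage bars.",
         None, None),
    ],
)


def exclusions(code):
    """Return (target_code, excluded_activity) for each 'excludes' reference."""
    return conn.execute(
        "SELECT target_code, excluded_activity FROM naics_cross_references "
        "WHERE source_code = ? AND reference_type = 'excludes' "
        "ORDER BY ref_id",
        (code,),
    ).fetchall()
```

Filtering on `reference_type` keeps see-also and includes rows out of the result, so the caller only sees activities that rule the candidate code out.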

sic_naics_crosswalk

Mapping between legacy SIC codes and NAICS codes for migration support.

| Column | Type | Description |
| --- | --- | --- |
| sic_code | varchar | Standard Industrial Classification code |
| naics_code | varchar | FK to naics_nodes |
| relationship_type | varchar | direct (1:1), partial, or split |
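
When migrating legacy data, the `relationship_type` column indicates whether a SIC code can be converted automatically or needs review. The following sketch illustrates that check with in-memory SQLite standing in for DuckDB. The helper names are hypothetical and the sample mappings, including their relationship types, are illustrative rather than taken from the official concordance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sic_naics_crosswalk (
        sic_code          TEXT,
        naics_code        TEXT,
        relationship_type TEXT
    )
""")
# Codes are stored as text so that leading zeros (e.g. "0116") survive.
# Sample mappings are illustrative only.
conn.executemany(
    "INSERT INTO sic_naics_crosswalk VALUES (?, ?, ?)",
    [
        ("5812", "722511", "split"),
        ("5812", "722513", "split"),
        ("0116", "111110", "direct"),
    ],
)


def naics_for_sic(sic_code):
    """Return (naics_code, relationship_type) pairs for a SIC code."""
    return conn.execute(
        "SELECT naics_code, relationship_type FROM sic_naics_crosswalk "
        "WHERE sic_code = ? ORDER BY naics_code",
        (sic_code,),
    ).fetchall()


def is_direct(sic_code):
    """True when the SIC code maps to exactly one NAICS code, marked direct."""
    matches = naics_for_sic(sic_code)
    return len(matches) == 1 and matches[0][1] == "direct"
```

A record with a SIC code where `is_direct` is true could be recoded mechanically, while a split mapping returns several candidates that still need a classification decision.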

classification_workbook

Session-based storage for classification decisions, supporting audit trails and iterative refinement.

| Column | Type | Description |
| --- | --- | --- |
| entry_id | varchar | Primary key |
| form_type | varchar | Type of classification form |
| label | varchar | Human-readable label |
| content | json | Structured form content |
| metadata | json | Additional metadata |
| session_id | varchar | Session identifier |
| parent_entry_id | varchar | FK to parent entry for threading |
| tags | json | Categorization tags |
| confidence_score | float | Classification confidence (0-1) |
| created_at | timestamp | Entry creation time |
| search_text | varchar | Full-text search field |
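
Because each entry can point to a parent through `parent_entry_id`, a chain of refinements can be reconstructed with a recursive query. The sketch below shows one way to do that, using in-memory SQLite as a stand-in for DuckDB and only a subset of the columns. The `thread` helper, the `form_type` values, and the shape of the JSON in `content` are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE classification_workbook (
        entry_id         TEXT PRIMARY KEY,
        form_type        TEXT,
        label            TEXT,
        content          TEXT,   -- JSON stored as text in this stand-in
        session_id       TEXT,
        parent_entry_id  TEXT,
        confidence_score REAL
    )
""")
# Hypothetical entries: an initial pass, a refinement, and a final decision.
entries = [
    ("e1", "analysis", "Initial pass", {"candidate": "722513"}, "s1", None, 0.60),
    ("e2", "analysis", "After exclusions check", {"candidate": "722511"}, "s1", "e1", 0.85),
    ("e3", "decision", "Final decision", {"selected": "722511"}, "s1", "e2", 0.90),
]
conn.executemany(
    "INSERT INTO classification_workbook VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(eid, ft, lbl, json.dumps(c), sid, pid, score)
     for eid, ft, lbl, c, sid, pid, score in entries],
)


def thread(root_entry_id):
    """Return the root entry and all its descendants, ordered by depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE thread(entry_id, label, content, confidence_score, depth) AS (
            SELECT entry_id, label, content, confidence_score, 0
            FROM classification_workbook
            WHERE entry_id = ?
            UNION ALL
            SELECT w.entry_id, w.label, w.content, w.confidence_score, t.depth + 1
            FROM classification_workbook AS w
            JOIN thread AS t ON w.parent_entry_id = t.entry_id
        )
        SELECT entry_id, label, content, confidence_score, depth
        FROM thread
        ORDER BY depth, entry_id
        """,
        (root_entry_id,),
    ).fetchall()
    # Decode the JSON content so callers get dictionaries back.
    return [(eid, lbl, json.loads(c), score, depth)
            for eid, lbl, c, score, depth in rows]
```

Reading a thread from its root gives the audit trail in order: each step's candidate code and confidence score, ending with the final decision.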