Stand up a RAG pipeline from a SharePoint folder to Pinecone
When to use: You have a corporate document dump and want an index that Claude can query.
Prerequisites
- Server/skill installed and authenticated (see the repo README)
Flow
- Define the pipeline
  Prompt: Create an Unstructured pipeline: source SharePoint folder X, chunk by_title with 1024 token max, embed with text-embedding-3-small, target Pinecone index 'corp-docs'.
  → Pipeline id
- Run and monitor
  Prompt: Run it and tell me when it's done. Report any failed documents.
  → Status updates + final summary
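The "run and monitor" step amounts to a polling loop with a final summary of failed documents. Below is a minimal sketch of that loop, assuming a hypothetical `JobStatus` shape and a caller-supplied `get_status` function; neither comes from the Unstructured API, and a real server's status payload will look different.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class JobStatus:
    # Hypothetical status shape for illustration only.
    state: str  # "running", "succeeded", or "failed"
    processed: int = 0
    failed_docs: List[str] = field(default_factory=list)


def wait_for_job(
    get_status: Callable[[], JobStatus],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> JobStatus:
    """Poll until the job leaves the 'running' state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status.state != "running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job still running after timeout")
        sleep(poll_seconds)


def summarize(status: JobStatus) -> str:
    """Build the final summary, listing each failed document by name."""
    lines = [
        f"state={status.state} processed={status.processed} "
        f"failed={len(status.failed_docs)}"
    ]
    lines += [f"  failed: {name}" for name in status.failed_docs]
    return "\n".join(lines)
```

Passing `sleep` as a parameter keeps the loop easy to exercise with canned statuses; in real use, `get_status` would wrap whatever call reports the job state for the pipeline id returned in the first step.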
Outcome: Production-grade ingest with proper chunking, not naive PDF text dumps.
Pitfalls
- Default chunkers can split tables across chunks. For dense tabular docs, use 'by_title' with combine_text_under_n_chars.
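To see why combining small sections helps, here is a toy illustration in plain Python. It is not Unstructured's implementation: the `(kind, text)` element shape and both helper functions are hypothetical. It only shows the general idea of title-based sectioning followed by merging sections that fall under a size threshold.

```python
from typing import List, Tuple

# Hypothetical element shape: (kind, text), where kind is "Title" or "Text".
Element = Tuple[str, str]


def split_sections(elements: List[Element]) -> List[str]:
    """Start a new section at every Title element."""
    sections: List[List[str]] = []
    for kind, text in elements:
        if kind == "Title" or not sections:
            sections.append([])
        sections[-1].append(text)
    return ["\n".join(parts) for parts in sections]


def combine_small_sections(
    sections: List[str], combine_under: int, max_chars: int
) -> List[str]:
    """Merge a section into the previous chunk while that chunk is still
    shorter than combine_under and the merged text fits in max_chars."""
    chunks: List[str] = []
    for section in sections:
        fits = bool(chunks) and len(chunks[-1]) + 1 + len(section) <= max_chars
        if fits and len(chunks[-1]) < combine_under:
            chunks[-1] = chunks[-1] + "\n" + section
        else:
            chunks.append(section)
    return chunks


elements = [
    ("Title", "Q3 revenue"),
    ("Text", "Region | Revenue"),
    ("Title", "EMEA"),
    ("Text", "EMEA | 1.2M"),
    ("Title", "APAC"),
    ("Text", "APAC | 0.9M"),
]
sections = split_sections(elements)
separate = combine_small_sections(sections, combine_under=0, max_chars=200)
merged = combine_small_sections(sections, combine_under=100, max_chars=200)
```

With `combine_under=0` every heading opens its own chunk, so the rows of one logical table end up scattered across chunks; with a threshold of 100 the short sections are merged and the rows stay together. In Unstructured itself, the corresponding knob is the `combine_text_under_n_chars` option of the by_title chunking strategy.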